Question Answering on MedQA-USMLE
Evaluation Metric
Accuracy
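Accuracy here is simply the fraction of benchmark questions answered correctly. A minimal sketch of how the metric is computed for multiple-choice QA (the `accuracy` helper and the sample option letters are illustrative, not drawn from the MedQA-USMLE data itself):

```python
def accuracy(predictions, gold_answers):
    """Fraction of questions where the predicted option matches the gold option."""
    if len(predictions) != len(gold_answers):
        raise ValueError("prediction/answer lists must be the same length")
    correct = sum(p == g for p, g in zip(predictions, gold_answers))
    return correct / len(gold_answers)

# Hypothetical 4-option USMLE-style questions answered with option letters.
preds = ["A", "C", "B", "D", "C"]
golds = ["A", "C", "D", "D", "B"]
print(f"Accuracy: {accuracy(preds, golds) * 100:.1f}")  # → Accuracy: 60.0
```

Scores in the table below are this value expressed as a percentage.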
Evaluation Results
Performance of each model on this benchmark
Comparison table
Model | Accuracy (%) |
---|---|
large-language-models-encode-clinical | 67.6 |
meditron-70b-scaling-medical-pretraining-for | 70.2 |
galactica-a-large-language-model-for-science-1 | 44.4 |
capabilities-of-gemini-models-in-medicine | 91.1 |
variational-open-domain-question-answering | 55.0 |
shakti-a-2-5-billion-parameter-small-language | 60.3 |
towards-expert-level-medical-question | 79.7 |
grapeqa-graph-augmentation-and-pruning-to | 39.51 |
linkbert-pretraining-language-models-with | 40.0 |
deep-bidirectional-language-knowledge-graph | 47.5 |
meditron-70b-scaling-medical-pretraining-for | 59.2 |
towards-expert-level-medical-question | 83.7 |
can-generalist-foundation-models-outcompete | 90.2 |
medmobile-a-mobile-sized-language-model-with | 75.7 |
towards-expert-level-medical-question | 85.4 |
large-language-models-encode-clinical | 45.1 |
galactica-a-large-language-model-for-science-1 | 22.8 |
small-language-models-learn-enhanced | 70.6 |
large-language-models-encode-clinical | 50.3 |
biobert-a-pre-trained-biomedical-language | 36.7 |
biomedgpt-open-multimodal-generative-pre | 50.4 |
galactica-a-large-language-model-for-science-1 | 23.3 |
can-large-language-models-reason-about | 60.2 |
small-language-models-learn-enhanced | 74.3 |
meditron-70b-scaling-medical-pretraining-for | 61.5 |
large-language-models-encode-clinical | 33.3 |
biobert-a-pre-trained-biomedical-language | 34.1 |