Question Answering on MultiRC
Evaluation Metric
EM (exact match)
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | EM |
---|---|
palm-scaling-language-modeling-with-pathways-1 | 69.2 |
deberta-decoding-enhanced-bert-with | 63.7 |
exploring-the-limits-of-transfer-learning | 63.3 |
toward-efficient-language-model-pretraining | 63 |
toward-efficient-language-model-pretraining | 62.4 |
hungry-hungry-hippos-towards-language | 59.7 |
hungry-hungry-hippos-towards-language | 59.5 |
hungry-hungry-hippos-towards-language | 51.4 |
hungry-hungry-hippos-towards-language | 48.9 |
kelm-knowledge-enhanced-pre-trained-language | 27.2 |
bert-pre-training-of-deep-bidirectional | 24.1 |
n-grammer-augmenting-transformers-with-latent-1 | 11.3 |
alexatm-20b-few-shot-learning-using-a-large | - |
ask-me-anything-a-simple-strategy-for | - |
bloomberggpt-a-large-language-model-for | - |
designing-effective-sparse-expert-models | - |
finetuned-language-models-are-zero-shot | - |
language-models-are-few-shot-learners | - |
palm-2-technical-report-1 | - |
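On MultiRC each question comes with several candidate answers, each labeled correct or incorrect, and EM is conventionally reported as the percentage of questions for which a model gets every candidate answer's label right. A minimal sketch of that computation, assuming a simple `(predicted_labels, gold_labels)` pairing per question (the function names and input layout are illustrative, not the benchmark's official scorer):

```python
def question_exact_match(pred_labels, gold_labels):
    # A question counts as an exact match only if the model's
    # correct/incorrect label agrees with the gold label for
    # every candidate answer of that question.
    return int(list(pred_labels) == list(gold_labels))

def em_score(questions):
    # questions: list of (pred_labels, gold_labels) pairs, one per question.
    # Returns the percentage of questions answered perfectly.
    matches = [question_exact_match(p, g) for p, g in questions]
    return 100.0 * sum(matches) / len(matches)
```

For example, a model that labels one question's three answer options perfectly but misses one option on a second question would score `em_score([([1, 0, 1], [1, 0, 1]), ([1, 1], [1, 0])])`, i.e. 50.0. This per-question all-or-nothing definition is why EM on MultiRC runs much lower than per-answer accuracy would.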