Language Modelling On Lambada
Evaluation Metric
Accuracy
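As a rough illustration of the metric, the sketch below scores last-word prediction on LAMBADA with a Hugging Face causal language model. This is a minimal, assumed setup (greedy decoding, the `gpt2` checkpoint, and the `lambada` dataset on the Hub), not the evaluation harness used by the papers listed below.

```python
# Minimal sketch of LAMBADA accuracy (last-word prediction).
# Assumptions: any Hugging Face causal LM ("gpt2" here), the "lambada"
# dataset on the Hub, and greedy decoding of the final word. This is NOT
# the official protocol of the papers in the table below.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
data = load_dataset("lambada", split="test")

correct = 0
for example in data:
    # The target is the final word of the passage; everything before it is context.
    context, target = example["text"].rsplit(" ", 1)
    inputs = tokenizer(context, return_tensors="pt")
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids[0]
    with torch.no_grad():
        generated = model.generate(
            **inputs,
            max_new_tokens=len(target_ids),
            do_sample=False,  # greedy decoding
        )
    # Keep only the newly generated tokens and compare to the reference word.
    predicted = tokenizer.decode(generated[0, inputs.input_ids.shape[1]:]).strip()
    correct += int(predicted == target)

print(f"LAMBADA accuracy: {correct / len(data):.4f}")
```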
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Accuracy |
---|---|
massive-language-models-can-be-accurately | 0.02 |
all-nlp-tasks-are-generation-tasks-a-general | 72.35 |
language-models-are-few-shot-learners | 86.4 |
test-time-training-for-out-of-distribution-1 | 0.01 |
using-deepspeed-and-megatron-to-train (Megatron-Turing NLG 530B, Few-Shot) | - |
palm-2-technical-report-1 | 83.7 |
language-models-are-few-shot-learners | 72.5 |
broad-context-language-modeling-as-reading | 49.0 |
language-models-are-unsupervised-multitask | 63.24 |
language-models-are-few-shot-learners | 67.1 |
pythia-a-suite-for-analyzing-large-language | - |
palm-2-technical-report-1 | 86.9 |
stay-on-topic-with-classifier-free-guidance | 83.9 |
universal-transformers | 56.25 |
massive-language-models-can-be-accurately | 79.47 |
massive-language-models-can-be-accurately | 76.51 |
palm-scaling-language-modeling-with-pathways-1 | 77.9 |
pythia-a-suite-for-analyzing-large-language | 67.28 |
glam-efficient-scaling-of-language-models | 80.9 |
residual-shuffle-exchange-networks-for-fast | 54.34 |
glm-130b-an-open-bilingual-pre-trained-model | 80.2 |
palm-scaling-language-modeling-with-pathways-1 | 89.7 |
palm-2-technical-report-1 | 80.7 |
Model 24 | 82.33 |
language-models-are-few-shot-learners | 70.3 |
pythia-a-suite-for-analyzing-large-language | - |
stay-on-topic-with-classifier-free-guidance | 82.2 |
stay-on-topic-with-classifier-free-guidance | 84.0 |
language-models-are-few-shot-learners | 76.2 |
Model 30 | 69.7 |
mamba-linear-time-sequence-modeling-with | 69.2 |
massive-language-models-can-be-accurately | 75.59 |
pythia-a-suite-for-analyzing-large-language | 70.46 |
palm-scaling-language-modeling-with-pathways-1 | 81.8 |
all-nlp-tasks-are-generation-tasks-a-general | 67.18 |
training-compute-optimal-large-language | 77.7 |
massive-language-models-can-be-accurately | 78.77 |