Multi-Task Language Understanding on MGSM

Evaluation Metric

Average (%)

Evaluation Results

Results of each model on this benchmark

Model | Average (%) | Paper Title | Repository
U-PaLM 540B (CoT) | 49.9 | Transcending Scaling Laws with 0.1% Extra Compute | -
PaLM 540B | 55.0 | PaLM: Scaling Language Modeling with Pathways | -
PaLM 2 (few-shot, k=8, SC) | 87.0 | PaLM 2 Technical Report | -
Flan-U-PaLM 540B (CoT) | 60.4 | Scaling Instruction-Finetuned Language Models | -
Flan-PaLM 540B (8-shot, fine-tuned, CoT + SC) | 72.0 | Scaling Instruction-Finetuned Language Models | -
code-davinci-002 | 35 | Scaling Instruction-Finetuned Language Models | -
Flan-PaLM 540B (8-shot, fine-tuned, CoT) | 57.0 | Scaling Instruction-Finetuned Language Models | -
GPT-3 Davinci 175B | 5.7 | Scaling Instruction-Finetuned Language Models | -
text-davinci-003 | 36 | Scaling Instruction-Finetuned Language Models | -
Flan-PaLM 540B (8-shot, fine-tuned) | 21.2 | Scaling Instruction-Finetuned Language Models | -
text-davinci-002 | 23.7 | Scaling Instruction-Finetuned Language Models | -
PaLM 2 (8-shot, CoT) | 72.2 | PaLM 2 Technical Report | -