Math Word Problem Solving on ASDiv-A
Metrics
Execution Accuracy
Results
Performance of different models on this benchmark
| Model | Execution Accuracy | Paper Title |
|---|---|---|
| ATHENA (roberta-large) | 91 | ATHENA: Mathematical Reasoning with Thought Expansion |
| MMOS-DeepSeekMath-7B(0-shot) | 87.6 | An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning |
| ATHENA (roberta-base) | 86.4 | ATHENA: Mathematical Reasoning with Thought Expansion |
| MMOS-CODE-34B(0-shot) | 85.1 | An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning |
| OpenMath-CodeLlama-70B (w/ code) | 84.7 | OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset |
| Graph2Tree with RoBERTa | 82.2 | Are NLP Models really able to Solve Simple Math Word Problems? |
| GTS with RoBERTa | 81.2 | Are NLP Models really able to Solve Simple Math Word Problems? |
| MMOS-CODE-7B(0-shot) | 78.6 | An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning |
| LSTM Seq2Seq with RoBERTa | 76.9 | Are NLP Models really able to Solve Simple Math Word Problems? |