HyperAI

Math Word Problem Solving on MATH

Metrics

Accuracy
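
As a rough illustration of how an accuracy score of this kind is typically computed, here is a minimal sketch. It assumes the reported values are percentages of problems answered correctly and that correctness is decided by exact string match after trimming whitespace. Both are simplifying assumptions, and the function name `accuracy_percent` is hypothetical. The evaluation used by each listed paper may normalize or compare answers differently.

```python
def accuracy_percent(predictions, references):
    """Return the percentage of predictions that exactly match their reference.

    Assumption: answers are compared as whitespace-trimmed strings. Real
    evaluators for math benchmarks usually apply further normalization.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    # Count the problems whose predicted answer equals the reference answer.
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Two of three answers match exactly ("x+1" differs from "x + 1" as a string).
score = accuracy_percent(["4", "x+1", "7"], ["4", "x + 1", "7"])
print(round(score, 1))  # prints 66.7
```

The second answer in the usage example is mathematically equivalent to its reference but is counted as wrong, which shows why exact matching is only a lower-bound approximation of a careful grader.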

Results

Performance results of different models on this benchmark

Comparison table
| Model name | Accuracy |
| --- | --- |
| mixtral-of-experts | 28.4 |
| palm-2-technical-report-1 | 34.3 |
| qwen2-5-math-technical-report-toward | 83.6 |
| qwen2-5-math-technical-report-toward | 85.2 |
| tora-a-tool-integrated-reasoning-agent-for | 56.9 |
| solving-quantitative-reasoning-problems-with | 43.4 |
| Model 7 | 64.5 |
| solving-challenging-math-word-problems-using | 84.3 |
| Model 9 | 89.7 |
| an-empirical-study-of-data-ability-boundary | 55.0 |
| cumulative-reasoning-with-large-language | 72.2 |
| query-and-response-augmentation-cannot-help | 25.8 |
| openchat-advancing-open-source-language | 28.6 |
| llama-open-and-efficient-foundation-language-1 | 3.9 |
| progressive-hint-prompting-improves-reasoning | 53.9 |
| measuring-mathematical-problem-solving-with | 3.0 |
| wizardmath-empowering-mathematical-reasoning | 33.0 |
| math-shepherd-a-label-free-step-by-step | 43.5 |
| query-and-response-augmentation-cannot-help | 35.6 |
| dart-math-difficulty-aware-rejection-tuning-1 | 45.5 |
| tora-a-tool-integrated-reasoning-agent-for | 60.0 |
| tora-a-tool-integrated-reasoning-agent-for | 50.8 |
| key-point-driven-data-synthesis-with-its | 48.8 |
| solving-quantitative-reasoning-problems-with | 5.6 |
| dart-math-difficulty-aware-rejection-tuning-1 | 45.3 |
| key-point-driven-data-synthesis-with-its | 41 |
| gemini-a-family-of-highly-capable-multimodal-1 | 32.6 |
| openmathinstruct-2-accelerating-ai-for-math | 67.8 |
| tora-a-tool-integrated-reasoning-agent-for | 49.7 |
| dart-math-difficulty-aware-rejection-tuning-1 | 43.5 |
| llama-open-and-efficient-foundation-language-1 | 8.8 |
| solving-quantitative-reasoning-problems-with | 14.1 |
| an-empirical-study-of-data-ability-boundary | 44.3 |
| qwen2-5-math-technical-report-toward | 88.1 |
| metamath-bootstrap-your-own-mathematical | 19.4 |
| llama-open-and-efficient-foundation-language-1 | 2.9 |
| solving-quantitative-reasoning-problems-with | 4.4 |
| solving-quantitative-reasoning-problems-with | 50.3 |
| augmenting-math-word-problems-via-iterative | 45.0 |
| galactica-a-large-language-model-for-science-1 | 16.6 |
| solving-quantitative-reasoning-problems-with | 47.6 |
| qwen2-technical-report | 84.0 |
| wizardmath-empowering-mathematical-reasoning | 14.0 |
| palm-2-technical-report-1 | 48.8 |
| openmathinstruct-2-accelerating-ai-for-math | 76.1 |
| openmathinstruct-1-a-1-8-million-math | 43.6 |
| measuring-mathematical-problem-solving-with | 6.9 |
| gemini-a-family-of-highly-capable-multimodal-1 | 53.2 |
| solving-challenging-math-word-problems-using | 71.2 |
| measuring-mathematical-problem-solving-with | 5.4 |
| deepseekmath-pushing-the-limits-of | 51.7 |
| openmathinstruct-1-a-1-8-million-math | 48.3 |
| measuring-mathematical-problem-solving-with | 6.4 |
| dart-math-difficulty-aware-rejection-tuning-1 | 54.9 |
| dart-math-difficulty-aware-rejection-tuning-1 | 56.1 |
| solving-challenging-math-word-problems-using | 60.8 |
| llama-open-and-efficient-foundation-language-1 | 6.9 |
| an-empirical-study-of-data-ability-boundary | 49.5 |
| wizardmath-empowering-mathematical-reasoning | 10.7 |
| dart-math-difficulty-aware-rejection-tuning-1 | 53.6 |
| galactica-a-large-language-model-for-science-1 | 20.4 |
| solving-quantitative-reasoning-problems-with | 25.4 |
| metamath-bootstrap-your-own-mathematical | 22.5 |
| qwen2-5-math-technical-report-toward | 75.8 |
| parameter-efficient-sparsity-crafting-from | 29.9 |
| qwen2-5-math-technical-report-toward | 85.9 |
| parameter-efficient-sparsity-crafting-from | 22.6 |
| llama-open-and-efficient-foundation-language-1 | 10.6 |
| metamath-bootstrap-your-own-mathematical | 26.0 |
| Model 70 | 41.8 |
| openmathinstruct-1-a-1-8-million-math | 60.4 |
| mistral-7b | 13.1 |
| galactica-a-large-language-model-for-science-1 | 33.6 |
| cumulative-reasoning-with-large-language | 58.0 |
| galactica-a-large-language-model-for-science-1 | 11.4 |
| solving-quantitative-reasoning-problems-with | 64.9 |
| openchat-advancing-open-source-language | 28.9 |
| measuring-mathematical-problem-solving-with | 6.2 |
| openmathinstruct-1-a-1-8-million-math | 57.6 |
| branch-train-mix-mixing-expert-llms-into-a | 17.8 |
| query-and-response-augmentation-cannot-help | 30.7 |
| measuring-mathematical-problem-solving-with | 5.2 |
| mathcoder-seamless-code-integration-in-llms | 30.2 |
| dart-math-difficulty-aware-rejection-tuning-1 | 46.6 |
| mathcoder-seamless-code-integration-in-llms | 45.2 |
| openmathinstruct-1-a-1-8-million-math | 58.3 |
| measuring-mathematical-problem-solving-with | 5.6 |
| galactica-a-large-language-model-for-science-1 | 8.8 |
| alphamath-almost-zero-process-supervision | 66.3 |
| macm-utilizing-a-multi-agent-system-for | 87.920 |
| llama-open-and-efficient-foundation-language-1 | 7.1 |
| tora-a-tool-integrated-reasoning-agent-for | 44.6 |
| solving-quantitative-reasoning-problems-with | 27.6 |
| math-shepherd-a-label-free-step-by-step | 48.1 |
| tora-a-tool-integrated-reasoning-agent-for | 40.1 |
| wizardmath-empowering-mathematical-reasoning | 22.7 |
| deepseekmath-pushing-the-limits-of | 58.8 |
| mixtral-of-experts | 12.7 |
| openmathinstruct-1-a-1-8-million-math | 45.5 |
| openmathinstruct-1-a-1-8-million-math | 57.2 |
| tora-a-tool-integrated-reasoning-agent-for | 43.0 |
| tora-a-tool-integrated-reasoning-agent-for | 48.1 |
| openmathinstruct-1-a-1-8-million-math | 44.5 |
| solving-quantitative-reasoning-problems-with | 33.6 |
| mathcoder-seamless-code-integration-in-llms | 35.9 |
| solving-challenging-math-word-problems-using | 69.7 |
| pal-program-aided-language-models | 51.8 |
| measuring-mathematical-problem-solving-with | 2.9 |
| key-point-driven-data-synthesis-with-its | 48.6 |
| math-shepherd-a-label-free-step-by-step | 33.0 |
| mathcoder-seamless-code-integration-in-llms | 45.1 |
| step-dpo-step-wise-preference-optimization | 70.8 |
| mathcoder-seamless-code-integration-in-llms | 23.3 |
| openmathinstruct-1-a-1-8-million-math | 55.6 |
| solving-challenging-math-word-problems-using | 73.5 |
| llama-open-and-efficient-foundation-language-1 | 20.5 |
| skills-in-context-prompting-unlocking | 56.4 |
| llama-open-and-efficient-foundation-language-1 | 15.2 |
| openmathinstruct-1-a-1-8-million-math | 50.7 |
| an-empirical-study-of-data-ability-boundary | 63.7 |
| solving-quantitative-reasoning-problems-with | 8.8 |
| galactica-a-large-language-model-for-science-1 | 5.2 |
| openmathinstruct-1-a-1-8-million-math | 60.2 |
| key-point-driven-data-synthesis-with-its | 46.8 |
| sparks-of-artificial-general-intelligence | 42.5 |
| mathcoder-seamless-code-integration-in-llms | 29.9 |
| openmathinstruct-2-accelerating-ai-for-math | 79.6 |
| toward-self-improvement-of-llms-via | 51 |
| solving-quantitative-reasoning-problems-with | 19.1 |
| qwen2-5-math-technical-report-toward | 79.9 |
| solving-quantitative-reasoning-problems-with | 1.5 |
| openmathinstruct-1-a-1-8-million-math | 46.3 |
| openmathinstruct-2-accelerating-ai-for-math | 71.9 |
| galactica-a-large-language-model-for-science-1 | 12.7 |
| dart-math-difficulty-aware-rejection-tuning-1 | 52.9 |