HyperAI
Math Word Problem Solving on MATH
Metrics
Accuracy
Results
Performance results of various models on this benchmark
| Model Name | Accuracy (%) | Paper Title |
| --- | --- | --- |
| Gemini 2.0 Flash Experimental | 89.7 | - |
| Qwen2.5-Math-72B-Instruct(TIR,Greedy) | 88.1 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
| GPT-4 Turbo (MACM, w/code, voting) | 87.920 | MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems |
| Qwen2.5-Math-72B-Instruct(COT,Greedy) | 85.9 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
| Qwen2.5-Math-7B-Instruct(TIR,Greedy) | 85.2 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
| GPT-4-code model (CSV, w/ code, SC, k=16) | 84.3 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification |
| Qwen2-Math-72B-Instruct(greedy) | 84.0 | Qwen2 Technical Report |
| Qwen2.5-Math-7B-Instruct(COT,Greedy) | 83.6 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
| Qwen2.5-Math-1.5B-Instruct(TIR,Greedy) | 79.9 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
| OpenMath2-Llama3.1-70B (majority@256) | 79.6 | OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data |
| OpenMath2-Llama3.1-8B (majority@256) | 76.1 | OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data |
| Qwen2.5-Math-1.5B-Instruct(COT,Greedy) | 75.8 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement |
| GPT-4-code model (CSV, w/ code) | 73.5 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification |
| CR (GPT-4-turbo model, w/ code) | 72.2 | Cumulative Reasoning with Large Language Models |
| OpenMath2-Llama3.1-70B | 71.9 | OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data |
| LogicNet (with code interpreter) | 71.2 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification |
| Qwen2-72B-Instruct-Step-DPO (0-shot CoT, w/o code) | 70.8 | Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs |
| GPT-4-code model (w/ code) | 69.7 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification |
| OpenMath2-Llama3.1-8B | 67.8 | OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data |
| AlphaMath-7B-SBS@3 | 66.3 | AlphaMath Almost Zero: Process Supervision without Process |
Showing the first 20 of 135 entries.