Math Word Problem Solving on MAWPS

Evaluation Metrics

Accuracy (%)
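
As a rough illustration of the metric, accuracy can be read as the percentage of test problems whose predicted final answer matches the reference answer. The sketch below assumes numeric answers compared with a small floating-point tolerance. The function name `answer_accuracy`, the tolerance values, and the sample data are hypothetical and not taken from any of the listed papers, which may apply their own matching rules.

```python
import math


def answer_accuracy(predictions, references, rel_tol=1e-4):
    """Return the percentage of predictions that match the reference answers.

    Assumption: answers are numeric and a match means "equal within a small
    tolerance". Individual papers may define a match differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count the problems whose predicted answer is close to the reference.
    correct = sum(
        math.isclose(pred, ref, rel_tol=rel_tol, abs_tol=1e-9)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Hypothetical sample: three of four predicted answers are correct.
preds = [8.0, 12.5, 3.0, 40.0]
golds = [8.0, 12.5, 4.0, 40.0]
print(answer_accuracy(preds, golds))  # 75.0
```

A score of 95.7 in the results would then mean that roughly 957 out of every 1,000 test problems were answered correctly under whatever matching rule the reporting paper used.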

Evaluation Results

Performance of each model on this benchmark

Model	Accuracy (%)	Paper Title
OpenMath-CodeLlama-70B (w/ code)	95.7	OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
MsAT-DeductReasoner	94.3	Learning Multi-Step Reasoning by Solving Arithmetic Tasks
ATHENA (roberta-large)	93	ATHENA: Mathematical Reasoning with Thought Expansion
Exp-Tree	92.3	An Expression Tree Decoding Strategy for Mathematical Equation Generation
Multi-view	92.3	Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem
ATHENA (roberta-base)	92.2	ATHENA: Mathematical Reasoning with Thought Expansion
Roberta-DeductReasoner	92	Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction
DeBERTa (PM + VM)	91.0	Math Word Problem Solving by Generating Linguistic Variants of Problem Statements
Graph2Tree with RoBERTa	88.7	Are NLP Models really able to Solve Simple Math Word Problems?
EPT	88.7	EPT-X: An Expression-Pointer Transformer model that generates eXplanations for numbers
GTS with RoBERTa	88.5	Are NLP Models really able to Solve Simple Math Word Problems?
GEO	85.1	Generating Equation by Utilizing Operators : GEO model
EPT-X	84.57	EPT-X: An Expression-Pointer Transformer model that generates eXplanations for numbers
EPT	84.51	Point to the Expression: Solving Algebraic Word Problems using the Expression-Pointer Transformer Model
Graph2Tree	83.7	Graph-to-Tree Learning for Solving Math Word Problems
LLaMA 2-Chat	82.4	Llama 2: Open Foundation and Fine-Tuned Chat Models
GPT-3.5 turbo (175B)	80.3	Math Word Problem Solving by Generating Linguistic Variants of Problem Statements
Toolformer	44.0	-
GPT-3 (175B)	19.8	-
Toolformer (disabled)	15.0	-
