HyperAI초신경

Mathematical Reasoning on FrontierMath

Evaluation Metrics

Accuracy
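
Accuracy here is presumably the fraction of benchmark problems answered correctly, reported on a 0 to 1 scale (so a score of 0.252 would correspond to 25.2% of problems solved). A minimal sketch under that assumption follows; the function name `accuracy` and the sample grading data are hypothetical and are not taken from the benchmark's own evaluation code.

```python
def accuracy(correct_flags):
    """Return the fraction of problems answered correctly (0.0 to 1.0)."""
    if not correct_flags:
        raise ValueError("need at least one graded problem")
    # Count the problems marked correct and divide by the total number graded.
    return sum(1 for ok in correct_flags if ok) / len(correct_flags)


# Hypothetical grading outcomes for four problems (illustrative only).
graded = [True, False, False, True]
print(accuracy(graded))  # 0.5
```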

Evaluation Results

Performance of each model on this benchmark

Model	Accuracy	Paper Title
o3	0.252	-
Gemini 1.5 Pro (002)	0.02	FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
o1-mini	0.01	-
o1-preview	0.01	-
Claude 3.5 Sonnet	0.01	-
GPT-4o	0.01	-
