Mathematical Reasoning on FrontierMath

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Model                  Accuracy  Paper Title
o3                     0.252     -
Gemini 1.5 Pro (002)   0.02      FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
o1-mini                0.01      -
o1-preview             0.01      -
Claude 3.5 Sonnet      0.01      -
GPT-4o                 0.01      -