HyperAI

Mathematical Reasoning on AIME24

Metrics

Accuracy (Acc)
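
The page does not state how the accuracy score is computed. As a minimal sketch, assuming exact-match scoring on integer answers and a single attempt per problem, accuracy can be expressed as the percentage of problems answered correctly. The function name `accuracy` and its parameters are hypothetical and not taken from the leaderboard.

```python
def accuracy(predictions, references):
    """Return the percentage of predictions that exactly match the references.

    Assumption: each answer is an integer and a prediction counts as correct
    only if it equals the reference answer for the same problem.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    # Count exact matches problem by problem.
    correct = sum(int(p) == int(r) for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Illustrative data only: 17 correct answers out of 30 problems.
preds = [1] * 17 + [0] * 13
refs = [1] * 30
print(round(accuracy(preds, refs), 1))
```

Scores that do not correspond to a whole number of solved problems may come from averaging over several sampled attempts, which this sketch does not model.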

Results

Performance results of various models on this benchmark

Model                      Acc    Paper Title
DeepSeek-r1                79.8   DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Openai-o1                  74.4   -
Openai-o1-mini             70.0   -
Search-o1                  56.7   Search-o1: Agentic Search-Enhanced Large Reasoning Models
s1-32B                     56.7   s1: Simple test-time scaling
Openai-o1-preview          44.6   -
Qwen2.5-72B-Instruct       23.3   Qwen2.5 Technical Report
Claude3.5-Sonnet           16     -