Mathematical Reasoning on AIME24

Evaluation Metrics

Acc

Evaluation Results

Performance results of each model on this benchmark

Model                  Acc    Paper Title
DeepSeek-r1            79.8   DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Openai-o1              74.4   -
Openai-o1-mini         70.0   -
Search-o1              56.7   Search-o1: Agentic Search-Enhanced Large Reasoning Models
s1-32B                 56.7   s1: Simple test-time scaling
Openai-o1-preview      44.6   -
Qwen2.5-72B-Instruct   23.3   Qwen2.5 Technical Report
Claude3.5-Sonnet       16     -