APEX AI Productivity Benchmark Dataset
APEX (the AI Productivity Index) is a benchmark dataset first released in 2025 by the Mercor research team in collaboration with Harvard Law School and the Scripps Research Institute. It is used to evaluate frontier AI models on knowledge work of high economic value. The related research paper is titled "The AI Productivity Index (APEX)". Its goal is to measure how frontier AI models perform on real-world economic tasks rather than on abstract reasoning alone.
The current version of the dataset, APEX-v1.0, contains 200 high-economic-value professional task cases across four knowledge-intensive fields: investment banking, management consulting, law, and primary care. Each task reflects analysis, judgment, and documentation work that would take a professional 1–8 hours to complete, and comes with citable evidence and an interpretable, fine-grained scoring rubric so that the quality of a model's output can be measured objectively.
