Date

8 months ago

Organization

Publish URL

www.mercor.com

Paper URL

2509.25721

Tags

AI for Science

Artificial Intelligence

APEX is a benchmark dataset first released in 2025 by the Mercor research team in collaboration with Harvard Law School and the Scripps Research Institute. It evaluates how well frontier AI models perform high-economic-value knowledge work. The related research paper is titled "The AI Productivity Index (APEX)". The goal is to measure model performance on real-world economic tasks rather than on abstract reasoning alone. The current version, APEX-v1.0, contains 200 professional task cases covering four knowledge-intensive fields: investment banking, management consulting, law, and primary medical care. Each task corresponds to analysis, judgment, and documentation work that would take a professional 1–8 hours to complete, and is accompanied by citable evidence and interpretable, fine-grained scoring criteria for objectively measuring the quality of the model output.
