APEX AI Productivity Benchmark Dataset

Date

3 days ago

Organization

Harvard Law School
Mercor
The Scripps Research Institute

Publish URL

www.mercor.com

Paper URL

2509.25721

APEX is a benchmark dataset first released in 2025 by the Mercor research team in collaboration with Harvard Law School and the Scripps Research Institute. It evaluates the performance of cutting-edge artificial intelligence models on knowledge work of high economic value. The related research paper is titled "The AI Productivity Index (APEX)". Its goal is to measure how cutting-edge AI models perform on real-world economic tasks, rather than focusing only on abstract reasoning.

The current version of the dataset, APEX-v1.0, contains 200 professional knowledge-work tasks of high economic value, covering four knowledge-intensive fields: investment banking, management consulting, law, and primary medical care. Each task corresponds to analysis, judgment, and documentation work that would take a professional 1–8 hours to complete in practice, and is accompanied by citable evidence and interpretable, fine-grained scoring criteria for objectively measuring the quality of model output.

Dataset construction process
