AI Agents Still Can't Replace Human Consultants—Yet, Says Mercor CEO
Brendan Foody, CEO of Mercor, a company specializing in AI training, says that while current AI agents cannot yet replace human consultants, they are advancing rapidly and could soon do so. He also hinted that an initial public offering (IPO) for Mercor may be on the horizon.

Mercor recently conducted a rigorous test of leading AI models acting as agents across real-world consulting, banking, and legal tasks. The results, part of the APEX-Agents benchmark, showed that AI agents completed tasks less than 25% of the time on the first attempt. Even with up to eight tries, success rates reached only 40%. The most advanced models, including OpenAI’s GPT-5.2 and Anthropic’s Opus 4.6, performed best in management consulting, with GPT-5.2 achieving nearly 23% success and Opus 4.6 reaching 33%, a significant jump from earlier versions like GPT-3, which managed only 3%.

The tasks were designed to mirror actual consulting work, based on input from top firms such as McKinsey, BCG, Deloitte, Accenture, and EY. One example involved analyzing market penetration using a specific methodology, which the AI agents consistently failed to execute correctly.

Foody explained that while AI excels at research and data analysis, it struggles with complex, multi-step tasks that require long-term planning and contextual awareness. Unlike humans, AI agents often fail to navigate file systems effectively, leading them to retrieve incorrect or irrelevant information. They also have difficulty coordinating multiple tools and cross-referencing documents across different sources. Tasks that can be completed in under an hour, or that require only a single tool, are handled much better. Foody likened the current state of AI agents to that of junior interns: capable but inconsistent, and requiring significant human oversight.

Frank Jones, a former KPMG consultant now working with Mercor, echoed these concerns.
He noted that AI models often miss subtle cues in consulting language, such as the phrase “client-ready,” which humans understand intuitively but which AI grasps only with highly specific prompting.

Despite these limitations, Foody is confident that AI will soon displace many entry-level consulting roles. He attributes the rapid progress to better training data and increased investment from leading AI labs rather than to breakthroughs in architecture. Mercor, which counts OpenAI, Anthropic, and Meta among its clients, grew its revenue by 4,658% in 2025 and recently secured a funding round valuing the company at $10 billion.

Looking ahead, Mercor plans to expand its benchmark to evaluate entire professional services firms, not just individual analysts, potentially showing that AI systems could eventually replicate the full value chain of a firm like McKinsey. Foody believes that within the next two years, chatbots could match the performance of the best consulting firms.

For now, firms like McKinsey can still claim that AI enhances efficiency without replacing people. But the next version of the benchmark, Foody warns, will tell a very different story, one that could reshape the future of professional services.
