HyperAI

OpenAI is forming a new team called "Applied Evals," signaling a notable shift in how AI talent is being deployed across the industry. The team, led by engineer Shyamal Anadkat, will focus on helping businesses improve complex workflows using AI, such as processing refund requests, migrating code, and handling multi-step reasoning tasks. It will also explore advancements in voice AI. The new roles offer salaries between $255,000 and $325,000 annually, plus equity.

Anadkat described evals, the evaluations that measure AI model performance, as “the most critical part of actually building AI products.” This emphasis marks a strategic pivot from pure model development to refining AI for real-world applications.

The move reflects a broader trend in the AI sector: companies are no longer just hiring engineers to build models. Instead, they’re increasingly seeking individuals with deep, practical expertise in specific domains (software engineering, writing, legal processes, or even humanities) to ensure AI systems deliver meaningful, context-aware results.

Anadkat’s team will begin with generalists and gradually bring in specialists based on demand. For example, software engineers will help with code-related tasks, while future hires may include professionals from fields like writing or law, depending on client needs.

The team will work exclusively with business customers using OpenAI’s developer platform, operating independently from consumer-facing initiatives like apps or consulting services. Collaborating closely with OpenAI’s sales and business teams, Applied Evals will prioritize projects based on customer needs and model performance gaps. The goal is to define what “good” looks like in specific contexts, moving beyond simple pass/fail evaluations to nuanced, use-case-driven assessments.

This shift is driven by the growing complexity of AI applications. While only a handful of experts can build cutting-edge models, many more are needed to adapt them effectively for real-world problems. As Michael Jacobides, a professor at London Business School, noted, the focus has evolved from basic functionality checks to understanding context and asking the right questions.

Justin Farris of Read AI said the demand for people who can translate powerful models into practical tools is surging. “There's probably a hundred people in the world that could lead a team to build these frontier models,” he said, “but there's so much work that needs to be done to take those and make them useful.” Tanmai Gopal, CEO of PromptQL, added that as AI moves from general capabilities to specialized tasks, defining success becomes increasingly intricate. “For a lot of applied use cases, the ways of working out what good is and what bad is start to become quite nuanced,” he said.

The formation of Applied Evals underscores a maturing AI industry, one where the next frontier isn’t just smarter models, but smarter integration of those models into business processes.

OpenAI Launches New "Applied Evals" Team to Bridge AI Research and Real-World Business Use
