HyperAI

An internal project at Snorkel AI, codenamed Marlin, is using feedback from approximately 1,000 human software engineers to improve the coding capabilities of Anthropic's Claude Code. The initiative aims to fine-tune the model so that its output more closely resembles the work of professional developers, at a time when AI coding tools have significantly reshaped software development. As AI companies increasingly rely on third-party contractors for specialized data labeling, the project offers a glimpse into the unseen workforce driving model improvements.

Snorkel AI hires contractors with engineering backgrounds to create prompts, review code, and evaluate AI responses. Two contractors interviewed for this report said they are paid $280 per task, with each assignment taking roughly one hour. The process involves A/B comparisons in which freelancers review code generated by two different model versions and select the preferred output according to project guidelines. The primary objective is to train Claude Code to produce simple, maintainable code that meets the requirements of the prompt.

The tasks assigned to contractors are highly specialized, going beyond general data entry to complex software engineering challenges. Workers were instructed to select real-world GitHub repositories and simulate a pull request workflow. In one scenario, contractors prompted the model to reorganize how a system stores execution metadata, making the code clearer without altering product functionality. Another task required the model to implement a security fix for MLflow, an open-source machine learning platform, that blocked command injection while still allowing legitimate package options. Contractors evaluated the resulting code for correctness, security, reliability, and maintainability. They were also asked to test how the models handle multi-turn conversation context by sending follow-up prompts.
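To make the MLflow task described above more concrete, the following is a minimal sketch of the general technique such a fix might use: validate user-supplied install arguments against an allowlist and pass them as an argument list rather than through a shell. It is not MLflow's actual code or the fix the contractors reviewed. The function name `build_install_argv`, the `ALLOWED_OPTIONS` table, and the two validation patterns are all hypothetical, chosen only for illustration.

```python
import re
import shlex

# Hypothetical allowlist: option name -> whether the option takes a value.
ALLOWED_OPTIONS = {
    "--index-url": True,
    "--extra-index-url": True,
    "--no-deps": False,
    "--upgrade": False,
}

# Conservative pattern for a requirement such as "numpy" or "numpy==1.26.4".
_REQUIREMENT = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*"
    r"(\[[A-Za-z0-9,._-]+\])?"
    r"((==|>=|<=|~=|!=|<|>)[A-Za-z0-9.*+!_-]+)?$"
)
# Conservative pattern for option values such as index URLs.
_SAFE_VALUE = re.compile(r"^[A-Za-z0-9:/._@%+=-]+$")


def build_install_argv(user_args: str) -> list[str]:
    """Turn untrusted install arguments into a validated argv list.

    Raises ValueError for anything outside the allowlist.
    """
    tokens = shlex.split(user_args)
    argv = ["python", "-m", "pip", "install"]
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("-"):
            name, sep, value = tok.partition("=")
            if name not in ALLOWED_OPTIONS:
                raise ValueError(f"option not allowed: {name}")
            if ALLOWED_OPTIONS[name]:
                if not sep:
                    # Value is supplied as the next token.
                    i += 1
                    if i >= len(tokens):
                        raise ValueError(f"missing value for {name}")
                    value = tokens[i]
                if value.startswith("-") or not _SAFE_VALUE.match(value):
                    raise ValueError(f"unsafe value for {name}: {value!r}")
                argv += [name, value]
            else:
                if sep:
                    raise ValueError(f"{name} does not take a value")
                argv.append(name)
        else:
            if not _REQUIREMENT.match(tok):
                raise ValueError(f"invalid requirement: {tok!r}")
            argv.append(tok)
        i += 1
    # The caller would run this with subprocess.run(argv, shell=False),
    # so shell metacharacters are never interpreted.
    return argv
```

Under these assumptions, `build_install_argv("numpy==1.26.4 --no-deps")` returns a plain argument list, while an input such as `"numpy; rm -rf /"` is rejected with a `ValueError` before any command is run. The point of the design is that legitimate package options remain usable while everything else fails closed.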
The industry is shifting toward higher levels of expertise for data labeling. Snorkel AI explicitly seeks candidates with advanced degrees, such as PhDs, MDs, or JDs, or equivalent professional experience. While general tasks have been automated, specialized work in fields like software engineering commands premium rates. Beyond Snorkel's reported $280 per task, platforms such as Scale AI and Mercor offer hourly rates of up to $110 for similar engineering work, and top experts at these companies can earn more than $3,000 per week.

Snorkel AI, founded in 2019 by Stanford researchers, had secured $100 million in Series D funding at a $1.3 billion valuation as of May 2025. Its client list includes major AI labs such as Google, Mistral, and Anthropic. Despite its growth and its reliance on high-value contractors, the company reduced its workforce by 13% in September. Neither Anthropic nor Snorkel provided comment for this report.

The project exemplifies a broader ecosystem of startups, including Handshake and Mercor, that employ hundreds of thousands of contractors worldwide to filter, rank, and train the outputs of AI systems ranging from self-driving cars to large language models.

Inside the unseen operation to turbocharge Claude Code
