18-Year-Old Dropout Entrepreneur Open-Sources 10K-Hour Factory Vision Dataset

At just 18 years old, entrepreneur Eddy Xu has made a bold move in the AI and robotics world by launching Build AI and open-sourcing Egocentric-10K, billed as the largest first-person video dataset ever created. The dataset contains 10,000 hours of real-world footage captured from the perspective of factory workers, recorded in actual industrial environments using head-mounted cameras. Totaling 16.4 TB and more than 10.8 billion frames, it has been released on Hugging Face under the permissive Apache 2.0 license, which allows free commercial use and modification. In a post on X, Xu declared, “The era of data expansion for robot learning has arrived. The largest first-person dataset in history is now open.”

Unlike lab-based or simulated data, Egocentric-10K captures authentic workflows, ranging from part machining and sorting to assembly, packaging, and quality inspection, offering unprecedented realism for training AI systems. Data analysis reveals that 96.42% of the recorded tasks involve at least one hand, 76.34% require coordinated two-handed actions, and 91.66% include active object manipulation, figures that significantly surpass prior benchmarks: Ego4D reports a hand visibility rate of 67.33%, while EPIC-KITCHENS reaches 90.37%.

All videos are stored as full-HD MP4 files and organized by factory and worker. Each clip comes with detailed JSON metadata, including factory ID, worker ID, duration, resolution, and frame rate. The dataset is packaged in the WebDataset format, enabling efficient streaming and partial downloads, which is ideal for researchers who want to train models on specific environments or worker behaviors (one way to stream a single shard is sketched below).

Eddy Xu’s journey is nothing short of extraordinary. In 2021, while still in middle school, he led team 1569A OMEGA from Great Neck to a 32nd-place finish at the VEX Robotics World Championship, out of some 20,000 teams, despite operating from a basement with no formal coaching or funding. He later attended the Miller School of Albemarle, where he became a teaching assistant in computer programming and developed engineering tools for the school’s concrete canoe team. Self-taught in Java and Python, he passed the AP Computer Science exam during high school. His entrepreneurial streak led him to raise $120,000 to build a competitive robotics team, win the National Signature Championship, and triumph in DECA’s global business competition among 200,000 participants. He also built and sold an edtech startup with 178,000 users in just three months.

In early 2025, while still a student at Columbia University, Xu developed an AI-powered chess system using Meta’s smart glasses. The system used computer vision to read the board and suggested optimal moves in real time, gaining viral attention online. That same year, he dropped out to found Build AI. On his personal website, he states that he turned down over $25 million in equity offers to launch the company. His team includes former unicorn founders, robotics world champions, and researchers from top labs.

Build AI’s mission is clear: “Build physical superintelligence to bring abundance to everyone.” The company recently raised $5 million in seed funding from Abstract Ventures, Pear VC, and HF0, with additional support from ZFellows and chess streamer Alex Botez, who used Xu’s AI glasses in a public demo. The company positions itself as the first to focus exclusively on collecting and scaling “economically useful egocentric human data.” The core idea: record real human actions in real environments, then use that data to teach robots how to do the same.
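To make that packaging concrete, here is a minimal sketch of streaming a single shard with the webdataset Python library. The repo id, the factory/worker shard path, and the sample keys are assumptions inferred from the description above, not documented specifics; check the dataset card on Hugging Face for the actual layout.

```python
import json

import webdataset as wds

# Hypothetical shard URL: the repo id and the factory/worker directory
# layout are assumptions based on the article, not a documented path.
url = ("https://huggingface.co/datasets/builddotai/Egocentric-10K"
       "/resolve/main/factory_0001/worker_0001/000000.tar")

# WebDataset streams the tar archive over HTTP, so only this shard is
# fetched rather than the full 16.4 TB corpus.
dataset = wds.WebDataset(url)

for sample in dataset:
    meta = json.loads(sample["json"])  # per-clip metadata: factory ID, worker ID, duration, ...
    video = sample["mp4"]              # raw full-HD MP4 bytes
    print(meta.get("factory_id"), meta.get("worker_id"), len(video))
    break  # inspect only the first clip
```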
Egocentric vision, which captures the world from the user’s point of view, differs from traditional third-person surveillance. It reveals hand movements, gaze direction, and body-environment interaction, all critical for understanding dexterous tasks. Meta’s EgoMimic project and Figure AI’s “Project Go-Big” have both shown that first-person data significantly improves robot task performance and generalization. Figure AI, for instance, is collecting egocentric video from over 100,000 homes in partnership with Brookfield, aiming to train its Figure 03 robot via “zero-shot human-to-robot transfer”: teaching robots to navigate and act by watching people, without needing custom robot data.

While human video data is abundant and scalable, it faces the “embodiment gap”: human and robot bodies are structurally different, making action transfer complex. In contrast, data from real robot interactions avoids this issue but is expensive and hard to scale. Generalist AI, for example, claims to have trained its GEN-0 model on over 270,000 hours of robot-generated data, growing at 10,000 hours per week.

Xu has said Build AI has already collected more egocentric data than any company in history, but details about data access and usage remain limited. The company acknowledges the venture carries high technical risk and low odds of success. Yet its vision is ambitious: “If we’re right, we’ll advance robotics research and fundamentally improve the lives of billions.” The team, composed of world-class talent, is driven by urgency, audacity, and technical excellence.

Egocentric-10K’s full dataset and a 30,000-frame evaluation subset are now publicly available on Hugging Face. Researchers can load it directly via the datasets library in Python and select data from specific factories or workers, as sketched below. Build AI confirms the dataset is still growing in scale and quality.
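As a minimal sketch of that workflow with the Hugging Face datasets library, assuming the repo id is builddotai/Egocentric-10K and that samples expose the metadata fields listed earlier (the field names here are illustrative):

```python
from datasets import load_dataset

# Stream instead of downloading: the full corpus is 16.4 TB.
ds = load_dataset("builddotai/Egocentric-10K", split="train", streaming=True)

# Narrow the stream to one factory. "factory_id" is an assumed field
# name based on the metadata fields described in the article.
factory_ds = ds.filter(lambda s: s["factory_id"] == "factory_0001")

# Peek at the metadata of a few clips.
for sample in factory_ds.take(3):
    print(sample["worker_id"], sample["duration"], sample["resolution"])
```

Streaming plus per-factory filtering mirrors the partial-download use case the WebDataset packaging is designed for.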
