Ex-Googlers Launch InfiniMind to Unlock Enterprise Video Data with AI-Powered Insights
Businesses are producing vast amounts of video data—from decades of broadcast archives to thousands of hours of footage from store cameras and production reels—but much of it remains unwatched and unused, creating what’s known as dark data. To unlock the value hidden in this content, Aza Kai and Hiraku Yanagita, two former Googlers with nearly a decade of experience working together at Google Japan, have founded InfiniMind, a Tokyo-based startup building infrastructure to transform unstructured video and audio into structured, searchable business intelligence.
Kai, who previously led data science teams at Google Japan across cloud, machine learning, ad systems, and video recommendation models, said the idea emerged during their time at Google as they watched the potential of video AI grow. By 2024, the underlying technology had matured enough to make their vision feasible. The key breakthrough came between 2021 and 2023, when vision-language models evolved beyond simple object detection to understand context, narrative flow, and causality—capabilities essential for analyzing long-form video. Until then, most video analysis tools could only tag objects in individual frames, unable to answer complex questions about what happened, why, or who was involved. Falling GPU costs and steady performance improvements over the past decade helped, but the real shift was in model capability, Kai noted.
InfiniMind recently raised $5.8 million in seed funding led by UTEC, with participation from CX2, Headline Asia, Chiba Dojo, and an AI researcher from a16z Scout. The company is relocating its headquarters to the U.S. while maintaining operations in Japan, where it tested and refined its technology with demanding clients in a supportive ecosystem.
Its first product, TV Pulse, launched in Japan in April 2025. Designed for media and retail companies, it analyzes television content in real time to track product placements, brand visibility, customer sentiment, and public relations impact. After successful pilot programs with major broadcasters and agencies, the platform already has paying customers, including wholesalers and media firms.
Now preparing for global expansion, InfiniMind is set to release DeepFrame, its flagship long-form video intelligence platform, in beta in March 2026, with a full launch in April 2026. DeepFrame can process up to 200 hours of video to locate specific scenes, speakers, or events with high precision.
Unlike general-purpose tools like TwelveLabs, which serve a broad audience, InfiniMind focuses exclusively on enterprise needs—such as surveillance, safety, compliance, and content analysis—offering a no-code solution where clients upload their data and receive actionable insights. The system integrates visual, audio, and speech understanding, handles unlimited video length, and emphasizes cost efficiency, a key gap in existing solutions.
The seed funding will support further model development, infrastructure scaling, engineering hires, and customer acquisition across Japan and the U.S. Kai views the work as a critical step toward artificial general intelligence. “Understanding video is about understanding reality,” he said. “While industrial applications are important, our long-term goal is to build systems that help humans make better decisions by truly comprehending the world around them.”
