HyperAIHyperAI

Command Palette

Search for a command to run...

Google Launches Gemini 3, Its Most Intelligent AI Model Yet with Record Performance

Google has officially launched Gemini 3, its most advanced AI model to date, marking a major leap in reasoning, multimodal understanding, and agentic capabilities. Available immediately in the Gemini app and Google Search’s AI Mode, Gemini 3 is the first model of its kind to be rolled out to Search on day one—a strategic move underscoring Google’s confidence in its performance. The model is now accessible to Google AI Pro and Ultra subscribers in the U.S., with broader availability expected soon. Gemini 3 is built on a sparse mixture-of-experts (MoE) architecture and excels across key benchmarks. It leads the LMArena leaderboard with a 1501 ELO rating, outperforming competitors in user satisfaction. On the Humanity’s Last Exam, a test of high-level reasoning, it achieves 37.5% accuracy—surpassing GPT-5 Pro’s 31.64. In GPQA Diamond, a graduate-level science benchmark, it scores 91.9%, while in MathArena Apex, it reaches 23.4%, setting a new record. A more powerful variant, Gemini 3 Deep Think, is in final safety review and will soon be available to Google AI Ultra users, with even higher scores: 41.0% on Humanity’s Last Exam and 93.8% on GPQA Diamond. The model’s multimodal strength is also significantly enhanced. It scores 81% on MMMU-Pro and 87.6% on Video-MMMU, demonstrating superior understanding of complex visual and video content. In fact-checking tests like SimpleQA Verified, it achieves 72.1% accuracy, showing improved reliability. One of the most notable features is the new generative UI in AI Mode, which dynamically creates custom visual layouts, interactive tools, and real-time simulations based on user queries. For example, a search for “how RNA polymerase works” now returns an interactive animation and a working simulation, not just static links. This is made possible by Gemini 3’s ability to code and render tools on the fly, turning search into an active, visual experience. Gemini 3 also excels in coding. It leads the LiveCodeBench Pro benchmark with 2439 points, outpacing GPT-5.1-high by nearly 200 points. In SWE-bench Verified, it scores 76.2%, and in Terminal-Bench 2.0, it achieves 54.2%. To support this, Google has introduced Antigravity, a new agentic development platform that lets AI agents autonomously code, test, and deploy applications across editor, terminal, and browser—resembling advanced IDEs like Warp and Cursor. The model’s 1 million token context window enables deep analysis of long documents, videos, and complex workflows. It can parse handwritten recipes, convert lectures into interactive flashcards, analyze sports footage, and generate personalized training plans. Google’s vertical integration gives it a key advantage: Gemini 3 was trained on Google’s latest Trillium TPU chips, delivering 512 TOPS of AI compute with 67% lower energy use than previous generations. This hardware-software synergy enables faster, more efficient training and deployment. Gemini 3 is also designed to be more truthful and less prone to “sycophancy”—avoiding flattery or confirmation bias. It responds with clarity and insight, prioritizing accuracy over pleasing the user. With deep integration across Search, Gemini, AI Studio, Vertex AI, and Antigravity, Google is leveraging its ecosystem to deliver a seamless, intelligent experience. As DeepMind CEO Demis Hassabis noted, the goal is not just advanced AI, but AI that becomes deeply personalized and context-aware—connected to Gmail, calendars, and daily workflows—bringing Google closer to truly indispensable AI assistants. Gemini 3 represents a pivotal moment in AI evolution: a model that is not only smarter but more capable of acting, creating, and understanding at scale—powered by Google’s unmatched infrastructure and ecosystem.

Related Links