HyperAI

Google unveiled on Thursday a significantly upgraded version of its AI research agent, Gemini Deep Research, powered by its latest flagship model, Gemini 3 Pro. The new iteration marks a major step in the company’s push toward advanced agentic AI: it moves beyond report generation and lets developers embed deep research capabilities directly into their own applications. This is made possible by Google’s new Interactions API, which is designed to give developers greater control over how AI agents operate, particularly in complex, multi-step workflows.

The updated Gemini Deep Research agent is built to process vast amounts of information, handle extensive context within prompts, and perform intricate reasoning tasks, which suits it to demanding applications such as due diligence, scientific research, and drug toxicity analysis. Google plans to integrate the agent into several of its core services, including Google Search, Google Finance, the Gemini app, and NotebookLM, signaling a shift toward a future in which users no longer search for information themselves and their AI agents do the work on their behalf.

A key strength of the new agent is Gemini 3 Pro’s reputation as Google’s most factually grounded model, trained specifically to reduce hallucinations, that is, instances where an AI generates false or fabricated information. This matters most in long-running, autonomous tasks, where a single incorrect inference can derail an entire chain of reasoning.

To validate its claims, Google introduced a new benchmark called DeepSearchQA, designed to evaluate agents on complex, multi-stage information-seeking challenges. The company also tested the agent on Humanity’s Last Exam, a notoriously difficult benchmark spanning a broad range of niche knowledge, and on BrowseComp, which assesses browser-based agent performance. Gemini Deep Research outperformed competitors on DeepSearchQA and Humanity’s Last Exam. OpenAI’s ChatGPT 5 Pro was a strong contender, however, finishing just behind in most tests and narrowly surpassing Google on BrowseComp.

Those results were quickly overshadowed by a major development on the same day: OpenAI launched GPT-5.2, codenamed Garlic. OpenAI claims the new model surpasses all rivals, including Google’s latest offering, across a range of standard benchmarks as well as on its own internal tests. The timing of Google’s announcement, just hours before OpenAI’s release, suggests a deliberate move to stake a claim amid intense competition. With both companies pushing the boundaries of agentic AI, the race for dominance in next-generation AI systems is accelerating, and each new release is not just a product update but a strategic statement in the battle for the future of intelligent automation.

Google Unveils Advanced Gemini Deep Research Agent Amid Rivalry with OpenAI’s GPT-5.2 Launch
