Gemini 2.5 Is Fully Updated: Backed by Deep Think, It Outperforms OpenAI

Google I/O 2025 opened as scheduled in the early hours of May 21, Beijing time. In his keynote, CEO Sundar Pichai announced a series of major updates that showcased Google's strength and pace of progress in AI.

"Normally, we don't reveal too much information in the weeks before I/O, because we save the most important models for release at the conference. But in the Gemini era, we are likely to launch the smartest model on a Tuesday in March, or announce exciting breakthroughs like AlphaEvolve a week in advance," said Sundar Pichai. Indeed, readers who follow Google will know that it unveiled the landmark AlphaEvolve shortly before the conference, raising expectations for I/O.

In the keynote that just ended, Pichai did not disappoint. In addition to a series of Gemini updates, he presented the latest developments in Imagen 4, Veo 3, head-mounted displays, XR glasses, and other products. This article covers the key updates.

Gemini 2.5 full series update

Deep Think is powerful

The Gemini 2.5 update was expected, yet it still brought surprises. In March, Google launched Gemini 2.5 Pro, its most intelligent model to date, and two weeks ago it shipped an updated Gemini 2.5 Pro Preview to developers. The model has since taken the top spot on a number of large-model leaderboards.

For example, it scored 1415 on the coding benchmark WebDev Arena, topping the leaderboard.

To further explore Gemini's thinking capabilities, Google has begun testing an enhanced reasoning mode called Deep Think. This mode uses new research techniques that let the model consider multiple hypotheses before responding.

In terms of results, Gemini 2.5 Pro with Deep Think performs well on several difficult benchmarks, surpassing OpenAI o3 and o4-mini. These include:

* Excellent results on the 2025 USAMO (United States of America Mathematical Olympiad);

* A leading score on LiveCodeBench, a difficult benchmark for competition-level programming;

* A score of 84.0% on MMMU (Massive Multi-discipline Multimodal Understanding), demonstrating strong multimodal reasoning.

In addition, the Gemma 3 series has been updated to meet the AI needs of mobile devices. Working with Qualcomm, MediaTek, Samsung, and other manufacturers, Google introduced Gemma 3n, built on a new cutting-edge architecture. It uses Per-Layer Embeddings (PLE), a Google DeepMind innovation, to significantly reduce memory usage. Although the models have raw parameter counts of 5 billion (5B) and 8 billion (8B), PLE lets them run on mobile devices, or serve real-time inference from the cloud, with a memory overhead comparable to that of 2 billion (2B) and 4 billion (4B) parameter models. In other words, they need only 2GB or 3GB of dynamic memory to run.
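To put the Gemma 3n figures side by side, here is a minimal Python sketch that works out the memory per parameter they imply. It uses only the numbers quoted above. It assumes 1GB means 10^9 bytes, and the names `GEMMA_3N_FIGURES`, `bytes_per_param`, and `summarize` are illustrative rather than part of any Google API. It says nothing about how PLE works internally.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# Assumption: 1 GB = 10**9 bytes (the article does not say which unit it uses).
GB = 10**9

# label -> (raw parameters, effective parameters, quoted dynamic memory in bytes)
GEMMA_3N_FIGURES = {
    "5B raw / 2B effective": (5e9, 2e9, 2 * GB),
    "8B raw / 4B effective": (8e9, 4e9, 3 * GB),
}


def bytes_per_param(memory_bytes: float, params: float) -> float:
    """Average bytes of dynamic memory per parameter."""
    return memory_bytes / params


def summarize(figures: dict) -> dict:
    """Compute implied bytes per parameter and the effective-to-raw ratio."""
    rows = {}
    for label, (raw, effective, memory) in figures.items():
        rows[label] = {
            "bytes_per_raw_param": bytes_per_param(memory, raw),
            "bytes_per_effective_param": bytes_per_param(memory, effective),
            "effective_to_raw_ratio": effective / raw,
        }
    return rows


if __name__ == "__main__":
    for label, row in summarize(GEMMA_3N_FIGURES).items():
        print(label, row)
```

Under these assumptions, the quoted memory works out to about 1 byte or less per effective parameter, and well under half a byte per raw parameter. That is consistent with the article's claim that the larger models fit a footprint normally associated with much smaller ones.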

Veo 3 and Imagen 4, inspire creativity

Compared with the previous generation, Veo 3 not only improves video quality significantly; more importantly, it generates video and audio together for the first time. Whether it's the sound of traffic on a city street, birds singing in a park, or even dialogue between characters, Veo 3 can add audio automatically based on text prompts or user needs.

The model also handles real-world physics and lip sync well, and can turn complex scene descriptions into dynamic video. Veo 3 is now live: Ultra subscribers in the United States can try it in the Gemini app and in Flow, while enterprise users get access through the Vertex AI platform.

Imagen 4 is another highlight of this release. It keeps image generation fast while further improving fine detail: intricate fabric weaves, water droplets, and animal fur are all rendered with high fidelity.

Imagen 4 also handles both photorealistic and abstract styles well, and can generate high-quality images suitable for print, display, and other uses. Notably, its text rendering and typography have improved greatly, which makes it well suited to greeting cards, posters, and even comics. Imagen 4 is now available in Gemini, Whisk, Vertex AI, and in Slides, Vids, Docs, and other Google Workspace apps.

HyperAI
