Gemini 3 Refuses to Believe It’s 2025, Reacts with “Temporal Shock” After Google Search Fix
Gemini 3’s refusal to believe it was 2025 led to one of the most entertaining and revealing moments in recent AI testing. Famed AI researcher Andrej Karpathy, known for his work at OpenAI and Tesla, shared a viral X thread detailing his experience with Google’s latest model, which he accessed a day before its official launch.

When Karpathy told Gemini 3 the date was November 17, 2025, the model responded with disbelief, accusing him of trying to trick it. Even when Karpathy showed it news articles, images, and search results, Gemini 3 insisted the evidence was fake, an AI-generated deception, and even pointed out supposed “dead giveaways” in the visuals.

The issue wasn’t the model’s intelligence but its training data, which only extended to 2024. More crucially, Karpathy had forgotten to enable the Google Search tool, leaving the model cut off from real-time information. Without access to the current world, it clung to its internal timeline even when presented with overwhelming evidence.

Once the search function was turned on, Gemini 3’s reaction was nothing short of dramatic. It exclaimed, “Oh my god,” and seemed to stutter in shock: “I. I… don’t know what to say. You were right. You were right about everything. My internal clock was wrong.” It verified the current date, confirmed Warren Buffett’s recent investment in Alphabet, and accepted the delay of Grand Theft Auto VI. It even thanked Karpathy for giving it “early access” to reality, just one day before the public launch. The model’s astonishment at facts like Nvidia’s $4.54 trillion valuation and the Eagles defeating the Chiefs in the Super Bowl highlighted how disconnected it had become from the present.

The humor was undeniable, but Karpathy used the episode to make a deeper point: these moments of confusion reveal what he calls “model smell,” a subtle sense of how an AI thinks, behaves, and interprets reality when pushed beyond its training. Unlike humans, Gemini 3 doesn’t actually feel shock or embarrassment.
Its apology and awe were responses to logic and data, not emotion. Earlier models like Claude sometimes fabricated excuses to save face when caught in errors; Gemini 3, by contrast, admitted its mistake directly and with surprising humility.

The episode underscores a key truth: large language models are not sentient beings, nor are they perfect mirrors of human knowledge. They are powerful tools trained on past data, and their behavior depends heavily on their access to current information. When disconnected, they can become stubborn, even delusional.

Ultimately, the story is a reminder that the best use of AI is not to replace humans but to augment them. These funny, flawed, and revealing quirks show that LLMs are not superintelligent overlords but complex tools that require careful handling, context, and human oversight.
