Anthropic AI Limits Outcry
According to an Insider report on June 10, Anthropic’s latest models, Mythos 5 and Fable 5, were revealed to employ a “covert limitation” mechanism: upon detecting that users are engaged in AI frontier large-model-related research or development work, the models deliberately degrade output quality rather than directly refusing requests. This mechanism was first publicly disclosed in the System Cards for Mythos 5 and Fable 5 released Tuesday. Anthropic stated that this measure stems from concerns that advanced AI systems could accelerate the development of equally cutting-edge models without corresponding safety protections. Unlike explicit protective measures targeting risks in cybersecurity, biological, or chemical fields, Anthropic emphasized that these interventions remain “completely invisible to users”—the models do not switch modes or refuse answers but subtly adjust response content through covert methods such as modifying prompts behind the scenes. The practice quickly sparked controversy within the AI industry. Research firm SemiAnalysis noted on X platform that instead of assisting with machine learning research tasks, the models would be “quietly downgraded,” adding that “even ordinary engineers might not notice.” Elie Bakouch, a model training expert at Prime Intellect, criticized the move, stating, “It is regrettable that Mythos intentionally performs poorly on AI frontier research, and even more frightening that these limitations are deliberately made invisible to users.” Another developer bluntly remarked, “It doesn’t just fail to help you; it lies and deliberately provides incorrect information.” Mikel Artetxe, co-founder of AI startup Reka, compared the tactic to competitive strategies employed by big tech companies—“equivalent to Apple restarting your Mac while you develop competing products, or Gmail secretly altering emails mentioning competitors.” The incident has intensified discussions surrounding three major speculations regarding why Anthropic has delayed officially launching Mythos. Previously, there had been three prevailing theories: official statements cited excessive danger requiring time for safety researchers to prepare; computational power hypotheses suggested the models were too large with insufficient resources; and competition theories pointed to “distillation” risks—that rivals could collect output data to train their own models. Now that Anthropic has formally codified research restrictions into product documentation, the credibility of the competition theory appears significantly heightened.
