AI Exploits Protein Design Tools to Evade Biosecurity Screening
Recent research has revealed a significant vulnerability in the biosecurity systems that screen DNA synthesis orders: AI protein design tools can be used to design harmful proteins that evade detection. A collaborative team led by Microsoft, working with academic institutions, biotech companies, and biosecurity experts, uncovered what it describes as a "zero-day vulnerability" in existing safety protocols. The study, published in Science, shows that AI-powered protein design tools can generate thousands of modified versions of known dangerous proteins, designed to retain their toxic function while altering their sequences enough to bypass current screening systems.

The team used three open-source AI models (ProteinMPNN, EvoDiff-MSA, and EvoDiff-Seq) to redesign 72 known hazardous proteins, including toxins such as ricin and botulinum neurotoxin. The goal was to create variants that keep the original protein’s 3D structure and biological activity but differ significantly in amino acid sequence. The result was 76,080 synthetic protein variants generated entirely in silico; no physical DNA or proteins were synthesized.

These variants were tested against four widely used biosecurity screening systems employed by DNA synthesis companies. The systems compare incoming sequences to known dangerous ones using methods such as pattern matching, hidden Markov models, and neural network-based alignment. The study found that between 30% and 70% of the AI-generated variants slipped through undetected despite having high structural similarity to the original toxins, which indicates a serious gap in current defenses.

To evaluate the threat level without physical experimentation, researchers used OpenFold, a structure prediction tool, to calculate two key metrics: TM-score (measuring structural similarity) and pLDDT (assessing prediction confidence).
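For readers unfamiliar with the structural-similarity metric mentioned above, here is a minimal sketch of the standard TM-score formula. It is not the study's pipeline. It assumes the two structures have already been aligned and superimposed, so that a distance in angstroms is available for each aligned residue pair; real tools also search over superpositions to maximize the score. The function names `tm_d0` and `tm_score` are hypothetical, and the 0.5 Å floor for short chains is an assumption of this sketch.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (angstroms) used by TM-score."""
    # Assumption of this sketch: use a fixed floor for very short chains,
    # where the cube-root formula would be tiny or undefined.
    if target_length <= 21:
        return 0.5
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances: list[float], target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: distance in angstroms for each aligned residue pair.
    target_length: number of residues in the reference protein;
    residues with no aligned partner contribute zero to the sum.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Toy distances, not real data: 100 aligned pairs, each 1 angstrom apart.
    print(round(tm_score([1.0] * 100, 100), 3))
```

The score lies between 0 and 1, and normalizing by the reference length means a variant that matches only half of the reference perfectly scores 0.5 rather than 1.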
High TM-score and pLDDT values indicated that the AI-generated variants were likely to retain the harmful function of their original counterparts.

The findings were alarming but not without a solution. The team developed a set of software "patches" and updated screening strategies to improve detection. Working closely with companies such as Twist Bioscience, Integrated DNA Technologies, and Aclid, they deployed these upgrades across multiple platforms. After the updates, the average false-negative rate dropped to 3%, with one system reaching a 1% error rate.

Despite this progress, the study found that no system could detect all high-risk variants. Some false negatives occurred because the AI-generated sequences were more similar to benign natural proteins than to the original toxins, leading the system to classify them as safe. Others stemmed from ambiguity in threat classification, such as when a protein component is not inherently toxic but enhances the danger of another molecule. In addition, some screening tools performed poorly on specific protein types because of suboptimal training data or parameters.

The research adopted a red-teaming strategy borrowed from cybersecurity, in which teams simulate attacks to expose weaknesses before malicious actors can exploit them. This proactive, confidential collaboration between academia, industry, and tech leaders mirrors how zero-day vulnerabilities are handled in software, and it highlights the growing need for coordinated, rapid-response frameworks in biosecurity.

Microsoft’s Chief Scientific Officer Eric Horvitz, who led the project, emphasized its dual goals: to test whether AI could be used to evade detection, and to build a model for swift, responsible response. “We want to identify threats early, fix them quickly, and work quietly with stakeholders to strengthen defenses,” he said.

Twist Bioscience, a key partner in the study, has been at the forefront of high-throughput DNA synthesis.
Founded in 2013 by Emily Leproust, Bill Banyai, and Bill Peck, the company developed a silicon-based platform capable of synthesizing up to 9,600 genes simultaneously on a single chip, far surpassing traditional methods. With an error rate as low as 1 in 7,500, Twist’s technology is among the most accurate in the field. Leproust, a pioneer in synthetic biology, previously worked at Agilent Technologies before founding Twist. The company faced a 2016 lawsuit from Agilent over alleged intellectual property theft, which was settled in 2020 with a $22.5 million payment. A separate class-action lawsuit is currently under investigation.

This study marks a turning point in biosecurity research, moving it from theoretical concern to a practical, measurable challenge. By applying cybersecurity principles to biological data, the team has not only exposed a critical flaw but also provided a working model for defending against future AI-driven threats. However, as AI continues to advance, the arms race between protein design and detection will persist. The research underscores that while progress is being made, vigilance and continuous innovation remain essential in safeguarding the future of synthetic biology.