AI scientists improving but hit fundamental limits

New AI systems designed to assist scientific research are demonstrating impressive capabilities while simultaneously revealing fundamental limitations. Two recent studies published in Nature introduce Robin, developed by the non-profit Future House, and Co-Scientist, created by Google DeepMind. These tools represent a shift toward using large language models to collaborate directly with human scientists, rather than attempting to replace them entirely.

Both systems operate as multi-agent architectures, deploying specialized digital agents coordinated by a central supervisor. Co-Scientist organizes its agents around abstract cognitive tasks meant to mirror human scientific reasoning. It features a reflection agent that acts as a critical peer reviewer and ranking agents that debate research hypotheses in simulated tournaments to evaluate their merit. In contrast, Robin is tailored for specific biomedical challenges, particularly drug repurposing. Its agents focus on selecting experimental tests and analyzing complex data to identify new treatments for diseases.

In experimental trials, Co-Scientist identified 30 drug candidates for acute myeloid leukemia. After human oncologists refined this list, testing confirmed that three candidates showed positive results, with one appearing particularly promising. The system also demonstrated potential in exploring complex drug combinations. Similarly, Robin proposed 30 drug candidates for dry age-related macular degeneration. After human scientists selected the top five for testing and adjusted several experimental suggestions, two drugs emerged as promising options.

Despite these successes, the studies highlight critical boundaries. Neither system conducted direct physical validation of its hypotheses, relying instead on human experts to define research questions, verify predictions, and prioritize investigations.
Furthermore, Co-Scientist's performance was not benchmarked against established, specialized computational methods for drug repurposing, leaving questions about its comparative efficiency. Internal testing also revealed that while some agents excel at reviewing past research, others struggle with statistical questions and bioinformatics without significant human prompting.

These findings reinforce the view that while AI can accelerate the generation of ideas and analysis of existing literature, language alone is insufficient for the full scientific process. Scientific inquiry requires precise, quantitative data and the modeling of complex physical systems, areas where text-based models remain imprecise. The current trend shows AI tools acting as powerful assistants that navigate vast documentation and integrate dispersed information. However, true effectiveness requires models that can bridge the gap between linguistic descriptions and structured data, such as genomic sequences and protein structures. Ultimately, AI co-scientists will only become fully transformative when they move beyond connecting words to modeling the intricate realities of the natural world.
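
For readers curious how ranking agents might "debate research hypotheses in simulated tournaments," the following is a minimal Python sketch of one possible approach. It is an illustration only, not the method described in the studies: the Elo-style rating update, the function names `expected_score`, `run_tournament`, and `toy_judge`, and the placeholder hypothesis strings are all assumptions introduced here. In a real system, the judge would presumably be a language-model agent comparing two hypotheses, whereas the stand-in below simply prefers the longer string so that the example is deterministic.

```python
from itertools import combinations


def expected_score(rating_a, rating_b):
    """Elo-style probability that A beats B (an assumed rating scheme)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def run_tournament(hypotheses, judge, k=32.0, start=1200.0):
    """Run one round-robin pass and return hypotheses ranked best first.

    `judge(a, b)` must return whichever of the two hypotheses it prefers.
    """
    ratings = {h: start for h in hypotheses}
    for a, b in combinations(hypotheses, 2):
        winner = judge(a, b)
        score_a = 1.0 if winner == a else 0.0
        exp_a = expected_score(ratings[a], ratings[b])
        # The winner gains rating and the loser gives up the same amount.
        ratings[a] += k * (score_a - exp_a)
        ratings[b] -= k * (score_a - exp_a)
    return sorted(ratings, key=ratings.get, reverse=True)


def toy_judge(a, b):
    """Hypothetical stand-in for a debating agent: prefers the longer text."""
    return a if len(a) >= len(b) else b


# Placeholder strings, not real hypotheses from either study.
candidates = ["short", "medium text", "longest text here"]
ranked = run_tournament(candidates, toy_judge)
print(ranked)
```

The design point this sketch tries to capture is that pairwise comparisons let a system rank many hypotheses without assigning each an absolute score. As the article notes, the resulting ranking would still need human review and experimental validation.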
