New AI Tool Rivals AlphaFold 3 in Mapping RNA Structures
Researchers at Virginia Tech have developed RNAbpFlow, a novel artificial intelligence system capable of mapping the complex three-dimensional structures of RNA molecules with accuracy rivaling industry benchmarks. Published recently in Nature Methods, the tool addresses a long-standing bottleneck in computational biology by predicting RNA folding using significantly less training data than existing methods.
Led by doctoral student Sumit Tarafder and senior author Associate Professor Debswapna Bhattacharya, the team engineered RNAbpFlow to leverage flow matching, a generative AI technique traditionally associated with image synthesis. Unlike dominant models that rely on vast evolutionary sequence databases to infer structure, RNAbpFlow generates complete, all-atom 3D conformations directly from nucleotide sequences and base-pair interactions. In a rigorous blind benchmark against AlphaFold 3, the Virginia Tech model correctly predicted the overall structure for twelve of fourteen RNA targets, outperforming Google DeepMind's system, which scored eight.
The approach starts from random noise and progressively refines the molecular geometry, allowing researchers to sample dynamic structural states that static predictions often miss. This data efficiency is particularly critical for RNA, which exhibits high structural flexibility and suffers from sparse representation in existing genomic repositories.
The ability to accurately visualize RNA folding has direct implications for precision medicine. Many therapeutic strategies, including the approved drug risdiplam for spinal muscular atrophy, depend on small molecules binding to specific RNA conformations. Current drug discovery pipelines are frequently stalled by the inability to reliably model these target structures. RNAbpFlow's capacity to rapidly identify binding pockets could accelerate the development of treatments for neurodegenerative disorders, oncology targets, and viral infections.
The researchers acknowledge that the model's performance may lag behind evolutionary-data-dependent tools when processing larger, highly complex RNAs. Nevertheless, its strengths in low-data regimes make it a valuable addition to the structural biology toolkit. To support reproducible research, the team has publicly released the full implementation, training datasets, and source code. Tarafder is currently integrating improvements into a next-generation iteration scheduled for presentation at the upcoming CASP community prediction competition. The project received funding from the National Institutes of Health and the National Science Foundation.
