Xaira Therapeutics Releases X-Atlas/Orion: Largest Genome-Wide Perturb-seq Dataset to Advance AI in Biological Research and Drug Discovery
Xaira Therapeutics, a biotechnology company based in South San Francisco, California, has released "X-Atlas/Orion," the largest publicly available Perturb-seq atlas and a significant advance for AI-driven virtual cell modeling. The recently announced release addresses a critical gap in the field by providing a large repository of high-quality gene perturbation data, which is essential for advancing biological foundation models and improving our understanding of disease mechanisms.
The X-Atlas/Orion dataset comprises 8 million cells, with perturbations targeting all human protein-coding genes, sequenced to a depth of more than 16,000 unique molecular identifiers (UMIs) per cell. It was generated using Xaira's proprietary "Fix-Cryopreserve-ScRNAseq" (FiCS) Perturb-seq platform, which combines the Chromium platform from 10x Genomics with innovative techniques to overcome the logistical challenges of large-scale data generation. The FiCS platform provides the sensitivity, scalability, and reproducibility needed to capture perturbation-induced transcriptomic changes accurately.
One of the key innovations in Xaira's work is the ability to detect dose-dependent genetic effects. Traditional Perturb-seq studies treat a gene knockdown as binary (either "on" or "off"), which limits the depth of understanding. Xaira's method instead measures the amount of single guide RNA (sgRNA) detected in each cell and uses it as an indicator of how strongly the target gene is suppressed. Treating knockdown strength as a continuous variable gives a more nuanced and detailed picture of genetic function, which can enhance the predictive power of biological models.
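To make the contrast between a binary and a dose-dependent readout concrete, here is a minimal Python sketch. It is not Xaira's analysis pipeline: the function names `binary_effect` and `dose_response_slope` are hypothetical, the toy numbers are invented for illustration, and the log-linear relationship between sgRNA UMI count and target expression is an assumption made only to keep the example simple.

```python
import math

def binary_effect(sgrna_umis, target_expr):
    """Binary view: mean expression in guide-positive cells minus guide-negative cells."""
    pos = [e for g, e in zip(sgrna_umis, target_expr) if g > 0]
    neg = [e for g, e in zip(sgrna_umis, target_expr) if g == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

def dose_response_slope(sgrna_umis, target_expr):
    """Continuous view: least-squares slope of expression on log1p(sgRNA UMI count)."""
    x = [math.log1p(g) for g in sgrna_umis]
    mean_x = sum(x) / len(x)
    mean_y = sum(target_expr) / len(target_expr)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, target_expr))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Invented toy data: six cells, sgRNA UMI counts and expression of the targeted gene.
sgrna_umis = [0, 0, 1, 3, 7, 15]
target_expr = [10.0, 10.0, 8.0, 6.0, 4.0, 2.0]

print("binary effect:", binary_effect(sgrna_umis, target_expr))
print("dose-response slope:", dose_response_slope(sgrna_umis, target_expr))
```

The binary comparison collapses all guide-positive cells into one average, while the slope retains the information that cells with more detected sgRNA show stronger suppression. A real analysis would also need normalization, control guides, and a statistical model suited to count data, none of which are shown here.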
According to Ci Chu, vice president of early discovery at Xaira, “This industrialized platform and the Orion dataset will empower scientists to build more predictive models of complex biology, helping us better understand disease biology and discover drug targets.” Bo Wang, SVP and head of biomedical AI at Xaira, adds, “With the scale and quality provided by this dataset, we are better equipped to model how cells respond across different conditions, a crucial step in training the first generation of virtual cell models.”
The release of X-Atlas/Orion comes during a landmark period for Xaira Therapeutics, which launched in April 2024. In October 2024, co-founder Dr. David Baker was awarded the Nobel Prize in Chemistry for computational protein design, sharing the prize with Drs. Demis Hassabis and John Jumper of Google DeepMind, who were recognized for AI-driven protein structure prediction. This recognition underscores Xaira’s commitment to integrating cutting-edge AI with biological research to drive therapeutic innovation.
Xaira Therapeutics aims to transform drug discovery and development by combining expertise in machine learning, data generation, and therapeutic product development. The company's goal is to create more effective therapies by using advanced models to understand biological targets and to engineer molecules that act directly on disease. The FiCS Perturb-seq platform and the X-Atlas/Orion dataset are significant steps toward this vision, giving researchers access to data that can refine their models and accelerate discoveries.
The publicly accessible X-Atlas/Orion dataset is available at: https://doi.org/10.25452/figshare.plus.29190726. For readers interested in the detailed methods and findings, the preprint publication can be found at: https://www.biorxiv.org/content/10.1101/2025.06.11.659105v1.
Industry experts view the release of X-Atlas/Orion as a game-changer. The dataset’s scale and quality are expected to catalyze the development of more sophisticated and accurate virtual cell models that better simulate cellular responses under various conditions. This could lead to faster and more reliable identification of drug targets and a deeper understanding of disease biology. With its integrated approach and recent accolades, Xaira Therapeutics is positioned at the forefront of the convergence of biotechnology and AI.