Inocras Launches AI-Powered Cancer Foundation Model Trained on 2,882 Whole Genomes, Advancing Precision Oncology with Learnable Tokenization and Clinical Insights
Inocras, a bioinformatics-driven company leveraging whole genome data and proprietary analytics to advance precision health, has unveiled a cancer foundation model trained on 2,882 whole genomes across diverse cancer types. The milestone marks a significant step in applying artificial intelligence to precision oncology and will be showcased at the upcoming EMBL Cancer Genomics Conference in Heidelberg, Germany.
Developed in collaboration with the Korea Advanced Institute of Science and Technology (KAIST), the model introduces two key innovations: a novel learnable tokenization approach and genome-wide feature aggregation. At its core is DNAChunker, a proprietary architecture built on an H-Net–based hierarchical dynamic chunking system that adaptively segments genomic sequences. Unlike traditional methods that process DNA in fixed-length chunks, DNAChunker identifies regions rich in biological signal while compressing low-information areas, enhancing both accuracy and computational efficiency. This adaptive strategy enables the model to generate highly informative, patient-level genomic embeddings by integrating diverse genomic features (including mutations, structural variations, and regulatory elements) into cohesive representations that reflect tumor biology and clinical outcomes.
In benchmark evaluations across genomic representation learning and functional prediction tasks, the model surpassed leading DNA foundation models such as Nucleotide Transformer and DNABERT-2. Its clinical relevance was validated through independent testing: the model achieved 98% accuracy in predicting homologous recombination deficiency (HRD), a critical biomarker for treatment response in ovarian and breast cancers, and 84% accuracy in classifying PAM50 molecular subtypes using DNA data alone, demonstrating one of the first direct links between a DNA-based foundation model and actionable clinical insights.
“This represents a pivotal leap toward clinically interpretable, AI-native cancer genomics,” said Jehee Suh, CEO of Inocras. “We have moved from sequencing genomes to understanding them. That’s the inflection point AI brings to oncology. This is where data becomes diagnosis, and where genome insights start to truly shape patient care.”
Young Seok Ju, Ph.D., co-founder of Inocras and a conference organizer, will present findings from a subset of the data, 1,364 breast cancer whole genomes, on November 11 at 16:15 CET. His talk, titled “A Cancer Foundation Model from 1,364 Breast Cancer Whole Genomes for Patient Stratification,” will highlight how the DNAChunker-powered model enables AI-driven patient classification and molecular subtype discovery.
The model strengthens Inocras’s foundation for next-generation precision diagnostics, paving the way for scalable, data-driven approaches to cancer stratification and treatment planning. The company operates a CLIA/CAP-certified laboratory and partners with leading hospitals, pharmaceutical firms, and research institutions globally. For more information, visit inocras.com and follow Inocras on LinkedIn.
