Stanford’s AI Decodes Sleep Data to Predict Over 100 Diseases Years in Advance

A new artificial intelligence system developed by researchers at Stanford Medicine can analyze a single night of sleep data to predict a person’s risk of developing more than 100 different medical conditions. The system, called SleepFM, is the first to use a foundation model approach to extract health insights from vast amounts of sleep data collected through polysomnography, the gold standard sleep test that records brain activity, heart function, breathing, eye movement, and muscle activity during sleep.

The model was trained on nearly 600,000 hours of sleep data from 65,000 individuals, including a core group of about 35,000 patients seen at the Stanford Sleep Medicine Center between 1999 and 2024. These patients, aged 2 to 96, had their sleep studies linked with decades of electronic health records, allowing researchers to track long-term health outcomes.

Emmanuel Mignot, MD, PhD, the Craig Reynolds Professor in Sleep Medicine, and James Zou, PhD, associate professor of biomedical data science, led the study, which will be published in Nature Medicine on January 6. They noted that while polysomnography is routinely used to diagnose sleep disorders, much of the physiological data collected during these tests has gone unused. “Sleep is a kind of general physiology that we study for eight hours in a subject who's completely captive. It's very data rich,” Mignot said. “We record an amazing number of signals, and now AI allows us to make sense of them in ways we never could before.”

SleepFM was built as a foundation model, similar in concept to the large language models behind tools like ChatGPT, but trained on biological signals instead of text. The model breaks down each sleep recording into five-second segments, akin to words in a language, and learns the complex relationships between different physiological signals.
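To make the "five-second segments" idea concrete, here is a minimal sketch of cutting one recorded channel into fixed-length windows. It is an illustration only, not SleepFM's actual preprocessing: the function name `split_into_segments`, the 4 Hz sampling rate, and the toy signal are all assumptions made for this example.

```python
def split_into_segments(signal, sampling_rate_hz, segment_seconds=5):
    """Split a 1-D signal into consecutive, non-overlapping segments.

    Trailing samples that do not fill a whole segment are dropped.
    """
    samples_per_segment = int(sampling_rate_hz * segment_seconds)
    if samples_per_segment <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = len(signal) // samples_per_segment
    return [
        signal[i * samples_per_segment:(i + 1) * samples_per_segment]
        for i in range(n_segments)
    ]


# Toy data (hypothetical): 32 seconds of a fake channel sampled at 4 Hz.
sampling_rate_hz = 4
fake_channel = [float(i) for i in range(32 * sampling_rate_hz)]

segments = split_into_segments(fake_channel, sampling_rate_hz)
print(len(segments), len(segments[0]))  # 6 segments of 20 samples each
```

In a real polysomnography recording, each channel (brain, heart, breathing, and so on) would be windowed in a similar way, so that segments covering the same five seconds can be compared across signal types.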
A key innovation was the use of leave-one-out contrastive learning, a method that hides one type of signal and trains the model to reconstruct it from the others, helping it understand how brain, heart, and other signals interact. After training, SleepFM was tested on standard sleep tasks like identifying sleep stages and diagnosing sleep apnea, where it matched or outperformed existing models.

The team then turned to a more ambitious goal: predicting future disease. With sleep data linked to long-term health records, the model predicted 130 conditions with reasonable accuracy, including cancers, heart disease, mental health disorders, and neurodegenerative diseases. It performed particularly well for Parkinson’s disease (C-index 0.89), dementia (0.85), heart attack (0.81), and several cancers, all with C-indices above 0.8.

The C-index measures how well a model ranks individuals by risk: a value of 0.8 means that, for 80% of patient pairs, the model correctly identifies which of the two will develop a condition first. Zou noted that even models with lower accuracy, such as those with a C-index of 0.7, are already used in clinical settings, for example to predict cancer treatment responses.

The team is now working to improve SleepFM’s predictions and to use interpretability tools to understand how it arrives at its conclusions. They found that combining all data streams (brain, heart, breathing, and muscle activity) yielded the most accurate results. Signals that were out of sync, such as a brain showing signs of deep sleep while the heart appeared active, were especially telling.

The study’s lead authors are Rahul Thapa, a PhD student in biomedical data science, and Magnus Ruud Kjaer from the Technical University of Denmark. Researchers from Copenhagen University Hospital, BioSerenity, University of Copenhagen, and Harvard Medical School also contributed. Funding came from the National Institutes of Health, Knight-Hennessy Scholars, and the Chan-Zuckerberg Biohub.
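For readers curious about what the C-index quoted above actually computes, the following sketch implements the common pairwise definition (Harrell's C) on made-up data. It is not the study's evaluation code: the function name `concordance_index` and the five toy patients are assumptions chosen so the arithmetic can be checked by hand.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the risk scores rank correctly.

    times:       follow-up time until diagnosis or end of observation
    events:      1 if the diagnosis was observed, 0 if follow-up ended first
    risk_scores: higher score means the model expects an earlier diagnosis
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i was diagnosed
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): five patients, two of whom were never diagnosed.
times = [2, 4, 6, 8, 10]            # years of follow-up
events = [1, 1, 0, 1, 0]            # 1 = diagnosed, 0 = not diagnosed
risk_scores = [0.9, 0.4, 0.7, 0.5, 0.1]

c_index = concordance_index(times, events, risk_scores)
print(c_index)  # 6 of 8 comparable pairs ranked correctly -> 0.75
```

On this scale a score of 0.5 is what random guessing would achieve and 1.0 is a perfect ranking, which is why values such as 0.7 can already be clinically useful.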
