
Stanford Researchers Unveil SleepFM Clinical AI Model

A new AI model predicts disease risk from sleep data, enhancing clinical workflows.

Overview of SleepFM Clinical

A team of Stanford Medicine researchers has introduced SleepFM Clinical, a multimodal sleep foundation model that leverages clinical polysomnography data to predict long-term disease risk from a single night's sleep. The research is published in Nature Medicine, and the clinical code is available as the open-source sleepfm-clinical repository on GitHub under the MIT license.

Transforming Polysomnography for Broader Insights

Polysomnography captures brain activity, eye movements, heart signals, muscle tone, breathing effort, and oxygen saturation during a full night in a sleep lab. Traditionally, most clinical workflows have limited its application to sleep staging and sleep apnea diagnosis. This research treats these multichannel signals as a dense physiological time series, training a foundation model for a unified representation across all modalities.

SleepFM is trained on around 585,000 hours of sleep recordings from approximately 65,000 individuals, sourced from multiple cohorts. The largest is from the Stanford Sleep Medicine Center, which includes about 35,000 adults and children undergoing overnight studies from 1999 to 2024. This clinical cohort is linked to electronic health records, facilitating survival analysis for hundreds of disease categories.

Model Architecture and Learning Objectives

SleepFM employs a convolutional backbone to extract local features from each channel, followed by attention-based aggregation across channels and a temporal transformer over short segments of the night. The same architecture was used in the earlier SleepFM work on sleep staging and sleep-disordered breathing detection, where learning joint embeddings across physiological signals improved downstream performance.
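The three stages described above can be sketched as a data-flow toy in numpy. This is an illustrative stand-in, not the published architecture: the shapes, the random linear "feature extractor" (in place of the real convolutional backbone), and the single attention query are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy polysomnography segment: 8 channels, 30 s at 128 Hz (shapes are illustrative).
n_channels, n_samples, d_model = 8, 30 * 128, 64
segment = rng.standard_normal((n_channels, n_samples))

# Stage 1: per-channel feature extractor (stand-in for the convolutional backbone).
W_feat = rng.standard_normal((n_samples, d_model)) * 0.01
channel_feats = np.tanh(segment @ W_feat)           # (8, 64): one vector per channel

# Stage 2: attention-based aggregation across channels.
query = rng.standard_normal(d_model)
scores = channel_feats @ query                      # one attention score per channel
weights = np.exp(scores - scores.max())
weights /= weights.sum()                            # softmax over channels
segment_embedding = weights @ channel_feats         # (64,): one vector per segment

# Stage 3: a temporal transformer would then attend over the sequence of
# per-segment embeddings spanning the whole night (not shown here).
```

The key design point is that channels are fused by learned attention rather than concatenation, so the model tolerates montages with different channel counts.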

The pretraining stage uses leave-one-out contrastive learning. For each short time segment, the model creates distinct embeddings for modality groups—brain, heart, and respiratory signals. It then aligns modality embeddings so that any subset predicts the joint representation of the others, enhancing robustness against missing channels and varying recording setups.
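The leave-one-out idea can be illustrated with a minimal InfoNCE-style sketch in numpy: each modality's embedding is contrasted against the averaged embedding of the remaining modalities, with other segments in the batch serving as negatives. The batch size, dimensions, and temperature here are arbitrary assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2norm(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Toy batch: 4 segments, 3 modality groups (brain, heart, respiratory), dim 16.
batch, n_mod, dim = 4, 3, 16
emb = l2norm(rng.standard_normal((batch, n_mod, dim)))

def leave_one_out_loss(emb, tau=0.1):
    """For each modality, contrast its embedding against the mean embedding
    of the remaining modalities; other segments in the batch are negatives."""
    losses = []
    for m in range(emb.shape[1]):
        anchor = emb[:, m]                                   # (batch, dim)
        others = l2norm(np.delete(emb, m, axis=1).mean(axis=1))  # leave-one-out mean
        logits = anchor @ others.T / tau                     # (batch, batch) similarities
        logits -= logits.max(axis=1, keepdims=True)
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        losses.append(-np.mean(np.diag(log_probs)))          # positives on the diagonal
    return float(np.mean(losses))

loss = leave_one_out_loss(emb)
```

Because any subset of modalities must predict the rest, a recording that lacks, say, respiratory channels still yields a usable training signal, which is what gives the model its robustness to varying montages.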

After pretraining on unlabeled polysomnography, the backbone is fixed, and small task-specific heads are trained. For standard sleep tasks, lightweight recurrent or linear heads map embeddings to sleep stages or apnea classifications. In clinical risk prediction, the model aggregates the entire night’s data into a single patient-level embedding, adds demographic details like age and sex, and utilizes a Cox proportional hazards layer for time-to-event modeling.
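The Cox proportional hazards head on top of a frozen night-level embedding can be sketched as a negative log partial likelihood in numpy. Everything here is a toy assumption: the feature dimensions, the random data, and the linear head; the point is only the shape of the objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy patient-level features: frozen night embedding (dim 8) plus age and sex.
n, dim = 6, 8
X = np.hstack([
    rng.standard_normal((n, dim)),              # night embedding from the backbone
    rng.uniform(30, 80, (n, 1)),                # age
    rng.integers(0, 2, (n, 1)).astype(float),   # sex
])
time = rng.uniform(1.0, 10.0, n)   # years to first diagnosis or censoring
event = rng.integers(0, 2, n)      # 1 = diagnosed, 0 = censored
beta = rng.standard_normal(X.shape[1]) * 0.01   # Cox head weights

def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative log partial likelihood for a linear Cox head (no ties handling)."""
    risk = X @ beta                              # log relative hazard per patient
    order = np.argsort(-time)                    # sort by descending follow-up time
    risk, ev = risk[order], event[order]
    log_cumsum = np.logaddexp.accumulate(risk)   # log-sum-exp over each risk set
    return float(-np.sum((risk - log_cumsum)[ev == 1]))

nll = cox_neg_log_partial_likelihood(beta, X, time, event)
```

In practice one would minimize this with gradient descent over `beta` only, leaving the pretrained backbone untouched.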

Performance on Benchmarks

Before advancing to disease prediction, SleepFM's capabilities were benchmarked against existing specialist models on standard sleep analysis tasks. Previous studies indicate that a simple classifier built on SleepFM embeddings surpasses end-to-end convolutional networks in sleep stage classification and sleep-disordered breathing detection, with notable improvements in AUROC and AUPRC across several public datasets.

In clinical evaluations, the pretrained backbone was reused for sleep staging and apnea severity assessment across multi-center cohorts. Findings indicate that SleepFM competes favorably with conventional convolutional models and automated sleep staging systems, affirming that its representation captures fundamental sleep physiology rather than merely artifacts from a single dataset.

Disease Prediction Capabilities

The standout aspect of this research lies in disease prediction capabilities. The research team mapped diagnosis codes from the Stanford electronic health records to phecodes and identified over 1,000 potential disease classifications. For each phecode, they calculated the time to first diagnosis post-sleep study and fit a Cox model on SleepFM embeddings.
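The label construction step can be illustrated with a small Python sketch. The ICD-to-phecode entries below are hypothetical placeholders (real mappings come from the published phecode tables), as are the dates.

```python
from datetime import date

# Hypothetical ICD-10 to phecode map; real mappings come from the phecode tables.
icd_to_phecode = {"I21.9": "411.2", "G30.9": "290.11"}  # MI, Alzheimer's (illustrative)

sleep_study = date(2015, 6, 1)
diagnoses = [("I21.9", date(2019, 3, 15)), ("G30.9", date(2021, 8, 2))]

# For each phecode, compute days from the sleep study to the first diagnosis.
first_dx_days = {}
for icd, dx_date in diagnoses:
    phe = icd_to_phecode.get(icd)
    if phe is not None and dx_date > sleep_study:
        days = (dx_date - sleep_study).days
        first_dx_days[phe] = min(days, first_dx_days.get(phe, days))
```

Patients who never receive a given phecode are censored at their last follow-up date, and the resulting (time, event) pairs feed the per-phecode Cox models.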

SleepFM can predict 130 disease outcomes from a single night's polysomnography, exhibiting strong discrimination. This includes all-cause mortality, dementia, myocardial infarction, heart failure, chronic kidney disease, stroke, atrial fibrillation, various cancers, and multiple psychiatric and metabolic disorders. Performance indicators are comparable to established risk scores, despite using only sleep data and basic demographics.

Notably, for certain cancers and mental health disorders, predictions reach around 80 percent accuracy over multi-year risk periods, suggesting that nuanced patterns within brain, heart, and breathing signals can reveal latent disease processes that remain clinically undetectable.

Comparison with Simpler Models

To assess the additional value provided by SleepFM, comparisons were made against two baselines: one relying solely on demographic information, and one trained end-to-end on the raw polysomnography for the same outcomes without unsupervised pretraining. Results show that the pretrained SleepFM representation combined with a simple survival head produces higher concordance and AUROC across most disease categories than either baseline.
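The concordance metric used to compare these survival models can be computed from scratch. This is a plain sketch of Harrell's C-index on toy data, not code from the study; the arrays below are invented for illustration.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index: among comparable pairs, the fraction where the
    patient with higher predicted risk has the earlier observed event."""
    n_conc, n_comp = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue                       # comparable pairs are anchored on events
        for j in range(n):
            if time[i] < time[j]:          # i's event precedes j's follow-up
                n_comp += 1
                if risk[i] > risk[j]:
                    n_conc += 1.0
                elif risk[i] == risk[j]:
                    n_conc += 0.5          # ties get half credit
    return n_conc / n_comp

time = np.array([2.0, 5.0, 3.0, 8.0])      # years to event or censoring
event = np.array([1, 0, 1, 1])             # 1 = event observed
risk = np.array([0.9, 0.1, 0.7, 0.2])      # predicted relative risk
c = concordance_index(time, event, risk)   # ordering is perfect here, so c == 1.0
```

A C-index of 0.5 corresponds to random ordering, so comparing the pretrained-embedding model against the demographics-only baseline on this metric directly quantifies what the sleep signals add.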

This research underscores that the improvements are driven not by a complex prediction head but by the foundation model's generalizable representation of sleep physiology. Consequently, clinical centers can reuse a single pretrained backbone and train smaller site-specific heads on relatively modest labeled cohorts to achieve state-of-the-art performance.
