System-level health profiling from blood DNA methylation with explainable deep learning


Abstract

Genome-scale DNA methylation (DNAm) profiles capture organismal physiology, but most predictive models lack transparency and multi-level applicability. Here we develop an explainable framework that quantifies respiratory, cardiovascular, and metabolic status as bounded health scores (0–1) derived from sex-specific clinical reference ranges and disease penalties, and then predicts these scores from whole-blood DNAm. Using Generation Scotland and case-control samples (n = 14,496 individuals), we screened 39 covariates for disease relevance and DNAm predictability, yielding system-relevant panels that were aggregated into scores. We compressed DNAm profiles with a protein-interaction-guided autoencoder and trained health predictors on the resulting 128-dimensional embeddings using fully connected networks. On held-out samples, the models reproduced the composite scores with strong rank agreement (Spearman ρ = 0.87, R² = 0.71 for respiratory health; ρ = 0.82, R² = 0.66 for cardiovascular; ρ = 0.81, R² = 0.64 for metabolic) and recovered the expected population structure in a generally healthy cohort, with clear separation between “single-system low” and “multi-system low” phenotypes and graded coupling across systems without redundancy. Further, the top features retrieved from each explainable predictor aligned with system biology: airway epithelial repair, hypoxia, and inflammatory trafficking for respiratory; endothelial remodeling and cardiomyocyte programs for cardiovascular; and glucose-lipid metabolism and metaflammation for metabolic. These results show that DNAm embeddings can yield accurate, transparent, and system-aware health profiling from blood, providing actionable summaries while revealing the molecular processes the models use to infer multi-system status. This approach positions DNAm embeddings combined with interpretable penalty targets as a practical bridge from epigenomic signal to system-level triage, and it can be extended to and evaluated in larger, more diverse cohorts.
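
The two held-out metrics reported above measure different things: Spearman ρ captures whether individuals are ranked in the right order, whereas R² captures how much of the variance in the score values is explained. The following is a minimal, dependency-free sketch of both metrics using their standard definitions. It is not the authors' evaluation code; the function names and the toy `observed` and `predicted` values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(y_true, y_pred):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy scores in [0, 1]; NOT data from the study.
observed = [0.90, 0.75, 0.60, 0.40, 0.20]
predicted = [0.80, 0.70, 0.65, 0.50, 0.35]

rho = spearman_rho(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"Spearman rho = {rho:.2f}, R^2 = {r2:.2f}")
```

In this toy case the predictions preserve the ordering of individuals exactly, so ρ equals 1, yet R² stays below 1 because the predicted values are compressed toward the middle of the range. This is one way a model can show higher rank agreement than variance explained.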
