Discovery of Novel Anticancer Agents and Influenza Potential Biomarkers Through a Mass Spectrometry Foundation Model

Abstract

Metabolomics based on non-targeted tandem mass spectrometry generates rich biological data, but novel metabolites whose structures are absent from databases remain challenging to analyze. Existing algorithms predict isolated chemical features such as molecular fingerprints or structural classes, but fail to integrate them into reliable structure-level predictions, particularly for complex metabolites or for noisy spectra with sparse fragments. Here we present ComFaceID, a foundation model for de novo structure profiling that generates 500-dimensional embeddings from MS² spectra, enabling parallel prediction of diverse structural descriptors. ComFaceID consistently outperforms state-of-the-art tools across key tasks, including library search, classification, and fingerprint prediction, even under challenging conditions such as complex metabolomic backgrounds with noisy and information-poor spectra. By integrating the outputs of these multi-task predictions, we further developed a multi-parameter framework that significantly improves prioritization accuracy over single-parameter approaches. When applied to 3,334 actinomycete extracts, ComFaceID uncovered six novel compounds across three structural classes, including a unique hexahydroindolizine scaffold. Two of these compounds showed potent, broad-spectrum cytotoxicity superior to that of doxorubicin, with lower off-target toxicity. In an H1N1 infection model, the ComFaceID pipeline identified more than 40 unannotated potential biomarkers, revealing pulmonary inflammation-induced gut metabolic remodeling marked by increased saturated fatty acyl phospholipids and reduced bile acids. By bridging spectral interpretation and novel metabolite discovery, ComFaceID establishes a new workflow for structure-informed metabolomics.
