Towards a Cytometry Foundation Model: Interpretable Sample-level Predictive Modelling via Pretrained Transformers
Abstract
Foundation models have transformed scientific data modelling across domains, yet flow cytometry has lacked one. Despite the abundance of high-dimensional cellular data, automated analysis remains bottlenecked by marker variability: architectural constraints confine prior studies to fixed marker panels and homogeneous data, limiting scalability and generalisation. We present the Generalised Pretrained Cytometry Transformer (GPCT), an interpretable framework designed to learn from heterogeneous marker panels for sample-level predictive modelling. Through a novel cytometry-specific pretraining regime, GPCT learns transferable cellular representations that achieve high classification accuracy across diverse datasets. Notably, pretraining significantly boosts performance on data-scarce downstream tasks, marking a pivotal step towards a cytometry foundation model. Furthermore, GPCT remains interpretable, identifying the specific cell subsets most influential to its predictions. This enables direct biological validation of learned patterns and provides a data-driven basis for refining traditional gating strategies.