Deep learning-based predictions of gene perturbation effects do not yet outperform simple linear baselines
Abstract
Advanced deep-learning methods, such as foundation models, promise to learn representations of biology that can be employed to predict in silico the outcome of unseen experiments, such as the effect of genetic perturbations on the transcriptomes of human cells. To see whether current models already reach this goal, we benchmarked five foundation models and two other deep learning models against deliberately simplistic linear baselines. For combinatorial perturbations of two genes for which only the individual single perturbations had been seen, we find that the deep learning-based approaches did not perform better than a simple additive model. For perturbations of genes that had not yet been seen, the deep learning-based approaches did not outperform the baseline of predicting the mean across the training perturbations. We hypothesize that the poor performance is partially because the pre-training data is observational; we show that a simple linear model reliably outperforms all other models when pre-trained on another perturbation dataset. While the promise of deep neural networks for the representation of biological systems and prediction of experimental outcomes is plausible, our work highlights the need for clear setting of objectives and for critical benchmarking to direct research efforts.
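To illustrate the two linear baselines the abstract refers to, here is a minimal sketch. The variable names and the use of log fold changes relative to control are illustrative assumptions, not the paper's exact implementation: the additive baseline predicts a double perturbation as the sum of the two single-perturbation effects, and the mean baseline predicts an unseen perturbation as the average effect over all training perturbations.

```python
import numpy as np

# Hypothetical single-perturbation effects: per-gene expression change
# (e.g. log fold change) relative to unperturbed control cells.
rng = np.random.default_rng(0)
n_genes = 5
delta_a = rng.normal(size=n_genes)  # effect of perturbing gene A alone
delta_b = rng.normal(size=n_genes)  # effect of perturbing gene B alone

# Additive baseline: predict the unseen double perturbation A+B
# as the sum of the two observed single-perturbation effects.
pred_double = delta_a + delta_b

# Mean baseline: predict an entirely unseen perturbation as the
# mean effect across all perturbations seen during training.
train_effects = np.stack([delta_a, delta_b])
pred_unseen = train_effects.mean(axis=0)
```

Both baselines are deliberately simplistic; the paper's point is that they nevertheless match or beat the deep learning models in these prediction settings.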
Contact
<email>constantin.ahlmann@embl.de</email>