Generative Models Validation via Manifold Recapitulation Analysis

Nicolo’ Lazzaro
Gianluca Leonardi
Raffaele Marchesi
Massimiliano Datres
Anna Saiani
Jacopo Tessadori
Alejandro Granados
Johan Henriksson
Marco Chierici
Giuseppe Jurman
Toma Tebaldi
Gabriele Sales

Published on Nov 18, 2024

Abstract

Summary

Single-cell transcriptomics increasingly relies on nonlinear models to harness the dimensionality and growing volume of data. However, most model validation focuses on local manifold fidelity (e.g., Mean Squared Error and other data likelihood metrics), with little attention to the global manifold topology these models should ideally be learning. To address this limitation, we have implemented a robust scoring pipeline aimed at validating a model’s ability to reproduce the entire reference manifold. The Python library Cytobench demonstrates this approach, along with Jupyter Notebooks and an example dataset to help users get started with the workflow. Manifold recapitulation analysis can be used to develop and assess models intended to learn the full network of cellular dynamics, as well as to validate their performance on external datasets.
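The gap between local fidelity and global manifold coverage can be illustrated with a toy example. The sketch below is not the Cytobench scoring pipeline and does not use its API; the two-cluster dataset, the collapsed "model", the `pairwise_dist` helper, and the coverage radius are all assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy reference "manifold" (assumption): two well-separated clusters in 2-D.
cluster_a = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(200, 2))
cluster_b = rng.normal(loc=[5.0, 5.0], scale=0.1, size=(200, 2))
reference = np.vstack([cluster_a, cluster_b])

# Hypothetical generative model that has collapsed onto cluster A only.
generated = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(400, 2))


def pairwise_dist(x, y):
    """Euclidean distance between every row of x and every row of y."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


dist = pairwise_dist(generated, reference)  # shape: (n_generated, n_reference)

# Local fidelity: mean squared distance from each generated point
# to its nearest reference point. Low values look like a good model.
local_error = (dist.min(axis=1) ** 2).mean()

# Global coverage: fraction of reference points that have at least one
# generated point within an (arbitrarily chosen) radius.
radius = 0.5
coverage = (dist.min(axis=0) <= radius).mean()

print(f"local error: {local_error:.4f}")
print(f"reference coverage: {coverage:.2f}")
```

In this construction every generated point lies close to some reference point, so the local error is small, yet about half of the reference set (the second cluster) is never reached. A validation scheme that looks only at local error would miss this failure.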

Availability

A Python library implementing the scoring pipeline has been made available via pip and can be inspected at <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://github.com/lazzaronico/cytobench/">GitHub</ext-link> alongside some Jupyter Notebooks demonstrating its application.

Contact

<email>nlazzaro@fbk.eu</email> or <email>toma.tebaldi@unitn.it</email>
