Cross-Dataset Identification of Human Disease-Specific Cell Subtypes Enabled by the Gene Print-based Algorithm--gPRINT

This article has 2 evaluations Published on
Read the full article Related papers
This article on Sciety

Abstract

Despite extensive efforts in developing cell annotation algorithms for single cell RNA sequencing results, most algorithms fail to achieve cross-dataset mapping of cell subtypes due to factors such as batch effects between datasets. This limitation is particularly evident when rapidly annotating disease-specific cell subtypes across multiple datasets. In this study, we present gPRINT, a machine learning tool that utilizes the unique one-dimensional “gene print” expression patterns of individual cells. gPRINT is capable of automatically predicting cell types and annotating disease-specific cell subtypes. The development of gPRINT involved curation and harmonization of public datasets, algorithm validation within and across datasets, and the annotation of disease-specific fibroblast subtypes across various disease subgroups and datasets. Additionally, we created a preliminary single-cell atlas of human tendinopathy fibroblasts and successfully achieved automatic prediction of disease-specific cell subtypes in tendon disease. Furthermore, we conducted an exploration of key targets and related drugs specific to this subtype in tendon disease. The proposed approach offers an automated and unified method for identifying disease-specific cell subtypes across datasets, serving as a valuable reference for annotating fibroblast-specific subtypes in different disease states and facilitating the exploration of therapeutic targets in tendon disease.

Related articles

Related articles are currently not available for this article.