Accurate identification of medulloblastoma subtypes from diverse data sources with severe batch effects by RaMBat
Abstract
Medulloblastoma (MB), the most common pediatric brain malignancy, comprises multiple distinct molecular subtypes that differ in clinical behavior and genetic alterations. Accurate identification of MB subtypes is essential for downstream risk stratification and tailored therapeutic design. Existing MB subtyping approaches perform poorly because cohorts are limited and severe batch effects arise when different MB data sources are integrated. To address these problems, we propose RaMBat, a novel approach for accurate MB subtyping from diverse data sources with severe batch effects. In benchmarking tests on 13 datasets with severe batch effects, RaMBat achieved a median accuracy of 99%, significantly outperforming state-of-the-art MB subtyping approaches and conventional machine learning classifiers. RaMBat handled the batch effects efficiently and clearly separated the subtypes of MB samples from diverse data sources. We believe RaMBat will directly benefit downstream MB risk stratification and tailored treatment design.