RPSLearner: A novel approach based on random projection and deep stacking learning for categorizing NSCLC
Abstract
Background
Lung cancer is the leading cause of cancer death, and non-small cell lung cancer (NSCLC) is its most common type, accounting for the majority of cases. Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are two NSCLC subtypes that are difficult to distinguish accurately with conventional methods. Existing approaches rely on histological examination and imaging, which can lack definitive histologic features and are time-consuming.
Methods
To address these concerns, we propose RPSLearner, which combines Random Projection (RP) for dimensionality reduction and stacking ensemble learning to accurately predict lung cancer subtypes. Specifically, multiple independent RP matrices were first generated to project the high-dimensional RNA-seq data into lower-dimensional space, whose features were subsequently concatenated. After that, we fed the fused features into a stack of diverse base classifiers and integrated the predictions from base models via a deep linear layer network.
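The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of RNA-seq profiles, picks three arbitrary base classifiers, and substitutes a small `MLPClassifier` for the deep linear layer meta-model. The helper `fuse_random_projections` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.random_projection import GaussianRandomProjection


def fuse_random_projections(X, n_projections=3, n_components=32, seed=0):
    """Project X with several independent Gaussian RP matrices and concatenate.

    Hypothetical helper. The projection matrices are drawn at random and do
    not depend on the data values, so applying them to all samples before
    the train/test split does not leak label information.
    """
    projected = [
        GaussianRandomProjection(
            n_components=n_components, random_state=seed + i
        ).fit_transform(X)
        for i in range(n_projections)
    ]
    return np.hstack(projected)


# Synthetic stand-in for an expression matrix: 200 samples x 2000 features,
# with a small shift in the first 50 features for class 1.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 2000))
X[y == 1, :50] += 1.0

# Feature fusion: 3 projections x 32 components = 96 fused features.
fused = fuse_random_projections(X)

X_train, X_test, y_train, y_test = train_test_split(
    fused, y, test_size=0.25, random_state=0, stratify=y
)

# Diverse base classifiers (assumed choices) whose out-of-fold predictions
# feed a small neural-network meta-model (a stand-in for the paper's network).
stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=MLPClassifier(
        hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0
    ),
    cv=3,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Concatenating several projections, rather than averaging the scores of models trained on each projection separately, lets every base classifier see all projected views at once. This corresponds to the feature fusion strategy named in the abstract.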
Results
Benchmarking tests on 1,333 NSCLC patients demonstrated that RPSLearner outperformed state-of-the-art approaches for lung cancer subtype classification. Specifically, RPSLearner preserved sample-to-sample distances even after substantial dimensionality reduction, and its meta-model yielded consistently higher accuracy, F1, and AUC scores than both the individual base models and the state-of-the-art approaches.
In addition, the feature fusion method applied in RPSLearner showed better performance than conventional score-ensemble methods.
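The distance-preservation property mentioned in these results can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's evaluation: it uses synthetic Gaussian data, a single Gaussian projection matrix scaled by 1/sqrt(k), and arbitrary dimensions (5000 reduced to 500). It compares each pairwise Euclidean distance before and after projection.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix computed from the Gram matrix."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(d2, 0.0))


rng = np.random.default_rng(0)
n_samples, n_features, k = 100, 5000, 500  # assumed sizes for illustration

X = rng.normal(size=(n_samples, n_features))

# Gaussian random projection; the 1/sqrt(k) scaling keeps squared
# distances unchanged in expectation.
R = rng.normal(size=(n_features, k)) / np.sqrt(k)
Z = X @ R

# Ratio of projected to original distance for every distinct sample pair.
upper = np.triu_indices(n_samples, k=1)
ratio = pairwise_distances(Z)[upper] / pairwise_distances(X)[upper]

print(f"mean ratio: {ratio.mean():.3f}")
print(f"min ratio:  {ratio.min():.3f}")
print(f"max ratio:  {ratio.max():.3f}")
```

Ratios close to 1 indicate that the projection distorts distances only slightly. The spread of the ratios shrinks as the target dimension k grows, which is the trade-off between compression and fidelity that random projection offers.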
Conclusion
We developed RPSLearner, a novel method that combines RP with stacking ensemble learning, enabling efficient and accurate identification of NSCLC subtypes. RPSLearner is a promising lung cancer subtyping model for downstream clinical diagnosis and personalized treatment, and the framework has the potential to be extended to the subtyping of other cancer types.