CLCNet: a contrastive learning and chromosome-aware network for genomic prediction in plants
Abstract
Genomic selection (GS) is a breeding technology that combines genomic markers with phenotypic data to predict the breeding values and phenotypes of candidate populations. GS depends on accurate genomic prediction (GP) models. Traditional GP models are typically linear and struggle to capture non-linear relationships in genetic data, which limits their ability to characterize complex genetic architectures. To address this limitation, we developed a novel deep learning framework, the contrastive learning and chromosome-aware network (CLCNet), tailored specifically to plant GP. CLCNet comprises two key modules: a contrastive learning module designed to capture phenotypic variation, and a chromosome-aware module that accounts for linkage as well as local and global epistasis. We evaluated CLCNet on eight datasets representing seven plant species: maize (Zea mays), rice (Oryza sativa), cotton (Gossypium hirsutum), millet (Setaria italica), chickpea (Cicer arietinum), rapeseed (Brassica napus), and soybean (Glycine max). Compared with three classical models (rrBLUP, Bayesian Ridge, Bayesian Lasso), two machine learning approaches (LightGBM, SVR), and two other deep learning models (DNNGP, DeepGS), CLCNet achieved higher Pearson correlation coefficients (PCC) and lower root mean squared errors (RMSE). Given its prediction accuracy, generalization capability, and scalability to high-dimensional datasets, CLCNet is expected to be a powerful tool for future GS applications in plant breeding.
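The two evaluation metrics named in the abstract, PCC and RMSE, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `pearson_cc` and `rmse` and the phenotype values are hypothetical, chosen only to show why the two metrics are usually reported together.

```python
import math


def pearson_cc(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("PCC is undefined when either input is constant")
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    if n != len(y_pred) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical phenotype values (illustration only, not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
# Predictions that rank every line correctly but carry a constant bias of +0.5.
predicted = [1.5, 2.5, 3.5, 4.5]

print(pearson_cc(observed, predicted))  # 1.0: ranking is perfectly preserved
print(rmse(observed, predicted))        # 0.5: the constant bias is still penalized
```

In this toy case PCC is insensitive to the constant offset while RMSE is not, which is one reason a model comparison typically reports both.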