ConvCGP: A Convolutional Neural Network to Predict Genotypic Values of Rice Traits from Compressed Genome-Wide Polymorphisms
Abstract
The growing size of genome-wide polymorphism data in animal and plant breeding has raised concerns about computational load and run time, particularly when genomic prediction is used to predict genotypic values of target traits. Several deep learning and conventional methods, including dimensionality reduction techniques such as principal component analysis (PCA) and autoencoders, have been proposed to address these challenges by selecting subsets of polymorphisms or compressing high-dimensional data for predictive analysis. However, these methods are often computationally intensive and time-consuming. A major challenge in applying high-dimensional genomic data directly to deep learning models is the substantial computational cost and time required for hyperparameter tuning and model training. To address these limitations, we propose a novel deep learning approach that combines autoencoders, which compress high-dimensional genome-wide polymorphism data, with convolutional neural networks (CNNs), which predict the genotypic values of target traits from the compressed data. We tested this framework on high-dimensional rice datasets, focusing on agronomic trait prediction. By combining CNNs with autoencoders, our framework outperformed other machine learning methods and recently proposed compression methods, demonstrating its potential to efficiently address the computational challenges associated with high-dimensional genomic data.
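The two-stage data flow described above (compress the markers with an autoencoder's encoder, then predict from the compressed representation with a CNN) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the ConvCGP architecture: the abstract does not specify marker coding, layer sizes, activations, or pooling, so the 0/1/2 genotype coding, the single dense ReLU encoder, the one convolutional layer, and all names and dimensions below are hypothetical. The weights are random and untrained, so the outputs show shapes only, not meaningful predictions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_samples, n_markers, n_latent = 8, 1000, 64
n_filters, kernel_size = 4, 5

def encode(x, w_enc, b_enc):
    """Encoder half of an autoencoder (assumed: one dense layer + ReLU).
    (n_samples, n_markers) -> (n_samples, n_latent)"""
    return np.maximum(x @ w_enc + b_enc, 0.0)

def conv1d_valid(z, kernels):
    """1D 'valid' cross-correlation along the latent axis.
    z: (n_samples, n_latent), kernels: (n_filters, k)
    returns (n_samples, n_filters, n_latent - k + 1)"""
    k = kernels.shape[1]
    # windows: (n_samples, n_latent - k + 1, k)
    windows = np.lib.stride_tricks.sliding_window_view(z, k, axis=1)
    return np.einsum("nlk,fk->nfl", windows, kernels)

def predict(x, params):
    """Compressed markers -> conv + ReLU -> mean pooling -> linear output."""
    z = encode(x, params["w_enc"], params["b_enc"])
    feat = np.maximum(conv1d_valid(z, params["kernels"]), 0.0)
    pooled = feat.mean(axis=2)  # (n_samples, n_filters)
    return pooled @ params["w_out"] + params["b_out"]

# Simulated genotypes coded as 0/1/2 allele counts (an assumption).
X = rng.integers(0, 3, size=(n_samples, n_markers)).astype(float)

# Random, untrained parameters: placeholders for learned weights.
params = {
    "w_enc": rng.normal(scale=0.01, size=(n_markers, n_latent)),
    "b_enc": np.zeros(n_latent),
    "kernels": rng.normal(scale=0.1, size=(n_filters, kernel_size)),
    "w_out": rng.normal(scale=0.1, size=n_filters),
    "b_out": 0.0,
}

Z = encode(X, params["w_enc"], params["b_enc"])
y_hat = predict(X, params)
print(Z.shape, y_hat.shape)
```

In this sketch the CNN operates on 64 latent features rather than 1000 markers, which is the source of the computational saving the abstract refers to: hyperparameter tuning and training of the predictor scale with the compressed dimension, not the raw marker count.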