The use of cross-validation has overestimated the value of genomic selection in plant breeding


Abstract

Genomic Selection (GS) is widely considered a transformative approach to plant breeding and has been the subject of well over a thousand papers since it was proposed 25 years ago. The reduced costs of marker genotyping and genome sequencing, the proliferation of powerful statistical methods, and innovative breeding schemes that leverage GS have promised a revolution in the speed, efficiency, and precision of plant breeding. However, clear evidence of dramatically improved breeding outcomes using GS is difficult to find in the literature. I argue here that the most commonly presented evidence of GS success, namely high estimated accuracies of Genomic Prediction (GP) models as evaluated by cross-validation, may be giving a highly misleading impression of the value of GS, at least in moderate-sized breeding programs. Estimating GP accuracy by cross-validation is appropriate only when GS is used to increase selection intensity, which is one of the four key control parameters of the breeder's equation and usually the least cost-effective way to increase genetic gain. If GS is instead used to increase the accuracy of selection among a fixed set of candidates, or to speed up breeding cycles, cross-validation-based estimates can be dramatically inaccurate, in ways that differ among breeding populations and traits. I show that analytical expressions and computational simulations are more informative than cross-validation about the likelihood of success of GS, and can be more effectively employed to evaluate GS program design.
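
For reference, the breeder's equation named in the abstract is commonly written in the textbook form below. The abstract does not give the equation, so the symbols are an assumed, conventional notation and not necessarily the author's own.

```latex
% Expected genetic gain per unit time (conventional notation, assumed here):
%   i        = selection intensity
%   r        = accuracy of selection
%   \sigma_A = additive genetic standard deviation
%   L        = length of the breeding cycle
\Delta G = \frac{i \, r \, \sigma_A}{L}
```

Read this way, the three uses of GS that the abstract distinguishes correspond to raising i (screening more candidates), raising r (ranking a fixed set of candidates more accurately), or shortening L (faster cycles).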
