The Rayleigh Quotient and Contrastive Principal Component Analysis I
Abstract
Contrastive learning methods can be powerful tools for genomics: they identify signals in an experiment via dimension reduction while using a control to reduce noise. One popular approach is contrastive PCA, which, despite being used in a variety of settings, does not scale to large datasets. We show that the contrastive PCA objective is an approximation of a Rayleigh quotient, analogous in form to Fisher’s linear discriminant analysis and the common spatial patterns method. The method based on this Rayleigh quotient, which we call ρPCA, satisfies numerous desirable properties and provides an interpretable form of dimension reduction via generalized eigenvectors. We demonstrate that ρPCA is more accurate than contrastive PCA and much more efficient. Through an analysis of single-nucleus transcriptomics data, we also show that it can be used not only for dimension reduction of data with respect to a control, but also for contrasting conditions. Finally, we discuss probabilistic interpretations of ρPCA that provide further insight into its effective performance.
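To make the connection between a Rayleigh quotient and generalized eigenvectors concrete, the following is a minimal sketch and not the authors' implementation. It assumes the quotient takes the form vᵀAv / vᵀBv, with A the covariance of the experimental (target) data and B the covariance of the control (background) data. The function name `rayleigh_directions`, the `ridge` regularizer, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.linalg import eigh


def rayleigh_directions(target, background, k=2, ridge=1e-6):
    """Return the k largest generalized eigenvalues and their eigenvectors.

    Rows are samples and columns are features. The directions maximize
    v^T A v / v^T B v, where A and B are the sample covariances of the
    target and background data (an assumed form of the quotient).
    """
    A = np.cov(target, rowvar=False)
    B = np.cov(background, rowvar=False)
    # Assumption: a small ridge keeps B positive definite so the
    # generalized problem A v = lambda B v is well posed.
    B = B + ridge * np.eye(B.shape[0])
    vals, vecs = eigh(A, B)  # generalized symmetric solver, ascending order
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]


# Synthetic illustration: feature 0 is noisy in both datasets, while
# feature 1 has extra variance only in the target data.
rng = np.random.default_rng(0)
background = rng.normal(size=(500, 5)) * np.array([3.0, 1.0, 1.0, 1.0, 1.0])
target = rng.normal(size=(500, 5)) * np.array([3.0, 2.0, 1.0, 1.0, 1.0])

vals, vecs = rayleigh_directions(target, background, k=1)
top_feature = int(np.argmax(np.abs(vecs[:, 0])))
print("largest variance ratio:", vals[0])
print("feature with the largest loading:", top_feature)
```

In this toy setup, ordinary PCA on the target data alone would favor feature 0, which has the largest raw variance. The ratio form instead favors the feature whose variance is enriched relative to the control. A single call to a generalized symmetric eigensolver returns all directions at once.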