AR(2) eigenvalue modulus as a measure of temporal persistence in gene expression: circadian hierarchy emerges from two coefficients
Abstract
Background: The mammalian circadian clock organizes gene expression into a hierarchical architecture, but no single quantitative metric has previously captured this hierarchy from expression data alone without biological labels. Methods: We apply second-order autoregressive (AR(2)) modeling to gene expression time series and extract the eigenvalue modulus |λ| as a measure of temporal persistence, defined as the degree to which a gene's past expression determines its future. We analyze 10 independent GEO datasets yielding 36 tissue- or condition-level time series spanning 4 species (mouse, human, baboon, Arabidopsis), 12 mouse tissues, and multiple experimental conditions. Reliability and generality are assessed with a twelve-analysis robustness suite, five canonical ODE model validations, AR(1)/AR(2)/AR(3) model order comparisons, literature cross-referencing against 59 curated circadian genes, three automated bias tests, and extension to non-circadian biology. Results: The eigenvalue modulus |λ| blindly recovers the known circadian hierarchy: clock genes (median |λ| = 0.647, grand mean = 0.689, p < 0.001) > clock-controlled target genes (median |λ| = 0.529) > genome background (median |λ| = 0.496). This hierarchy is preserved across all 12 mouse tissues (12/12), human blood under three conditions (3/3), baboon tissues (7/8 directionally correct; 4/8 bootstrap-significant), and Arabidopsis (3/3 replicates). The hierarchy survives sub-sampling (to N = 8), bootstrap resampling (clock ranked #1 in 100% of 2,000 iterations), linear detrending (12/12 tissues), permutation testing (p < 0.001, 10,000 shuffles), and leave-one-tissue-out cross-validation (12/12 stable). Bmal1-knockout data (GSE70499) support a causal interpretation: genetic ablation collapses the hierarchy (gap: +0.152 → −0.005). Literature validation recovered 58 of 59 curated circadian-regulated genes in at least one of 21 datasets (any-dataset recall: 98.3%).
A Popper-faithful falsification test shows that BMAL1 (Arntl) predicts 8.4% of genome-wide coupling events compared to 0.0–0.3% for arrhythmic housekeeping and random gene controls (~180-fold enrichment over arrhythmic controls); amplitude-matched non-clock controls coupled to ~12% of the genome, yielding a ~2× specificity ratio over oscillation-matched genes. Three automated bias tests confirm the hierarchy depends on temporal order (destroyed by time-shuffle), is uncorrelated with irrelevant metrics, and survives expression-level matching. In an exploratory analysis of non-circadian biology (dendritic cell immune response, GSE59784, 7 timepoints), fast immune responders (|λ| ≈ 0.43–0.55) show lower persistence than sustained effectors (|λ| ≈ 0.80–0.99). In a cancer state-swap analysis (GSE221103, neuroblastoma MYC ON/OFF, 14 timepoints), proliferation markers (median |λ| = 0.954) show dramatically higher persistence than cell identity markers (|λ| = 0.517; d = −1.43, p = 0.001) in the cancer state; MYC activation selectively amplifies proliferation marker persistence by +0.335 while identity markers shift by only +0.057. The eigenvalue modulus is independent of mRNA half-life (weighted mean ρ = 0.012 across 7 datasets, 22,989 genes). Conclusions: The AR(2) eigenvalue modulus is a valid, robust, and biologically meaningful measure of temporal persistence in gene expression. A two-coefficient model recovers circadian hierarchy without prior biological knowledge, validated across species, tissues, experimental perturbations, literature benchmarks, non-circadian immune dynamics, and cancer biology.
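As a minimal sketch of the quantity described in the Methods, the following Python example fits an AR(2) model to a single time series and returns the largest eigenvalue modulus of its companion matrix. The abstract does not state how the coefficients are estimated, so ordinary least squares with an intercept is an assumption here, and the function name `ar2_eigenvalue_modulus` is hypothetical rather than taken from the authors' code.

```python
import numpy as np


def ar2_eigenvalue_modulus(x):
    """Fit x_t = c + phi1*x_{t-1} + phi2*x_{t-2} + e_t and return max |lambda|.

    Assumption: coefficients are estimated by ordinary least squares with an
    intercept; the source does not specify the estimator.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 5:
        raise ValueError("need at least 5 timepoints to fit AR(2) with an intercept")

    # Regress x_t on a constant and its two previous values.
    y = x[2:]
    design = np.column_stack([np.ones(y.size), x[1:-1], x[:-2]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    _, phi1, phi2 = coef

    # Eigenvalues of the companion matrix are the roots of
    # lambda^2 - phi1*lambda - phi2 = 0.
    companion = np.array([[phi1, phi2], [1.0, 0.0]])
    eigenvalues = np.linalg.eigvals(companion)
    return float(np.max(np.abs(eigenvalues)))


if __name__ == "__main__":
    # Synthetic damped oscillation: 24 samples, period of 12 samples,
    # amplitude shrinking by a factor of 0.9 per step.
    t = np.arange(24)
    series = 0.9 ** t * np.cos(2 * np.pi * t / 12)
    print(ar2_eigenvalue_modulus(series))
```

A noise-free damped cosine obeys an exact AR(2) recurrence, so the recovered modulus equals its per-step decay factor. When the two eigenvalues form a complex-conjugate pair, the modulus also equals the square root of −φ₂, which is why only two coefficients are needed.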