Signature Distance: Generalizing Energy Statistics

Abstract

Comparing empirical distributions is central to generative model evaluation, hypothesis testing, and data augmentation in high-dimensional biological data. Established methods such as energy distance summarize each point's relationship to the opposing distribution through a single expected distance, which makes them sensitive to location shifts but not to local density or topological structure. We introduce Signature Distance (SD), a metric that compares empirical distributions through the mean absolute difference of their sorted pointwise distance profiles. SD is a structural generalization of energy distance and matches its $\mathcal{O}(n^2)$ computational complexity. On TCGA pan-cancer transcriptomic data, we show that (1) SD detects density changes to which energy distance is insensitive; (2) the per-point SD loss landscape reveals the geometric mechanisms behind known limitations of energy distance as a generative objective; (3) SD correctly penalizes linearly interpolated biological samples that energy distance fails to detect; (4) SD provides a direct differentiable potential energy for model-free Langevin data expansion, with a bootstrap resampling protocol that stabilises the stopping epoch; and (5) SD is directly usable as a differentiable generative training loss.
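
The abstract describes SD only in words, so the following is a minimal sketch under stated assumptions rather than the authors' definition. The energy distance function uses the standard form $2\,\mathbb{E}\lVert X-Y\rVert - \mathbb{E}\lVert X-X'\rVert - \mathbb{E}\lVert Y-Y'\rVert$. The function `signature_distance_sketch` is a hypothetical reading of "mean absolute difference of sorted pointwise distance profiles": for every point in the pooled sample, it sorts that point's distances to each of the two samples and averages the absolute differences between the two sorted profiles. Equal sample sizes, Euclidean distance, the pooling step, and the inclusion of self-distances are all assumptions made for illustration.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance matrix between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def energy_distance(x, y):
    """Standard (V-statistic) energy distance between two samples."""
    dxy = pairwise_dist(x, y).mean()
    dxx = pairwise_dist(x, x).mean()
    dyy = pairwise_dist(y, y).mean()
    return 2.0 * dxy - dxx - dyy


def signature_distance_sketch(x, y):
    """Hypothetical reading of SD; NOT taken from the paper.

    Assumes len(x) == len(y) so the sorted profiles align element-wise.
    """
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized samples")
    pooled = np.vstack([x, y])
    # Each row is one point's distance profile to a sample, sorted ascending.
    profile_to_x = np.sort(pairwise_dist(pooled, x), axis=1)
    profile_to_y = np.sort(pairwise_dist(pooled, y), axis=1)
    return np.abs(profile_to_x - profile_to_y).mean()


rng = np.random.default_rng(0)
x = rng.normal(size=(50, 5))
y = rng.normal(size=(50, 5)) + 1.0  # location-shifted copy of the same law
print(energy_distance(x, y), signature_distance_sketch(x, y))
```

Replacing each sorted profile by its mean would leave only a single expected distance per point, which is the kind of summary the abstract attributes to energy distance; keeping the full sorted profile is what lets this sketch respond to changes in local density. Note that the explicit sort adds a logarithmic factor on top of the $\mathcal{O}(n^2)$ distance computation in this illustration.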
