SpatialFinder: A Human-in-the-Loop Vision-Language Framework for Prioritizing High-Value Regions in Spatial Transcriptomics
Abstract
Sequencing an entire spatial transcriptomics slide can cost thousands of dollars per assay, making routine use impractical. Focusing on smaller regions of interest (ROIs) selected from adjacent routine H&E slides offers a practical alternative, but two obstacles remain: (i) there is no reliable way to identify the most informative areas from standard H&E images alone, and (ii) clinicians have few tools for prioritizing the microenvironments most relevant to their own questions. Here we introduce SpatialFinder, a framework that combines a biomedical vision-language model (VLM) with a human-in-the-loop optimization pipeline to predict gene expression heterogeneity and rank high-value ROIs across routine H&E tissue slides. Evaluated on four Visium HD tissue types, SpatialFinder consistently outperforms baseline VLMs in selecting regions with high cellular diversity and tumor presence, achieving up to 89% correlation with ground-truth rankings. These results demonstrate the potential of human-AI collaboration to make spatial transcriptomics more cost-effective and clinically actionable.
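As a concrete illustration of the evaluation described above, the minimal sketch below compares a model's per-ROI priority scores against a ground-truth ordering. The abstract does not specify the correlation statistic, so Spearman rank correlation is assumed here as one standard choice for comparing rankings; all scores and variable names are hypothetical.

```python
# Minimal sketch: comparing a predicted ROI ranking against ground truth.
# Spearman rank correlation is an assumption, not a detail from the paper;
# the scores below are hypothetical placeholders.
from scipy.stats import spearmanr

# Hypothetical per-ROI scores: model-predicted heterogeneity vs. a
# ground-truth measure derived from Visium HD expression data.
predicted_scores = [0.91, 0.42, 0.77, 0.15, 0.63]     # model output per ROI
ground_truth_scores = [0.88, 0.35, 0.81, 0.22, 0.55]  # sequencing-derived

rho, p_value = spearmanr(predicted_scores, ground_truth_scores)
print(f"Spearman rank correlation: {rho:.2f} (p = {p_value:.3f})")
```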