SmartHisto: Bayesian Active Learning for Histology Images

Abstract

Accurate and efficient characterization of biological images is crucial for advancing systems biology and medical research. Recent advances in deep learning and image processing have enabled neural network models to accelerate image analysis by training on large expert-annotated datasets. In histopathology, however, the size of whole-slide images makes expert annotation expensive, which limits the acquisition of sufficiently large annotated datasets and poses a major challenge for developing automated, AI-driven image analysis pipelines. To address this limitation, we propose a novel active learning framework for training image segmentation models interactively. Our approach employs a Bayesian neural network to identify informative regions within unlabeled images, rather than selecting entire images, making expert labeling more cost-effective. We validate our framework on multiple benchmark datasets spanning different staining techniques and magnifications, demonstrating substantial reductions in annotation effort. Notably, our method achieves a mean intersection over union (IoU) of 0.75, significantly outperforming competing approaches, which average 0.60.
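
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean IoU score is commonly computed for a segmentation map. The function name `mean_iou`, the integer label encoding, and the choice to skip classes absent from both maps are assumptions made for illustration; the abstract does not state the exact evaluation protocol used.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class IoU between two integer label maps of equal shape.

    Illustrative helper (not from the article). Classes that appear in
    neither map are skipped so they do not distort the average.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy 2x2 example with two classes.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12.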

Author summary

Histopathology is fundamental to investigating tissue and immune responses, host-pathogen interactions, and disease mechanisms. However, it is highly resource-intensive and requires specialized training, which dramatically increases the cost of annotating whole-slide images and, consequently, the expense of large-scale studies involving numerous labs and specialists. To overcome these challenges, we developed a computational tool that combines a robust uncertainty-based sampling algorithm with a Bayesian convolutional neural network. By reducing the reliance on the large, precisely annotated training datasets that automated image analysis pipelines require, this algorithm can be used for hypothesis testing and discovery. The base model, when trained to identify lung tissue types using a small set of annotated images, outperforms state-of-the-art models and can annotate thousands of images far faster than a human annotator. The proposed algorithm is intended to give pathologists and disease researchers a standardized approach to training automated image segmentation pipelines for large-scale histopathology.
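
The idea of uncertainty-based sampling over regions, rather than whole images, can be illustrated with a short sketch. It assumes that the Bayesian network's uncertainty is approximated by several stochastic forward passes producing per-pixel class probabilities, that uncertainty is measured by predictive entropy, and that regions are fixed-size square tiles. None of these choices is confirmed by the text above, and the names `predictive_entropy` and `rank_regions` are hypothetical.

```python
import numpy as np

def predictive_entropy(mc_probs):
    """Per-pixel entropy of the mean prediction.

    mc_probs: array of shape (T, H, W, C) holding softmax outputs from
    T stochastic forward passes (an assumed stand-in for the posterior).
    """
    mean_probs = mc_probs.mean(axis=0)
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)

def rank_regions(mc_probs, patch, k):
    """Return the k most uncertain tiles as (row, col, score) tuples.

    Tiles are non-overlapping patch x patch squares; pixels that do not
    fill a complete tile at the bottom or right edge are ignored.
    """
    entropy = predictive_entropy(mc_probs)
    rows, cols = entropy.shape[0] // patch, entropy.shape[1] // patch
    tiles = entropy[: rows * patch, : cols * patch]
    tiles = tiles.reshape(rows, patch, cols, patch)
    scores = tiles.mean(axis=(1, 3))  # mean entropy per tile
    order = np.argsort(scores, axis=None)[::-1][:k]
    return [
        (int(i // cols) * patch, int(i % cols) * patch, float(scores.flat[i]))
        for i in order
    ]

# Toy example: a 4x4 image, two classes, confident everywhere except
# the top-right 2x2 tile, where the prediction is a coin flip.
probs = np.tile(np.array([0.99, 0.01]), (4, 4, 4, 1))
probs[:, 0:2, 2:4, :] = 0.5
top = rank_regions(probs, patch=2, k=1)
```

Here `top` identifies the tile whose upper-left corner is at row 0, column 2, which is the region an annotator would be asked to label first under these assumptions.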
