Addressing the open world: detecting and segmenting pollen on palynological slides with deep learning
Abstract
In the open world, categorical classes are imbalanced, test classes are not known a priori, and test data are captured across different domains. Paleontological data can be described as open-world, as specimens may include new, unknown taxa, and the data collected, such as measurements or images, may not be standardized across different studies. Fossil pollen analysis is one example of an open-world problem in paleontology. Pollen samples capture large numbers of specimens, including not only common types but also rare and even novel taxa. Pollen is morphologically diverse, and its features can be altered during fossilization. Additionally, there is little standardization in the methods used to capture and catalog pollen images, and most collections are mounted on microscope slides. Therefore, generalized workflows for automated pollen analysis require techniques that are robust to these differences and can work with microscope images. We focus on a critical first step, the detection of pollen specimens on a palynological slide, and review how existing methods can be employed to build robust and generalizable analysis pipelines. First, we demonstrate how a mixture-of-experts approach (the fusion of a general pollen detector with an expert model trained on minority classes) can be used to address taxonomic biases in detections, particularly the missed detections of rarer pollen types. Second, we demonstrate the efficiency of domain fine-tuning in addressing domain gaps, namely differences in image magnification and resolution across microscopes and differences in taxa across sample sources. Third, we demonstrate the importance of continual learning workflows, which integrate expert feedback, in training detection models from incomplete data. Finally, we demonstrate how cutting-edge segmentation models can be used to refine and clean detections for downstream deep learning classification models.