The Naïve Bayes Classifier++ as an Out-of-Distribution Detector of Novel Taxa

Abstract

Detecting sequences from novel taxa remains a key challenge in metagenomic classification, as reference databases rarely capture the full extent of microbial diversity. We investigate the Naïve Bayes Classifier++ (NBC++) as an out-of-distribution (OOD) detector by analyzing its log-likelihood scores across simulated and real metagenomic datasets. By partitioning reference databases and introducing taxonomic novelty, we derive thresholds that distinguish known from unknown reads at multiple taxonomic levels. These thresholds remain consistent across database sizes, indicating that once a lineage is represented, novelty detection performance stabilizes. Applied to a human gut metagenome, the thresholds reflect differences in database density and classification confidence. This work characterizes how NBC++ responds to novelty and illustrates its use in evaluating unclassified metagenomic reads.
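
As an illustration of the thresholding idea described above, the following is a minimal Python sketch of how a log-likelihood cutoff separating known from unknown reads might be chosen at one taxonomic level. It is not the authors' procedure: the selection criterion (Youden's J, i.e. true-positive rate plus true-negative rate minus one), the function names `pick_threshold` and `is_novel`, and all score values are assumptions made for the example.

```python
def pick_threshold(known_scores, novel_scores):
    """Pick a log-likelihood cutoff t; reads with score >= t are called 'known'.

    Assumption: the cutoff maximizes Youden's J (TPR + TNR - 1).
    The source does not state which criterion is actually used.
    """
    # Every observed score is a candidate cutoff.
    candidates = sorted(set(known_scores) | set(novel_scores))
    best_t, best_j = None, float("-inf")
    for t in candidates:
        # Fraction of known reads kept as known at this cutoff.
        tpr = sum(s >= t for s in known_scores) / len(known_scores)
        # Fraction of novel reads flagged as novel at this cutoff.
        tnr = sum(s < t for s in novel_scores) / len(novel_scores)
        j = tpr + tnr - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def is_novel(score, threshold):
    """Flag a read as out-of-distribution when its best score falls below the cutoff."""
    return score < threshold


# Hypothetical best-hit log-likelihoods (made-up values, not from the paper).
# 'known': reads whose lineage is kept in the reference partition.
# 'novel': reads whose lineage was held out at this taxonomic level.
known = [-410.0, -395.5, -402.1, -388.7, -420.3, -399.0]
novel = [-455.2, -470.8, -431.9, -462.0, -415.0]

threshold, j_stat = pick_threshold(known, novel)
print(f"cutoff = {threshold}, Youden's J = {j_stat:.3f}")
```

In this sketch the same routine would be rerun once per taxonomic level, each time with the held-out reads defined by the novelty introduced at that level, yielding one cutoff per level.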
