Integrating theory and machine learning to reveal determinants of plasmid copy number

Abstract

Plasmids are extrachromosomal mobile genetic elements whose copy numbers (PCNs) critically influence microbial evolution, antibiotic resistance and pathogenicity. Despite their importance and immense diversity, the ecological, evolutionary and molecular factors that determine PCN remain poorly understood. Here, we present a theoretical model that explains the empirical power-law relationship between plasmid size and copy number, one of the fundamental quantitative principles governing PCN control. On its own, however, this relationship has limited predictive power. To improve PCN prediction, we introduce a data-driven approach that incorporates diverse features. Trained on >10,000 plasmids, our machine learning model achieves significantly enhanced accuracy, with plasmid-encoded protein domains emerging as key predictors. Applying this framework, we conduct the first comprehensive analysis of PCN distributions across hundreds of thousands of metagenomic plasmids (IMG/PR database) and tens of thousands of clinical isolates, uncovering niche-specific taxonomic PCN hotspots and ecological adaptations. These results provide critical insights into plasmid ecology and antibiotic resistance gene (ARG) surveillance, and shed light on the gut plasmidome, the “dark matter” of the human microbiome.
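
As an illustration of what a power-law relationship between plasmid size and copy number means in practice, the sketch below fits PCN = c * size^alpha by ordinary least squares in log-log space. It is not the authors' model or pipeline: the function name `fit_power_law`, the exponent and prefactor, and the synthetic data are all assumptions chosen only to make the example runnable.

```python
import numpy as np


def fit_power_law(sizes_bp, copy_numbers):
    """Fit PCN = c * size**alpha by least squares on log10-transformed data.

    Returns (alpha, c). Hypothetical helper for illustration only.
    """
    log_size = np.log10(np.asarray(sizes_bp, dtype=float))
    log_pcn = np.log10(np.asarray(copy_numbers, dtype=float))
    # A power law is a straight line in log-log space:
    # log10(PCN) = alpha * log10(size) + log10(c)
    alpha, log_c = np.polyfit(log_size, log_pcn, 1)
    return alpha, 10.0 ** log_c


# Synthetic data only: these parameter values are assumptions, not
# estimates reported in the article.
rng = np.random.default_rng(0)
true_alpha, true_c = -0.7, 5000.0
sizes = 10.0 ** rng.uniform(3.0, 5.5, size=500)   # ~1 kb to ~300 kb
noise = rng.normal(0.0, 0.2, size=500)            # scatter in log10(PCN)
pcn = true_c * sizes ** true_alpha * 10.0 ** noise

alpha_hat, c_hat = fit_power_law(sizes, pcn)
print(f"fitted exponent: {alpha_hat:.2f}, fitted prefactor: {c_hat:.0f}")
```

The scatter around the fitted line is what limits the predictive power of size alone: plasmids of the same size can differ widely in copy number, which is the gap a model with additional features aims to close.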
