BFVD - a large repository of predicted viral protein structures


Abstract

The AlphaFold Protein Structure Database (AFDB) is the largest repository of accurately predicted structures with taxonomic labels. Despite providing predictions for over 214 million UniProt entries, the AFDB does not cover viral sequences, severely limiting their study. To bridge this gap, we created the Big Fantastic Virus Database (BFVD), a repository of 351,242 protein structures predicted by applying ColabFold to the viral sequence representatives of the UniRef30 clusters. BFVD holds a unique repertoire of protein structures, as over 63% of its entries show no or low structural similarity to structures in existing repositories. We demonstrate how BFVD substantially increases the fraction of annotated bacteriophage proteins compared to sequence-based annotation using Bakta. In this respect, BFVD is on par with the AFDB while holding nearly three orders of magnitude fewer structures. BFVD is an important virus-specific expansion to protein structure repositories, offering new opportunities to advance viral research. BFVD is freely available at https://bfvd.steineggerlab.workers.dev/.
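
The size comparison in the abstract can be checked with a short calculation. The sketch below uses only the two counts quoted above. The AFDB figure is a lower bound ("over 214 million"), so the result is approximate, and the variable names are illustrative rather than taken from the BFVD resources.

```python
import math

# Counts quoted in the abstract. The AFDB figure is a lower bound
# ("over 214 million"), so the ratio below is an approximation.
AFDB_ENTRIES = 214_000_000
BFVD_ENTRIES = 351_242

# How many times larger the AFDB is than BFVD.
ratio = AFDB_ENTRIES / BFVD_ENTRIES

# The same ratio expressed in orders of magnitude (base-10 logarithm).
orders_of_magnitude = math.log10(ratio)

print(f"AFDB is about {ratio:.0f} times larger than BFVD")
print(f"That is about {orders_of_magnitude:.2f} orders of magnitude")
```

With these inputs the ratio is roughly 600-fold, or about 2.8 orders of magnitude, which matches the description "nearly three orders of magnitude".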
