Leveraging Natural Language Processing models to decode the dark proteome across the Animal Tree of Life

Abstract

Functional annotation is crucial in biology, but many protein-coding genes remain uncharacterized, especially in non-model organisms. FANTASIA (Functional ANnoTAtion based on embedding space SImilArity) integrates protein language models for large-scale functional annotation. Applied to ∼1,000 animal proteomes, it predicts functions for virtually all proteins, revealing previously uncharacterized functions that enhance our understanding of molecular evolution. FANTASIA is available on GitHub at https://github.com/CBBIO/FANTASIA.
