HAMAP rules as SPARQL: A portable annotation pipeline for genomes and proteomes


Abstract

Motivation

Genome and proteome annotation pipelines are generally custom-built and therefore not easily reusable by other groups, which leads to duplication of effort, increased costs, and suboptimal results. One cost-effective way to increase data quality in public databases is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation.

Results

We have translated the rules of our HAMAP proteome annotation pipeline to queries in the W3C standard SPARQL 1.1 syntax and applied them with two off-the-shelf SPARQL engines to UniProtKB/Swiss-Prot protein sequences described in RDF format. This approach is applicable to any genome or proteome annotation pipeline and greatly simplifies the reuse of such pipelines.
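As a rough illustration of what expressing an annotation rule as a SPARQL query can look like, the sketch below renders a single "if these conditions hold, then add this annotation" rule as a SPARQL 1.1 CONSTRUCT query. The abstract does not state which query form the HAMAP rules use, so CONSTRUCT is an assumption here, and the `ex:` namespace, the property names, and the rule values are all hypothetical and do not reflect the actual HAMAP or UniProt RDF vocabulary.

```python
from string import Template

# Hypothetical vocabulary: the ex: namespace and every property name below are
# illustrative assumptions, not the real HAMAP or UniProt RDF schema.
RULE_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/annotation#>
CONSTRUCT {
  ?protein ex:recommendedName "$name" .
}
WHERE {
  ?protein ex:matchesSignature ex:$signature ;
           ex:organismInTaxon ex:$taxon .
}""")


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rule_to_sparql(signature: str, taxon: str, name: str) -> str:
    """Render one 'conditions -> annotation' rule as a CONSTRUCT query string."""
    return RULE_TEMPLATE.substitute(
        signature=signature,
        taxon=taxon,
        name=escape_literal(name),
    )


# Hypothetical rule: proteins matching signature SIG_0001 in taxon Bacteria
# receive the name "Example protein".
query = rule_to_sparql("SIG_0001", "Bacteria", "Example protein")
print(query)
```

The resulting string could, in principle, be submitted to any SPARQL 1.1 engine that holds the protein data as RDF; the triples the query constructs would then be the proposed annotations. Keeping each rule as a standalone standard query is what would make such a rule set portable across engines.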

Availability

HAMAP SPARQL rules and documentation are freely available for download from the HAMAP FTP site (ftp://ftp.expasy.org/databases/hamap/hamapsparql.tar.gz) under a CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license.

Contact

hamap@sib.swiss

Supplementary information

Supplementary data are included at the end of this document.
