Machado: open source genomics data integration framework

Abstract

Background

Genome projects and multiomics experiments generate huge volumes of data that must be stored, mined, and transformed into useful knowledge. All this information is expected to remain accessible and, where possible, browsable afterwards. Computational biologists have been dealing with this scenario for over a decade and have implemented software libraries, toolkits, platforms, and databases to address it. The biological relational database schema of the Generic Model Organism Database (GMOD) project, known as Chado, is one of the few successful open source initiatives: it is widely adopted, and many software tools are able to connect to it.

Results

We have been developing an open source software named Machado (https://github.com/lmb-embrapa/machado), a genomics data integration framework implemented in Python, to enable research groups to store, browse, query, and visualize genomics data. The framework relies on the Chado database schema and should therefore be intuitive for developers already working with Chado to adopt, or to run on top of existing databases. It has several data loading tools for genomics and transcriptomics data, and also for annotation results from tools such as BLAST, InterProScan, OrthoMCL, and LSTrAP. There is an API to connect to JBrowse, and a web browsing visualisation tool is implemented using Django Views and Templates. The Haystack library, integrated with the ElasticSearch engine, was used to implement a Google-like search, i.e. a single auto-complete search box that provides fast results and incremental filters.
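
The search behaviour described above (one auto-complete box, then incremental filters) can be illustrated with a minimal, library-free Python sketch. This is not Machado's implementation, which relies on Haystack and ElasticSearch; the sample records, field names, and function names below are all hypothetical and serve only to show the idea of prefix matching followed by successive narrowing.

```python
from collections import Counter

# Hypothetical records; the field names are illustrative, not Machado's schema.
FEATURES = [
    {"name": "AT1G01010", "organism": "Arabidopsis thaliana", "type": "gene"},
    {"name": "AT1G01020", "organism": "Arabidopsis thaliana", "type": "gene"},
    {"name": "AT1G01010.1", "organism": "Arabidopsis thaliana", "type": "mRNA"},
    {"name": "ATP6", "organism": "Oryza sativa", "type": "gene"},
]


def autocomplete(prefix, records, limit=5):
    """Return up to `limit` sorted names that start with `prefix` (case-insensitive)."""
    p = prefix.lower()
    names = sorted(r["name"] for r in records if r["name"].lower().startswith(p))
    return names[:limit]


def search(prefix, records, **filters):
    """Prefix search, narrowed by exact-match filters such as type='gene'."""
    p = prefix.lower()
    hits = [r for r in records if r["name"].lower().startswith(p)]
    for field, value in filters.items():
        hits = [r for r in hits if r[field] == value]
    return hits


def facet_counts(hits, field):
    """Count hits per value of `field`, e.g. to label the next available filters."""
    return Counter(r[field] for r in hits)
```

A typical interaction would call `autocomplete("at1g", FEATURES)` while the user types, then `search("at", FEATURES, type="gene")` once a filter is selected, with `facet_counts` supplying the number of results behind each remaining filter. A real search engine replaces the linear scans with an inverted index, which is what makes the results fast on genome-scale data.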

Conclusion

Machado aims to be a modern object-relational framework that uses the latest Python libraries to produce an effective open source resource for genomics research.
