The landscape of biomedical research
Abstract
The number of publications in biomedicine and the life sciences has grown rapidly over recent decades, with over 1.5 million papers now published every year. This makes it difficult to keep track of new scientific work and to maintain an overview of how the field as a whole is evolving. Here we present a 2D map of the entire corpus of biomedical literature and argue that it provides a unique and useful overview of life sciences research. We based our atlas on the abstract texts of 21 million English-language articles from the PubMed database. To embed the abstracts into 2D, we used the large language model PubMedBERT combined with t-SNE, tailored to handle samples of this size. We used our atlas to study the emergence of the Covid-19 literature, the evolution of the neuroscience discipline, the uptake of machine learning, the distribution of gender imbalance in academic authorship, and the distribution of retracted paper mill articles. Furthermore, we present an interactive web version of our atlas that allows easy exploration, which we expect to enable further insights and facilitate future research.