Learning millisecond protein dynamics from what is missing in NMR spectra

Abstract

Many proteins’ biological functions rely on interconversions between multiple conformations occurring at micro- to millisecond (µs-ms) timescales. A lack of standardized, large-scale experimental data has hindered obtaining a more predictive understanding of these motions. After curating >100 Nuclear Magnetic Resonance (NMR) relaxation datasets, we realized an observable for µs-ms motion was hiding in plain sight. Millisecond motions can cause NMR signals to broaden beyond detection, leaving some residues unassigned in the chemical shift datasets of ∼10,000 proteins deposited in the Biological Magnetic Resonance Data Bank (BMRB) [1]. We made the bold assumption that residues missing assignments are exchange-broadened due to µs-ms motions, and trained various deep learning models to predict missing assignments as markers for such dynamics. Strikingly, these models also predict µs-ms motion directly measured in NMR relaxation experiments. The best of these models, which we named Dyna-1, leverages an intermediate layer of the multimodal language model ESM-3 [2]. Notably, dynamics directly linked to biological function, including enzyme catalysis and ligand binding, are particularly well predicted by Dyna-1, which parallels our finding that residues with µs-ms motions are highly conserved. We anticipate the datasets and models presented here will be transformative in unlocking the common language of dynamics and function.
