Using AI to speed up NMR-based protein structure determination
 

Prof. Dr. Roland Riek, based at ETH Zürich in Switzerland, has a broad range of research interests, most of which involve the application of NMR to biological systems, and proteins in particular. But his latest project is rather different from the rest, due to its focus on how to accelerate structure determination itself, using artificial intelligence (AI) to streamline the processes of NMR data acquisition and analysis.

Like many people involved in nuclear magnetic resonance (NMR) spectroscopy, Prof. Dr. Roland Riek originally became interested in the technology because of its interdisciplinary nature. Having studied physics but being fascinated by biology, he chose the one research area that, at the time (the mid-1990s), straddled both disciplines – NMR. His diploma work in the group of Prof. K. Wüthrich at ETH Zürich, Switzerland, was a major motivation, he says: “They were deeply involved with developing the technique at that time, and their expertise was amazing. I knew at that point that I’d want to continue with NMR.”

And this enthusiasm continues today. “What I love about NMR,” says Prof. Riek, “is the versatility and resolution of the technique. The fact that you can study atomic-level interactions in protein structures with upwards of 200 amino acids…that’s fascinating. And although the processes involved in NMR are complicated, I love the challenge of establishing what the spectra reveal.”

The breadth of Prof. Riek’s research is also impressive and includes: solving so-called ‘multistate structures’ with sub-Angstrom resolution; understanding the mechanism of protein aggregation in Parkinson’s disease; investigating the enzymes involved in cancer development; improving NMR signal-to-noise ratios using chemically induced dynamic nuclear polarization (CIDNP); making strides with in-cell NMR techniques; and considering self-aggregating crystals as a possible mechanism for the origin of life.

And even though, day-to-day, he is busy supervising his research group of 15–20 researchers and contributing to the meetings that come with being a senior figure at ETH Zürich, he still loves the practical aspect of the technology: “About once a week, I visit our NMR facility to set up experiments and analyze the data,” he says.

Looking at proteins using NMR: Making movies versus taking snapshots

A key advantage of NMR, says Prof. Riek, is its ability to provide an understanding of how proteins behave at the atomic level: “It’s a bit like being able to generate a movie showing a factory in operation, but scaled down 10 billion times.”

This stands in contrast to other methods, Prof. Riek points out. “Methods such as cryo-electron microscopy and X-ray crystallography certainly have their value, but with those methods you’re simply getting snapshots in time – the proteins are either literally frozen in the matrix or are immobilized in a crystal. That means you have to do a lot of work to reconstruct the whole process.” NMR, he says, brings the unique ability to study proteins as they are moving around in solution, which opens up many opportunities to see how they work: “You can begin to understand how they fold, how they move, how they bind to other molecules – it’s an incredibly powerful and versatile method.”

New instruments, new methods, new insights

And this capability is driven by the availability of high-resolution instruments – including the Bruker 1.2 GHz NMR system installed at ETH Zürich in June 2020. Although his time on this instrument has been limited so far, Prof. Riek has been impressed by what he has seen: “I’ve been amazed by the peak resolution that’s possible with this instrument,” he says. “Even though I knew what to expect in theory, it was still a big surprise when I saw the spectra for the first time.”

The Bruker 1.2 GHz NMR spectrometer at ETH Zürich, installed in June 2020, was the second such system in the world to be installed.

 

Prof. Riek has plans to use the new equipment in two main ways. “Firstly, we’d like to develop new methods on these high-field systems – we’re always pushing the boundaries of NMR, and you never know where it might lead. The application might not be apparent straight away, but with NMR you can almost guarantee that, in due course, there will be a field of study that will benefit from a new method. I think that’s often the way with this open-ended, ‘blue sky’ type of research.”

“And secondly, we’re interested in advancing the understanding of biomolecules within cells, to develop new approaches to treating neurodegenerative disorders such as Parkinson’s disease. Once we understand what’s going on at the molecular level, we can approach these problems rationally, and ultimately develop new techniques to treat them.”

Using AI to break the information bottleneck

But with these research possibilities comes a problem – data. “Progress in biological NMR is hindered by data availability and processing time,” says Prof. Riek. “Considering data availability first, it’s notable that out of the many thousands of protein structures studied by NMR, only a tiny fraction have had the original datasets made available to other researchers. That is currently a huge unresolved problem in NMR.”

But the second difficulty – the time taken to run NMR experiments and analyze the results – is one he thinks he can tackle using the rapidly developing field of artificial intelligence (AI).

“Currently,” he says, “it takes anything from six months to a few years to fully characterize a protein structure: taking all the measurements and analyzing all the data is very time-consuming and requires constant expert judgement.” This bottleneck holds back progress in the field, he thinks – and finding a solution to it is difficult. “It isn’t easy to speed up the process of acquiring NMR data,” he points out. “We’re limited by the need to culture biological media and prepare the samples for analysis, and then by the defined amount of time it takes to run NMR pulse sequences. But what we can do is use the time on the NMR instrument more effectively and streamline data analysis.”

Prof. Riek suggests that AI offers a route to achieving these two goals: “By training algorithms to assess the results as they’re generated, and then make automated changes to the experiments on-the-fly, we can save a lot of time, by only running those pulse sequences that are necessary for solving the structure. And because instrument time is expensive, we’d also save money in the process.” Prof. Riek’s work in this area is still at an early stage, but he’s confident that once initial results have been published, the benefits of this approach will be wide-reaching.

A revolution in data handling for biological NMR?

So, what does Prof. Riek think might be the impact of AI upon everyday protein structure determination using NMR?

He’s optimistic: “Just imagine if we could generate a protein structure following, say, only two weeks of data acquisition and five hours of analysis. That might sound far-fetched now, but I think we could realistically see that within 1–2 years. If so, I think it would revolutionize the way that biochemistry is done, by accelerating the investigation of highly complex biomolecules, and suddenly making possible what was previously impractical.”

And this links back to Prof. Riek’s earlier point about the unexpected applications of fundamental research. “You can’t necessarily predict what advances might emerge from a new methodology. In the field of NMR, maybe there’s a way of conducting experiments that show how individual water molecules interact with proteins. If so, would that be useful? Maybe it would – but we won’t know until we try.”

“I think it’s part of human nature to want to try new things, even if you can’t envisage an application immediately. In science, that means coming up with the methodology first, and seeing where it leads you.”

Collaboration and the future: From protein structures to protein dynamics

Like that of many researchers in NMR, Prof. Riek’s work depends upon collaboration. “As well as working closely with Bruker’s scientists, we liaise with several groups who contribute to our research objectives from different angles, using different analytical methods. Proteins are inherently complex targets – especially once you add in the processes of aggregation and disaggregation implicated in neurodegenerative disorders. Then it’s essential to work with multiple researchers using different methods to extract the maximum information.”

This is a paradigm of working that he’s keen to promote: “We’re currently setting up a server that will enable us to offer an AI-based NMR service to collaborators – you upload your NMR data, and you get back the automatically-generated structure. It’s coming along nicely.”

With these and other advances made by Prof. Riek’s team, the future looks bright. “In the coming years, I think we’ll see most NMR research focus, on the one hand, on determining many protein structures per year and, on the other hand, on protein dynamics at atomic resolution, by combining dynamics measurements with multistate protein structure determination – to obtain a complete picture of how proteins move and operate. This is a hugely exciting prospect.”

For more information about Prof. Riek’s research, please visit https://chab.ethz.ch/forschung/institute-und-laboratorien/LPC.html
To learn more about Bruker’s NMR instruments, please visit https://www.bruker.com/en/products-and-solutions/mr/nmr.html
 

Prof. Dr. Roland Riek

Professor of Physical Chemistry and head of the Bio-NMR group at the Department of Chemistry and Applied Biosciences at the Swiss Federal Institute of Technology (ETH) Zürich, where he has been since 2007. Prior to that, he was Director of the NMR facility at the Salk Institute for Biological Studies in La Jolla, California.