Special Report
April 2017

Data-driven drugs

DOE-NCI project applies computational power to predict molecular targets for cancer therapies.

The proteins that make up the RAS subfamily pathway. Image courtesy of Elaine Meng / Wikimedia Commons.

This is the fourth and final installment in a series about the Department of Energy collaboration with the National Cancer Institute to use big data and high-performance computing in cancer research.

Combining advanced sensing systems and machine-learning algorithms allows a driverless vehicle to anticipate and avoid dangerous situations faster and more accurately than a human driver. To comprehend the biochemical signaling pathways that go awry in cancer, scientists are adapting similar approaches – applying the power of machine learning and simulation to decades of biomedical research.

High-performance computing is now powerful enough to address the daunting complexity that has stymied breakthrough treatments. That premise is behind a collaboration led by Fred Streitz of the Department of Energy’s (DOE) Lawrence Livermore National Laboratory and Dwight Nissley of the Frederick National Laboratory for Cancer Research (FNLCR), one of four pilot projects between DOE and the National Cancer Institute (NCI). Streitz, Nissley and colleagues aim to create a predictive molecular-scale model to accelerate the development of diagnostics and targeted therapies.

Rick Stevens of the DOE’s Argonne National Laboratory, a co-principal investigator in the DOE-NCI collaborations, says the problem is so “complicated we don’t want people to be steering the simulation. We want the computer to steer itself. We wrap the simulation in machine learning so the computer will learn where it needs to do more work to reduce uncertainty in what’s happening in the pathway.”

The pathway in question here is called RAS/RAF. (RAS is a protein named for where it was first found, in rat sarcoma; RAF, for rapidly accelerated fibrosarcoma, refers to an enzyme.) Scientists have long understood that molecular-scale alterations in RAS/RAF help drive cancer in the lung, colon, pancreas and other organs – and that when RAS is involved, patients have a poor prognosis. Despite this understanding, therapies haven’t been able to target the cancer-causing form of RAS. The complex environmental and multiscale interactions that drive RAS-initiated cancer growth have led some scientists to conclude that RAS pathways are impossibly elusive targets for anti-cancer drugs.

But the research team steering this pilot project has decades of progress to help guide them. Scientists have made great strides in the past 30 years toward understanding RAS signaling pathways. Biological data is available from analyzing cellular genetic, protein and metabolic processes, from structural studies and from other sources.

The idea is to apply multiscale, physics-based dynamic simulations, data analytics, machine learning and advanced computing architectures to go beyond the limitations of biological experiments, which can only partially reconstruct the tumor environment.

Nissley notes that “the RAS protein operates exclusively in the context of a cell membrane. If we’re to develop an understanding of oncogenic RAS behavior, we need to start there.” To capture that behavior, Streitz adds, the team plans to model RAS mechanics with a realistic simulated membrane.

The scientists believe that simulations spanning ranges of times and physical sizes will be necessary to assess the biological repercussions of RAS mutations. The models will be built to represent not only the cancer microenvironment but also other environmental and genetic factors specific to individual cancers. Combining simulations with experimental data lets the research team draw on a vast amount of information, something that is feasible only with high-performance analytics that remove human guesswork from the equation.

“We are building a new kind of molecular modeling tool that can in essence zoom in and out on the interactions,” Stevens says. “It can use fully atomistic molecular dynamics for fine detail or it can zoom out to coarse-grained molecular mechanics. It can go to where interesting things are happening in the pathway.”

Streitz says the team will build dynamic models of RAS protein biology in varying cellular membrane compositions and couple them with experiments at FNLCR. Simulations will exploit next-generation computing architectures in supercomputers arriving at Livermore, Argonne and the DOE’s Oak Ridge National Laboratory. The experiments will incorporate advanced imaging and interaction data from a range of tools that include cryo-EM, NMR, crystallography and neutron sources available at the FNLCR. The work will focus on membrane-bound RAS, mutated RAS and RAS complexes.

Once a dynamic model is built and validated against experiment, the team can begin to explore the many avenues of the RAS signaling pathway in search of possible drug targets.

“By wrapping a machine-learning framework around our modeling capability,” Streitz says, “we can anticipate spawning thousands or millions of simulations, each exploring an aspect of the pathway.”
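
To make the idea of a self-steering simulation campaign concrete, here is a minimal Python sketch. It is not the team's software: the region names, the toy run_simulation function and the rule of picking the region with the largest standard error are all assumptions made for illustration. The loop simply spends each new run wherever the current estimate is least certain.

```python
import random
import statistics


def run_simulation(region, rng):
    """Hypothetical stand-in for one expensive simulation of a pathway region.

    Each region is a (true_mean, noise) pair; a real simulation would be a
    molecular dynamics run, not a random draw.
    """
    mean, noise = region
    return rng.gauss(mean, noise)


def standard_error(samples):
    """Uncertainty of the current mean estimate for one region."""
    return statistics.stdev(samples) / len(samples) ** 0.5


def steer(regions, budget, initial=5, seed=0):
    """Allocate a simulation budget to the regions with the most uncertainty."""
    rng = random.Random(seed)
    # Give every region a few runs so an uncertainty estimate exists.
    results = {
        name: [run_simulation(region, rng) for _ in range(initial)]
        for name, region in regions.items()
    }
    for _ in range(budget):
        # Spend the next run where the estimate is currently least certain.
        target = max(results, key=lambda name: standard_error(results[name]))
        results[target].append(run_simulation(regions[target], rng))
    return results


# Illustrative regions (assumed values): the noisier the region, the more runs it needs.
regions = {"quiet": (1.0, 0.05), "moderate": (2.0, 0.5), "volatile": (3.0, 2.0)}
results = steer(regions, budget=300)
counts = {name: len(samples) for name, samples in results.items()}
print(counts)
```

In this toy setup most of the budget flows to the noisiest region. That is the behavior Stevens describes when he says the computer learns where it needs to do more work. A production system would replace the standard-error rule with a trained machine-learning model.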

By feeding this modeling information and new experimental data back into the machine-learning algorithm, the team hopes to map the pathways relevant for cancer initiation with RAS proteins – and thus point to possible drugs that target these pathways as they turn from normal to cancer-promoting.