Software hunts for malignant mutations
December 2, 2009 | by Greg Rienzi

Bert Vogelstein and Kenneth Kinzler, co-directors of the Ludwig Center at Johns Hopkins, liken themselves to detectives, except that instead of hunting criminals, they hunt rogue cells. Vogelstein and Kinzler are among the pioneers in uncovering the genetic mutations responsible for the onset and development of cancer. They work to better understand the DNA changes, or mistakes, in genetic instructions. In particular, they focus on somatic mutations, or cell mutations, that either reduce the activity of tumor-suppressing proteins or hyperactivate other proteins, and thus make it easier for tumors to grow and spread. Cancer cells develop lots of mutations, Kinzler explains, but not all of them are relevant. Finding the 5 percent to 20 percent that are worth studying can be time-consuming. It’s not unlike a detective’s need to narrow down a long list of suspects. “You need to know what suspects to investigate further and rule out others,” he says.

Now, a team of Johns Hopkins engineers has developed groundbreaking computer software that will help zero in on those relevant suspects. Rachel Karchin, an assistant professor of biomedical engineering at the Whiting School of Engineering, supervised development of mathematical software that enables scientists to significantly speed up the hunt for cancer triggers. “The simple idea is to prioritize these mutations for researchers,” Karchin says.

The new computational method, called CHASM—short for Cancer-specific High-throughput Annotation of Somatic Mutations—can sift through thousands of newly discovered genetic mutations in cancer cells to highlight the DNA changes most likely to promote tumor growth and rule out those just along for the ride. The software was developed via the university’s Institute for Computational Medicine—a joint effort of the Whiting School and the School of Medicine that focuses on research aimed at identifying, analyzing, and comparing basic biological components and processes that regulate human disease.

Karchin, along with doctoral student Hannah Carter, tested CHASM on brain cancer DNA. The team put 600 potential brain cancer mutations through an algorithm that would classify them as either “drivers,” mutations likely to initiate tumorigenesis or tumor progression, or “passengers,” those present when a tumor forms but not involved in its development. The researchers used a machine-learning technique in which roughly 50 characteristics associated with known cancer-causing mutations were given numerical values and programmed into the system. Karchin and Carter then employed a classifier known as a random forest to help separate and rank the drivers. In this mathematical forest, hundreds of if-x-then-y “decision trees” consider each mutation and vote for whether it is a driver or a passenger. The more driver votes a mutation receives, the more likely it’s a cancer trigger. The results of the study, co-authored with Kinzler, were published in the August 15 issue of the journal Cancer Research.
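
For readers curious how such a vote works, here is a toy sketch in Python. It is not CHASM's code. The two features (a conservation score and a recurrence count), every number in it, and the function names are invented for illustration, and each "tree" is reduced to a single if-x-then-y rule, whereas the system described above draws on roughly 50 characteristics and full decision trees.

```python
import random

# Hypothetical training data: (conservation score, recurrence count).
# All values are invented; label 1 = known driver, 0 = known passenger.
ROWS = [(0.9, 5), (0.8, 4), (0.95, 6), (0.7, 3),
        (0.2, 1), (0.3, 0), (0.1, 1), (0.4, 2)]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def train_stump(rows, labels, feature):
    """Find the one-question rule on `feature` that best splits the sample."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        for driver_if_above in (True, False):
            correct = sum(
                ((r[feature] >= threshold) == driver_if_above) == bool(y)
                for r, y in zip(rows, labels)
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, driver_if_above)
    _, threshold, driver_if_above = best
    return feature, threshold, driver_if_above


def train_forest(rows, labels, n_trees=200, seed=0):
    """Train many tiny trees, each on a random resample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))             # random feature
        forest.append(train_stump([rows[i] for i in picks],
                                  [labels[i] for i in picks], feature))
    return forest


def driver_score(forest, row):
    """Fraction of trees that vote 'driver' for this mutation."""
    votes = sum((row[f] >= t) == above for f, t, above in forest)
    return votes / len(forest)


forest = train_forest(ROWS, LABELS)
likely_driver = driver_score(forest, (0.9, 5))      # resembles known drivers
likely_passenger = driver_score(forest, (0.15, 1))  # resembles passengers
print(f"driver-like mutation: {likely_driver:.2f} of votes")
print(f"passenger-like mutation: {likely_passenger:.2f} of votes")
```

Ranking candidate mutations by this vote fraction, highest first, gives the kind of prioritized list the researchers describe.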

With the drivers identified, researchers can examine the functional consequences of these mutations to develop cancer-fighting therapies, such as mutation-specific drugs. “Cancer cells are competing with the normal cells for resources: space, blood, and nutrients. It’s survival of the fittest,” Karchin says. “Certain mutations can give cancer cells an advantage over their neighbors. But which ones are they? That is what we want to know.”

Kinzler takes his criminal analogy one step further: “This new software will help identify mutations that we can put up on the ‘post office wall.’ Researchers might see the picture—the mutation—at several different crime scenes, or gangs of them active in many cancers.”

To date, Karchin’s lab has helped score genetic mutations involved in pediatric brain cancer, chronic lymphocytic leukemia, melanoma, and lung cancer. The method has since been adapted to rank the mutations that may be linked to other cancers, such as breast and colorectal.

Karchin admits that the machine learning, while useful, doesn’t give a perfect prediction. Just because a mutation has been seen a number of times in cancer cells, she says, that doesn’t mean for certain that it’s a driver mutation. “We are making the assumption that these recurrent mutations are drivers,” she says. “So there is some room for error.”

The actual error rate falls between 15 percent and 20 percent, but Carter says that the rate will only decrease as the system’s training set gets populated with more known drivers.

Karchin’s group is preparing to distribute free CHASM software tools to academic researchers worldwide. The software, Karchin says, will be used to determine what mutations should be further explored in cell culture and animal models. “Learning which mutations are the ‘drivers’ will bring us closer to the goal of personalized cancer treatments, in which therapies are selected based on the genetic profile of an individual’s tumor,” she says.