College of Engineering News
Weixiang Zhao and Cristina Davis
An Inspired New Way to Mine Data
Posted on: September 25, 2009
Researcher Weixiang Zhao, who works with Assistant Professor Cristina Davis in the Department of Mechanical and Aerospace Engineering, has found a novel approach to a difficult problem. Zhao proposed using an ant colony optimization algorithm to identify biomarkers in mass spectrometry data, something that had never been tried before. The findings from his and Davis’s investigations were published in a recent issue of Analytica Chimica Acta, a journal devoted to the rapid publication of original research in analytical chemistry. The research has also been highlighted by separationsNOW.com, an online magazine published by John Wiley & Sons.
Mass spectrometry (MS) is a chemical analysis and separation technique that identifies and quantifies the atoms and molecules in a sample by separating their ions according to mass-to-charge ratio, which reveals their molecular weight. The process is used to explore a wide range of chemical landscapes, from the environment to the human body to food. But MS necessarily yields immense deposits of complicated data that must then be mined to reveal information relevant to the problem at hand. Because such data is complex and non-linear in character, it is difficult to analyze.
“We needed an efficient and adaptive strategy for selecting significant features from such high dimensional data in order to reveal the most pertinent and important information,” Zhao says. “Mass spectrometry data is often contaminated by noise or meaningless information. An ant colony optimization algorithm seemed like a possible solution.”
An ant colony optimization (ACO) algorithm is a problem-solving tool that mimics the swarm intelligence of ants searching for food. The ants begin by searching at random; over time, through trial and error, they find and chemically mark the most efficient route to a food source. Shorter routes are traveled more often, so their chemical trails grow stronger and attract still more ants. The algorithm, like the ant colony, uses this positive feedback to identify optimal routes and patterns. Zhao had read a paper about the use of ACO algorithms in another discipline and was inspired to try this tool on mass spectrometry data.
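To make the feedback loop concrete, here is a minimal Python sketch of an ant colony choosing a small set of features. It is an illustration only and not the method published by Zhao and Davis: the function name aco_select_features, the toy_score function, the "informative" features 3 and 7, and all parameter values are assumptions invented for this example.

```python
import random

def aco_select_features(score, n_features, subset_size,
                        n_ants=20, n_iters=30, evaporation=0.1, seed=0):
    """Toy ant colony optimization for choosing a feature subset.

    `score` is a hypothetical callback: it takes a tuple of feature
    indices and returns a positive number, higher meaning better.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_features  # every feature starts equally attractive
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iters):
        trails = []
        for _ in range(n_ants):
            # Each ant picks features at random, weighted by pheromone.
            candidates = list(range(n_features))
            chosen = []
            for _ in range(subset_size):
                weights = [pheromone[i] for i in candidates]
                pick = rng.choices(candidates, weights=weights, k=1)[0]
                chosen.append(pick)
                candidates.remove(pick)
            subset = tuple(sorted(chosen))
            trails.append((score(subset), subset))

        # Evaporation: old trails fade, so early bad choices are forgotten.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]

        # Positive feedback: the best ant of this round reinforces its features.
        iter_score, iter_subset = max(trails)
        for i in iter_subset:
            pheromone[i] += iter_score
        if iter_score > best_score:
            best_score, best_subset = iter_score, iter_subset

    return best_subset, best_score, pheromone


# Hypothetical stand-in for a classifier: features 3 and 7 carry the signal.
INFORMATIVE = {3, 7}

def toy_score(subset):
    return 1 + sum(1 for i in subset if i in INFORMATIVE)

best_subset, best_score, pheromone = aco_select_features(
    toy_score, n_features=10, subset_size=2)
print(best_subset, best_score)
```

In a real analysis the scoring function would be something like the accuracy of a classifier trained on the selected features. The toy score here only stands in for that step.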
Zhao believed that the ACO algorithm, when applied to the mechanics of a complicated system, had the potential to identify biomarkers of disease more quickly and efficiently. Moreover, coupled with wavelet analysis, the approach would allow the researchers not only to classify the data but also to return to the raw information and pinpoint the pertinent markers for mechanism studies.
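The idea of returning from a wavelet coefficient to the raw data can be sketched in a few lines of Python. The article does not say which wavelet the researchers used, so the simple Haar wavelet below is an assumption, as are the function names and the made-up eight-point "spectrum". Each coefficient summarizes a known stretch of the original signal, which is what lets a selected coefficient point back to a location in the raw data.

```python
def haar_level(signal):
    """One level of an unnormalized Haar transform: pairwise means and half-differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return (final approximation, list of detail coefficients per level).

    details[0] is level 1 (finest). The signal length is assumed to be
    divisible by 2 ** levels.
    """
    details = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_level(current)
        details.append(detail)
    return current, details

def raw_span(level, index):
    """Raw sample range [start, stop) covered by one detail coefficient.

    `level` is 1-based, so a level-1 coefficient covers 2 raw samples.
    """
    width = 2 ** level
    return index * width, (index + 1) * width


# Made-up spectrum with a single peak at raw positions 4 and 5.
spectrum = [0, 0, 0, 0, 5, 9, 0, 0]
approx, details = haar_decompose(spectrum, levels=2)

# Find the largest level-1 coefficient and map it back to the raw data.
level1 = details[0]
strongest = max(range(len(level1)), key=lambda i: abs(level1[i]))
start, stop = raw_span(1, strongest)
print(strongest, (start, stop), spectrum[start:stop])
```

Here the strongest level-1 coefficient covers exactly the two raw samples that contain the peak, so a feature selected in the wavelet domain can be traced back to a position in the original spectrum.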
So Zhao and Davis tested the algorithm on a large collection of publicly available data from an NIH-funded study of ovarian cancer patients and a healthy cohort. They wanted to see whether the ACO-selected wavelet coefficients, or features, could locate the biomarkers in the original MS data. The technique proved to be highly efficient. After running the ACO analysis 100 times, the team was able to distinguish between the healthy and diseased groups with 98.8% accuracy and to identify important features and biomarkers that accounted for the difference. This new strategy for data analysis could ultimately lead to better, earlier diagnosis of disease.
Meanwhile, Davis and Zhao are collaborating with other researchers to apply the data analysis strategy to other, more difficult problems. For example, they are working with UC Davis plant scientist Professor Abhaya Dandekar and metabolomics expert Professor Oliver Fiehn to analyze, in a similar way, volatile compounds emitted by citrus trees infected with huanglongbing (HLB), a deadly disease that is difficult to detect and can decimate mature orchards.
Davis, whose research focuses on developing novel sensors for disease detection and environmental quality monitoring, hopes to develop sensors that could go into the field, detect the presence of HLB, and allow timely removal of diseased trees before an entire orchard is lost.
The multidisciplinary aspect of this work is important in a number of ways, Davis says. “Most importantly, we can find inspiration from ideas in disciplines that are not our own, apply them to our area of research and make a leap forward,” she says. “Much of what scientists do is incremental, making progress over time. But it’s the leaps that push science forward. Zhao’s inspired idea to apply this algorithm to a new problem could be one of those leaps.”
