An Automated Metabolomics Data Processing Workflow to Distinguish Metabolite Profiles

Hayley Abbiss1,2,3, Joel P.A. Gummer2,4, Laura F. Grogan5,6, Robert D. Trengove2,4, John Moncur3, Scott J. Campbell3

1School of Veterinary and Life Sciences, Murdoch University, WA, Australia; 2Separation Science and Metabolomics Laboratory, Murdoch University, WA, Australia; 3SpectralWorks, Cheshire, United Kingdom; 4Metabolomics Australia, Western Australian Node, WA, Australia; 5One Health Research Group, College of Public Health, Medical and Veterinary Sciences, James Cook University, QLD, Australia; 6Griffith Wildlife Disease Ecology Group, Environmental Futures Research Institute, School of Environment, Griffith University, QLD, Australia

First Published: BMSS 2016, Eastbourne 2016


Metabolomics is the study of metabolites in cells, tissues and/or biofluids. These sample types are complex mixtures, requiring complementary tools to capture a global profile. Fast, automated software solutions are required to handle the resulting large data sets for 1) interrogation of data quality and 2) interpretation of experimental data. AnalyzerPro® is a vendor-neutral chromatographic deconvolution package with qualitative and quantitative processing capability. Its data interrogation and interpretation features include principal component analysis (with options for data transformation), univariate statistical analysis and graphical representations of retention time and signal drift.


Using a test set of biological samples spiked with authentic standards, SpectralWorks’ AnalyzerPro was evaluated as a metabolomics data processing workflow, as shown in Figure 1.

Figure 1. A typical metabolomics workflow using AnalyzerPro


Shimadzu GC-MS data were used. Each full-scan file was approximately 80 MB and took less than 2 min to process, including peak picking, deconvolution, retention index (RI) calculation, library matching and data matrix generation.
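The RI calculation step is not described further here. As a rough illustration only, the sketch below computes a linear retention index (in the style of van den Dool and Kratz) by interpolating between the n-alkanes that bracket a peak. The function name, the alkane ladder and the retention times are assumptions made for this example and do not represent AnalyzerPro's implementation.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index for a peak at retention time `rt`.

    `alkane_rts` maps n-alkane carbon number -> retention time (same units
    as `rt`). Hypothetical helper written for illustration only.
    """
    # Sort the alkane ladder by retention time.
    ladder = sorted(alkane_rts.items(), key=lambda item: item[1])
    times = [t for _, t in ladder]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")

    # Index of the alkane eluting at or just before the peak.
    i = min(bisect_right(times, rt) - 1, len(ladder) - 2)
    (n_lo, t_lo), (n_hi, t_hi) = ladder[i], ladder[i + 1]

    # Interpolate linearly between the two bracketing alkanes.
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))


# Made-up ladder: C10 at 5.0 min, C11 at 6.0 min, C12 at 7.5 min.
example_ladder = {10: 5.0, 11: 6.0, 12: 7.5}
print(linear_retention_index(5.5, example_ladder))  # 1050.0
```

A peak eluting exactly with an alkane receives 100 times that alkane's carbon number, which is a quick sanity check for any RI calibration.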

Figure 2. Data Processing

Figure 3. Correction for signal drift, review RT reproducibility
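The correction model behind Figure 3 is not specified in this poster. One simple way to think about signal drift correction is to fit a straight line to a component's intensity against injection order and then divide out that trend. The sketch below assumes such a linear model; the function names and the numbers are hypothetical, and real software may use a different approach.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_drift(injection_order, intensity):
    """Divide out a fitted linear trend, rescaled to the original mean.

    Assumes the fitted trend stays positive over the run; a trend that
    crosses zero would need a different model.
    """
    slope, intercept = fit_line(injection_order, intensity)
    mean_intensity = sum(intensity) / len(intensity)
    return [
        y * mean_intensity / (slope * order + intercept)
        for order, y in zip(injection_order, intensity)
    ]


# Made-up example: signal falls steadily across five injections.
order = [1, 2, 3, 4, 5]
signal = [100.0, 90.0, 80.0, 70.0, 60.0]
corrected = correct_drift(order, signal)  # each value is close to 80.0
```

In practice the trend is usually estimated from an internal standard or pooled quality-control injections rather than from the study samples themselves, so that genuine biological differences are not removed along with the drift.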

Principal component analysis (PCA) and univariate statistics are calculated on the fly, and the data matrix is updated as the data are transformed. PCA is recalculated whenever the data are transformed or thresholds (for p-value and fold change) are applied. The matrix can be exported and saved at any time during the analysis.
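To make the p-value and fold-change thresholds concrete, the sketch below filters a small data matrix using only the Python standard library. The univariate test used by the software is not named here, so this example substitutes an exact two-sided permutation test on the difference of group means. The function names, default thresholds and data are all assumptions for illustration.

```python
from itertools import combinations
from math import log2
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means.

    Enumerates every relabelling, so it is only practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so ties with the observed value are counted.
        hits += diff >= observed - 1e-12
        count += 1
    return hits / count


def filter_components(matrix, p_max=0.05, min_abs_log2_fc=1.0):
    """Keep components passing both thresholds.

    `matrix` maps component name -> (group A intensities, group B
    intensities). Intensities must be positive for the log2 fold change.
    """
    kept = {}
    for name, (a, b) in matrix.items():
        log2_fc = log2(mean(b) / mean(a))
        p = permutation_p_value(a, b)
        if p < p_max and abs(log2_fc) >= min_abs_log2_fc:
            kept[name] = (log2_fc, p)
    return kept


# Made-up intensities for two components in two sample groups.
example = {
    "component_1": ([1.0, 2.0, 3.0, 4.0], [10.0, 11.0, 12.0, 13.0]),
    "component_2": ([5.0, 6.0, 7.0, 8.0], [5.5, 6.5, 7.5, 8.5]),
}
significant = filter_components(example)  # only component_1 passes
```

Tightening either threshold shrinks the set of retained components, which is why a PCA built on the filtered matrix has to be recomputed after each change.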

Figure 4. Results view

There was significant signal drift (P = 0.036, Figure 3). With signal correction, 48 components were significant. Without signal correction, 33 components were found to be significantly different (P < 0.05) between sample groups. Of these 33, 14 were unidentified, 9 were matched to the NIST library and automatically annotated, and 10 were matched to authentic standard spectra. An example of a NIST-identified significant result is shown in Figure 5.

Figure 5. Example results of a NIST-identified component
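Library matching of a deconvolved spectrum can be pictured as a similarity search. The scoring used by the NIST search or by AnalyzerPro is not described here, so the sketch below uses a plain cosine similarity between spectra purely as an illustration. The function names, library entries and m/z values are invented for the example.

```python
from math import sqrt


def cosine_match(query, reference):
    """Cosine similarity between two spectra given as {m/z: intensity}."""
    shared = set(query) & set(reference)
    dot = sum(query[mz] * reference[mz] for mz in shared)
    norm_q = sqrt(sum(v * v for v in query.values()))
    norm_r = sqrt(sum(v * v for v in reference.values()))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)


def best_match(query, library):
    """Return (name, score) of the highest-scoring library spectrum.

    `library` maps compound name -> spectrum and must not be empty.
    """
    scored = ((name, cosine_match(query, spec)) for name, spec in library.items())
    return max(scored, key=lambda pair: pair[1])


# Toy spectra with made-up peaks, not real library entries.
toy_library = {
    "compound_A": {73: 100.0, 147: 50.0, 217: 20.0},
    "compound_B": {57: 100.0, 71: 80.0},
}
unknown = {73: 95.0, 147: 55.0, 217: 18.0}
name, score = best_match(unknown, toy_library)  # name is "compound_A"
```

Real library searches typically weight peaks by m/z and intensity and combine the spectral score with retention index agreement, so a plain cosine score should be read as a simplification.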


AnalyzerPro is a vendor-neutral, elegant package for processing metabolomics data. The software is user-friendly and intuitive, with fast processing speed and multiple functions appropriate for interpreting data derived from metabolomics-based experiments.