An Automated Metabolomics Data Processing Workflow to Distinguish Metabolite Profiles

Hayley Abbiss¹,²,³, Joel P.A. Gummer²,⁴, Laura F. Grogan⁵,⁶, Robert D. Trengove²,⁴, John Moncur³, Scott J. Campbell³

¹School of Veterinary and Life Sciences, Murdoch University, WA, Australia; ²Separation Science and Metabolomics Laboratory, Murdoch University, WA, Australia; ³SpectralWorks, Cheshire, United Kingdom; ⁴Metabolomics Australia, Western Australian Node, WA, Australia; ⁵One Health Research Group, College of Public Health, Medical and Veterinary Sciences, James Cook University, QLD, Australia; ⁶Griffith Wildlife Disease Ecology Group, Environmental Futures Research Institute, School of Environment, Griffith University, QLD, Australia

First Published: BMSS 2016, Eastbourne 2016

Introduction

Metabolomics is the study of metabolites in cells, tissues and/or biofluids. These sample types are complex mixtures, requiring complementary tools to capture a global profile. Fast, automated software solutions are required to handle the resulting large data sets for 1) interrogation of data quality and 2) interpretation of experimental data. AnalyzerPro® is a vendor-neutral chromatographic deconvolution package with qualitative and quantitative processing capability. It includes multiple data interrogation and interpretation features, such as principal component analysis (with options for data transformation), univariate statistical analysis and graphical representations of retention time and signal drift.

Methods

Using a test set of biological samples spiked with authentic standards, SpectralWorks’ AnalyzerPro was evaluated as a metabolomics data processing workflow, as shown in Figure 1.

Figure 1. A typical metabolomics workflow using AnalyzerPro

Results

Shimadzu GC-MS data was used. Each full-scan file was approximately 80 MB and took < 2 min to process, including peak picking, deconvolution, retention index (RI) calculation, library matching and data matrix generation.
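The RI calculation step can be illustrated with a minimal sketch. The source does not state which formula AnalyzerPro uses, so this assumes the linear (van den Dool and Kratz) retention index commonly applied to temperature-programmed GC. The function name and the n-alkane ladder values are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear (temperature-programmed) retention index for one peak.

    rt          -- retention time of the peak (same unit as the ladder)
    alkane_rts  -- dict mapping n-alkane carbon number -> retention time
    """
    # Sort the ladder by retention time so bracketing alkanes can be found.
    ladder = sorted(alkane_rts.items(), key=lambda item: item[1])
    times = [t for _, t in ladder]

    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")

    # Index of the alkane eluting at or just before the peak.
    i = bisect_right(times, rt) - 1
    if i == len(ladder) - 1:
        # Peak co-elutes with the last alkane in the ladder.
        return 100.0 * ladder[i][0]

    (n_lo, t_lo), (n_hi, t_hi) = ladder[i], ladder[i + 1]
    # Interpolate linearly between the two bracketing alkanes.
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))


# Hypothetical alkane ladder (carbon number -> retention time in minutes).
ALKANES = {10: 5.0, 12: 8.0, 14: 11.0, 16: 14.0}
ri = linear_retention_index(9.5, ALKANES)
```

A peak eluting midway between C12 and C14 in this hypothetical ladder is assigned the index of a notional C13 alkane, which is what makes RI values comparable across instruments and runs when matching against a library.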

Figure 2. Data Processing

Figure 3. Correction for signal drift, review RT reproducibility

Principal component analysis (PCA) and univariate statistics are calculated on the fly: each time data is transformed or thresholds are applied (i.e. for p-value and fold-change), PCA is recalculated and the matrix is updated accordingly. The matrix can be exported and saved at any time during the analysis.
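To show why PCA has to be recalculated after a transformation, here is a minimal sketch using NumPy. It is not AnalyzerPro's implementation. The log2(x + 1) transformation, the function name and the small data matrix are all assumptions chosen for illustration.

```python
import numpy as np


def pca_scores(matrix, n_components=2, log_transform=True):
    """PCA of a samples-by-components intensity matrix via SVD.

    Returns (scores, explained_variance_ratio).
    """
    x = np.asarray(matrix, dtype=float)
    if log_transform:
        # log2(x + 1) is one common choice; the offset avoids log(0).
        x = np.log2(x + 1.0)
    # Mean-centre each component (column) before decomposition.
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u * s                       # sample coordinates on each PC
    explained = s**2 / np.sum(s**2)      # fraction of variance per PC
    return scores[:, :n_components], explained[:n_components]


# Hypothetical data matrix: 6 samples (rows) x 4 components (columns).
# Rows 0-2 belong to one sample group, rows 3-5 to another.
DATA = np.array([
    [1000, 200, 50, 400],
    [1100, 210, 55, 380],
    [950, 190, 45, 420],
    [300, 800, 52, 410],
    [320, 850, 48, 390],
    [280, 780, 51, 405],
])

raw_scores, raw_var = pca_scores(DATA, log_transform=False)
log_scores, log_var = pca_scores(DATA, log_transform=True)
```

Because the transformation changes the relative weight of high- and low-abundance components, the scores and explained variance generally differ between the raw and transformed matrices, so the decomposition is rerun rather than reused.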

Figure 4. Results view

There was significant signal drift (P = 0.036, Figure 3). With signal correction, 48 components were significant; without signal correction, 33 components were significantly different (P < 0.05) between sample groups. Of these 33, 14 were unidentified, 9 were matched to the NIST library and automatically annotated, and 10 were matched to authentic standard spectra. An example of a NIST-identified significant result is shown in Figure 5.
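The source does not describe how signal drift is modelled or corrected. As a rough sketch only, one simple approach is to fit a straight line to a reference signal against injection order and rescale each injection to the mean fitted level. The linear model, the function names and the reference peak areas below are assumptions, not AnalyzerPro's method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def drift_factors(injection_order, reference_signal):
    """Per-injection correction factors from a linear drift model.

    A factor above 1 scales up injections where the fitted reference
    signal has fallen below its mean fitted level.
    """
    slope, intercept = fit_line(injection_order, reference_signal)
    fitted = [slope * x + intercept for x in injection_order]
    target = sum(fitted) / len(fitted)
    return [target / f for f in fitted]


# Hypothetical reference peak areas that decline steadily over six injections.
ORDER = [1, 2, 3, 4, 5, 6]
REFERENCE = [1000.0, 980.0, 960.0, 940.0, 920.0, 900.0]

factors = drift_factors(ORDER, REFERENCE)
corrected = [r * f for r, f in zip(REFERENCE, factors)]
```

In such a scheme the factor for each injection would be applied to every component measured in that injection before the univariate statistics are recalculated, which is one way the number of significant components can change between corrected and uncorrected data.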

Figure 5. Example results of a NIST identified component

Conclusion

AnalyzerPro is a vendor-neutral, elegant package for processing metabolomics data. The software is user-friendly and intuitive, with fast processing speed and multiple functions appropriate for the interpretation of data derived from metabolomics-based experiments.