Metabolomics and AnalyzerPro Mass Spectrometry

Metabolomics

What are the Benefits of AnalyzerPro?

AnalyzerPro is vendor neutral and can quickly convert any vendor’s profile or centroid data format. It combines fast data file processing (noise reduction, peak picking and deconvolution) with automated retention index (RI) calculation for chromatographic alignment, internal standard normalization, library generation based on the components in your samples, and library searching (e.g. NIST/EPA/NIH mass spectral libraries). Through this speed and automation, AnalyzerPro offers a complete pre- and post-processing solution for untargeted and/or targeted metabolomics workflows. AnalyzerPro has a user-friendly, intuitive interface with a convenient, adjustable sample display that shows the progress of your data processing sequence. With access to all information throughout processing, including the ability to view chromatograms, raw mass spectra and deconvoluted spectra, AnalyzerPro offers a processing workflow suitable for beginners and for advanced MS acquisition experiments alike.

The Metabolomics Workflow

Research Question

The most important part of any metabolomics study is the specific biological question that the study is designed to answer. This question should be used to define the overall study and should be objective, measurable, realistic and relevant. For example: is there a difference in the secreted metabolome of Salmonella grown as a biofilm or in a planktonic form? Or: which metabolites are increased in the bloodstream following inflammation? Without this biological question it may be impossible to make sense of the data or to design the study appropriately. It is not sufficient to simply collect as much data as possible and then attempt to explain observations.

Experimental Design

It may seem obvious, but the study should be designed to answer the biological question. Because we are dealing with biological processes, it is critical to have enough samples in the study to account for natural biological variation. The preference is for as many biological replicates as possible per treatment and control. The Metabolomics Society recommends five replicates, and the minimum number required for publication is three. The actual number of samples required for a given study will vary depending on a number of factors. For example, human studies can be particularly problematic because of the variation between human subjects, and as such require the greatest number of samples.

Experimental design also includes how the MS data will be acquired. Enough sample must be collected to allow for pooled biological replicates. Preferably, sampling should be complete before acquisition commences so that the acquisition order can be fully randomized or suitably stratified. If sample preparation is expected to take a long time, it is worth randomizing the preparation order as well. The pooled QC samples are important not only for instrument equilibration and drift correction but also for processing in AnalyzerPro. Pooled QCs are useful for target component library generation, RI calculation and metabolite identification, as they are essentially a representation of your entire sample set.
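As an illustration of a randomized acquisition order with pooled QC injections, here is a minimal Python sketch. It is not part of AnalyzerPro or RemoteAnalyzer; the function name, the sample labels, the `QC_pool` label and the choice of three conditioning injections and one QC every five samples are all assumptions made for the example.

```python
import random

def build_run_order(samples, qc_every=5, n_conditioning_qc=3, seed=0):
    """Return a randomized injection sequence with pooled-QC injections interleaved.

    All parameter values are illustrative assumptions, not recommendations.
    """
    rng = random.Random(seed)          # fixed seed so the sequence can be reproduced
    shuffled = list(samples)
    rng.shuffle(shuffled)              # full randomization of the study samples

    # Start with a few pooled-QC injections to equilibrate the instrument.
    order = ["QC_pool"] * n_conditioning_qc
    for i, sample in enumerate(shuffled, start=1):
        order.append(sample)
        if i % qc_every == 0:          # periodic QC to monitor and correct drift
            order.append("QC_pool")
    if order[-1] != "QC_pool":
        order.append("QC_pool")        # always finish on a QC to bracket the run
    return order

# Hypothetical study: five biological replicates each of control and treated.
samples = [f"{group}_{rep}" for group in ("control", "treated") for rep in range(1, 6)]
run_order = build_run_order(samples)
for position, name in enumerate(run_order, start=1):
    print(position, name)
```

A stratified design could be sketched the same way by shuffling within blocks that each contain one sample per group, rather than shuffling the whole list at once.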

In your MS experimental design, include sample collection and preparation blanks so that artefacts of sample preparation can be eliminated from your final feature list.
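One simple way to use the blanks is to drop any feature whose signal in the samples is not clearly above its signal in the blanks. The sketch below assumes a threshold of three times the mean blank area; that threshold, the function name and the feature names are hypothetical and chosen only for illustration.

```python
def remove_blank_features(sample_areas, blank_areas, min_ratio=3.0):
    """Keep features whose mean sample area is at least min_ratio times the mean blank area.

    sample_areas / blank_areas: dict mapping feature name -> list of peak areas.
    The default ratio of 3 is an assumption for this example.
    """
    kept = {}
    for feature, areas in sample_areas.items():
        sample_mean = sum(areas) / len(areas)
        blanks = blank_areas.get(feature, [])
        blank_mean = sum(blanks) / len(blanks) if blanks else 0.0
        # A feature absent from the blanks is kept; otherwise compare the means.
        if blank_mean == 0.0 or sample_mean / blank_mean >= min_ratio:
            kept[feature] = areas
    return kept

# Made-up peak areas for three features.
sample_areas = {
    "F001": [1200.0, 1100.0, 1300.0],
    "F002_siloxane": [500.0, 520.0, 480.0],
    "F003": [900.0, 950.0, 1000.0],
}
blank_areas = {
    "F001": [0.0, 0.0],
    "F002_siloxane": [450.0, 470.0],   # nearly as intense in the blank: an artefact
    "F003": [100.0, 120.0],
}
filtered = remove_blank_features(sample_areas, blank_areas)
print(sorted(filtered))  # ['F001', 'F003']
```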

Sample Collection

Optimal sample collection methods are not always achievable; nevertheless, samples must be collected in a consistent manner. If the samples contain cells that are still metabolically active, this metabolism must be quenched, usually by snap-freezing or with cold solvents. Samples should be frozen and kept at as low a temperature as possible. However, there is no consensus as to how long samples remain viable for metabolomics studies, even when frozen at -80°C. In general, metabolomics studies are more time sensitive than the other ‘omics’ (proteomics and genomics).

Sample Preparation

The sample preparation required will vary depending on the analytical technique(s) chosen. Most methods require an extraction step to remove proteins and, if necessary, break open intact cells. The extraction stage can also be used to concentrate the analytes of interest and introduce a suitable internal standard. Analysis by GC-MS requires that the analytes are volatile, which may require the preparation of suitable derivatives. Sample preparation should also take into account the requirements for long-term sample storage, bearing in mind that samples are usually best kept as frozen, dried extracts.

Acquisition

When planning the frequency and positioning of replicates, it is important to consider how long samples are likely to remain viable as well as how frequently the analytical instrument requires maintenance and calibration. Depending on the design of the metabolomics study, the MS data may be acquired in full scan, SIM or MRM mode. Data from multiple vendors can be imported into AnalyzerPro for data processing. RemoteAnalyzer® can be used to simplify and unify the interface for submitting samples for acquisition, particularly when multiple vendors’ instruments are available.

Data File Conversion, Deconvolution and Feature Detection

Data from the analytical instrument needs to be processed before it can be analysed and interpreted. Our first step in data processing is called ‘deconvolution’, which is defined as ‘reversing the effects of convolution’. For chromatographic data the biggest problem is co-elution of chromatographic peaks: many components will not be completely separated by the column. Using AnalyzerPro we can determine which masses belong to which of the co-eluting peaks.

The deconvolution step also allows the creation of target component lists. For comprehensive data analysis, AnalyzerPro offers automated library building, which includes all the components within an entire data set. This differs from a traditional target component list generated from standards in that the components do not need to be identified until later in the analysis. At this stage, it is only important that they are there. The list produced by automated library building will be comprehensive, and therefore very large. As such, there is a fundamental need for a detailed level of quality control. The upside is that your data set can be completely defined by the component list generated during the automated library building stage.

Following deconvolution, each component found can be assigned an identity. This identity does not have to be absolute, and subsequent analysis can be performed on identified components or by treating each component as an unknown. Components can be identified from general mass spectral libraries or from a specific target component library. A target component library is usually based on standards that have been run on in-house instruments, as this provides more confident identification. Absolute identity of all components is not required at this stage, but where a component of interest is found (i.e. where a difference is observed), that component should be investigated further to confirm its identity.
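To make the idea of assigning masses to co-eluting peaks concrete, the toy Python sketch below groups m/z channels by the scan at which each extracted-ion chromatogram reaches its maximum. This is a deliberately simplified illustration on synthetic, noise-free data and is not the algorithm used by AnalyzerPro; the function name, the m/z values and the peak shapes are all assumptions.

```python
import numpy as np

def group_ions_by_apex(intensity, mz_values, tolerance_scans=1):
    """Group m/z channels whose ion chromatograms peak at (nearly) the same scan.

    intensity: 2-D array of shape (n_scans, n_mz), one column per m/z channel.
    Returns a list of (apex_scan, [m/z, ...]) tuples, one per putative component.
    """
    apex_scans = np.argmax(intensity, axis=0)      # apex scan of every ion trace
    components = []                                # each entry: [apex scan, list of m/z]
    for idx in np.argsort(apex_scans, kind="stable"):
        apex = int(apex_scans[idx])
        if components and apex - components[-1][0] <= tolerance_scans:
            components[-1][1].append(mz_values[int(idx)])   # same component
        else:
            components.append([apex, [mz_values[int(idx)]]])  # new component
    return [(apex, sorted(mzs)) for apex, mzs in components]

# Synthetic data: two heavily overlapping Gaussian peaks, four scans apart.
scans = np.arange(60)

def gaussian(center, sigma=3.0):
    return np.exp(-0.5 * ((scans - center) / sigma) ** 2)

mz_values = [147, 217, 305, 117, 204, 319]   # hypothetical fragment ions
centers = [28, 28, 28, 32, 32, 32]           # first three ions elute at scan 28
heights = [1000, 400, 150, 800, 600, 90]
intensity = np.column_stack([h * gaussian(c) for h, c in zip(heights, centers)])

components = group_ions_by_apex(intensity, mz_values)
print(components)  # [(28, [147, 217, 305]), (32, [117, 204, 319])]
```

Real data adds noise, ions shared between components and varying peak shapes, which is why a dedicated deconvolution tool and the quality control described above are needed.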

Statistical Analysis

The most common way to make sense of large metabolomics data sets is unsupervised multivariate statistical analysis. Principal Component Analysis (PCA) models the data and shows which metabolites (components) contribute most to the observed variance. It also allows the identification of outliers that are likely to skew the variance of a particular component. PCA points you in the direction needed for interpretation, but it is only one part of the data analysis workflow. The next step is to scrutinize those components contributing most to the variance. It is important to differentiate between components that are artefacts or interferences of no biological interest (which can legitimately be removed or have their variance accounted for) and those that cause variation in the data sets and require further investigation.
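The PCA step can be sketched outside of any particular software package. The Python example below uses NumPy's singular value decomposition on a small made-up table of peak areas; the autoscaling (mean-centering and unit-variance scaling) is one common choice and an assumption here, as are the function name and the data.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD on autoscaled data.

    X: array of shape (n_samples, n_features), e.g. samples x component areas.
    Returns (scores, loadings, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0                       # avoid dividing by zero for constant features
    scaled = centered / std
    U, S, Vt = np.linalg.svd(scaled, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # feature contributions
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:n_components]

# Made-up areas: rows are samples (3 control, 3 treated), columns are components.
# The first two columns are constructed to differ between the groups.
X = np.array([
    [10.0,  5.0, 3.0, 7.0],
    [11.0,  5.5, 3.2, 6.8],
    [10.5,  5.2, 2.9, 7.1],
    [20.0, 10.0, 3.1, 6.9],
    [21.0, 10.4, 3.0, 7.2],
    [20.5, 10.1, 3.3, 7.0],
])
scores, loadings, explained = pca(X, n_components=2)

# Rank features by the magnitude of their loading on the first principal component.
ranking = np.argsort(-np.abs(loadings[:, 0]))
print("explained variance ratio:", explained)
print("features ranked by |PC1 loading|:", ranking)
```

Plotting the first two columns of `scores` gives the familiar scores plot for spotting groupings and outliers, while `ranking` indicates which components to scrutinize first.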

Interpretation of Results

At this point in the workflow, can we answer the biological question that was posed at the beginning of the study? Is there now a new question? In either case, following this metabolomics workflow has moved our understanding forward. Subsequent data analysis can involve mapping metabolites to pathways and determining trends in specific metabolites. We may find that we have answered our original question and, at the same time, generated many others. In that situation, the process of metabolomics research begins again, starting with a new biological question and, if necessary, revisiting the original data set. This cycle is customary practice in the analysis of metabolomics data sets.

You may have identified a biomarker, or a handful of biomarkers, relevant to your research question that now require validation. If the goal is to use these markers for quantitation and subsequent screening in populations, AnalyzerPro can be used for quantitative data processing and can perform qualitative and quantitative processing simultaneously.

Conclusion

AnalyzerPro is a key element in this metabolomics workflow. The targeted and untargeted approach of the peak picking and deconvolution functionality is applicable to large data sets from multiple vendors’ instruments. The deconvolution of complex chromatographic separations is equally applicable to GC-MS and LC-MS analyses.
