Untargeted Metabolomics for Marine Pollution Analysis Using GC–MS

June 18, 2015

Marine polychaetes are a common type of annelid worm widely spread in marine environments. Raquel Fernandez, University of Copenhagen, spoke to LCGC about her innovative approach to developing an untargeted method to monitor polychaetes and assess their potential use in environmental monitoring of oil spills.

 

You recently developed a gas chromatography–mass spectrometry (GC–MS) metabolomics platform to investigate marine polychaetes. Why are you investigating marine polychaetes?

The idea arose after my PhD, which dealt with monitoring and characterizing the oil spill that resulted from the Prestige oil tanker disaster off the coast of Galicia (North-West Spain) in 2002. I then applied for an IEF Marie Curie Fellowship with Prof. J. H. Christensen at the University of Copenhagen (Copenhagen, Denmark).

One of the requirements of the Fellowship was that the project had to be innovative, and despite the fact that marine polychaetes are ubiquitous in the marine environment, no studies had been performed to investigate the potential of these organisms for biological monitoring. My objective was to examine how hydrocarbon pollution and other environmental stressors control the metabolic response at the molecular level and to evaluate the use of these responses as pollution indicators. If a response could be determined, the limited mobility of marine polychaetes and their tolerance to polluted conditions would make them excellent candidates for biological monitoring.

 

What were the limitations of existing analytical tools when developing this method?

The main limitation was that there was no analytical platform that could cover all the compounds. This was, and still is, especially true for untargeted studies in which the interest can span multiple classes of metabolites. When the study began, we looked at the existing analytical platforms and, after an initial screening, we first thought that ultrahigh-performance liquid chromatography–mass spectrometry (UHPLC–MS) would be the best candidate for our purposes.

However, the data analysis and the preprocessing turned out to be especially problematic: the retention time shift patterns, combined with the sheer number of samples, were too complex to be adequately corrected using the bioinformatics tools we had at the time. Hence, we decided to move from liquid chromatography (LC) to gas chromatography (GC), because GC tends to be more stable in terms of retention time. This implied that the separation criteria and the class of compounds of primary interest also shifted from more water-soluble to more hydrophobic compounds. Because GC is unsuitable for nonvolatile or thermally unstable compounds, we also needed a suitable derivatization scheme before we could proceed with the GC analysis.

 

What is novel about your approach and what challenges did you have to overcome from an analytical perspective?

The most novel aspect is that we presented an entire method for environmental metabolomics in marine worms, from sample preparation and extraction to chemical analysis, data preprocessing, and multivariate methods. We included all the steps needed to perform a good untargeted analysis. One of the biggest challenges we had to face, if not the biggest, was the optimization of the parameters for the preprocessing using XCMS. XCMS is free, open-source software for processing metabolomics data; it includes feature detection, retention time alignment, and feature matching across samples to generate a consensus table of features common to a majority of the samples. The package is mature and has been used successfully in many metabolomics studies. It was initially developed to preprocess LC–MS data and, to our knowledge, it is still used predominantly for this purpose. However, with some optimization of the parameters, it can be used successfully for GC–MS-based metabolomics data, as we show in our approach.

We worked for days testing the different algorithms available in XCMS for feature detection, peak matching, and retention time alignment, trying different combinations of parameters, and we even developed a number of additional in-house software tools and diagnostics to check the results. Yet we could not find a combination that worked for all the samples at once and satisfied us entirely. In fact, the difficulties in preprocessing were the main reason why we shifted to GC in the first place.
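For readers unfamiliar with feature matching, the idea of building a consensus table of features "common to a majority of the samples" can be illustrated with a minimal Python sketch. This is not the algorithm XCMS uses, and it is not the authors' code: the greedy grouping rule, the function name `match_features`, the tolerances (`mz_tol`, `rt_tol`), and the peak lists are all hypothetical and chosen only for illustration.

```python
def match_features(samples, mz_tol=0.5, rt_tol=2.0, min_fraction=0.5):
    """Greedily group peaks across samples by m/z and retention time.

    samples: dict mapping sample name -> list of (mz, rt_seconds) peaks.
    Returns the groups found in more than min_fraction of the samples.
    The tolerances are assumed values, not recommendations.
    """
    groups = []  # each group: centroid m/z, centroid rt, one peak per sample
    for name, peaks in samples.items():
        for mz, rt in peaks:
            for g in groups:
                close = abs(g["mz"] - mz) <= mz_tol and abs(g["rt"] - rt) <= rt_tol
                if close and name not in g["members"]:
                    g["members"][name] = (mz, rt)
                    n = len(g["members"])
                    # Update the centroid so later peaks compare to the group mean.
                    g["mz"] = sum(p[0] for p in g["members"].values()) / n
                    g["rt"] = sum(p[1] for p in g["members"].values()) / n
                    break
            else:
                # No compatible group exists yet, so this peak starts a new one.
                groups.append({"mz": mz, "rt": rt, "members": {name: (mz, rt)}})
    needed = min_fraction * len(samples)
    return [g for g in groups if len(g["members"]) > needed]


# Hypothetical peak lists (m/z, retention time in seconds) for three samples.
samples = {
    "sample_1": [(73.0, 300.0), (147.0, 420.0)],
    "sample_2": [(73.0, 301.2), (147.0, 419.5)],
    "sample_3": [(73.0, 300.6), (205.0, 500.0)],
}
consensus = match_features(samples)
for g in consensus:
    print(round(g["mz"], 1), round(g["rt"], 1), sorted(g["members"]))
```

In this toy data the peak at m/z 205 appears in only one of three samples and is therefore dropped from the consensus table. Real data need retention time alignment first, which is exactly the step the interview describes as hardest to tune.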

 

 

 

What were your main findings?

The main finding was that a specific metabolite pattern variation in marine polychaetes could be correlated with oil exposure, meaning that metabolite patterns could serve as oil pollution indicators in the future. However, no biological variation was taken into account in this study, and only the remaining data we collected during the project could shed some light on this. Another result that I think is really important, though it might not appear so at first glance, is that we understood the importance of incorporating QC samples (for example, a pooled sample) not only for the chemical analysis, but also for the quality of the preprocessing: to reduce analytical variation and to quantitatively determine analytical precision.
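One common way to quantify analytical precision from pooled QC injections is the relative standard deviation (RSD) of each feature across the QC samples. The sketch below is an illustration under stated assumptions, not the procedure used in the study: the 30% cut-off, the function names, and the peak areas are all hypothetical.

```python
from statistics import mean, stdev


def qc_rsd(qc_intensities):
    """Relative standard deviation (%) of one feature across QC injections."""
    m = mean(qc_intensities)
    if m == 0:
        return float("inf")  # a feature with zero mean signal cannot be assessed
    return 100.0 * stdev(qc_intensities) / m


def filter_by_qc_precision(feature_table, max_rsd=30.0):
    """Keep features whose QC RSD is at or below max_rsd (assumed threshold)."""
    return {
        name: values
        for name, values in feature_table.items()
        if qc_rsd(values) <= max_rsd
    }


# Hypothetical peak areas of two features in four pooled QC injections.
qc_table = {
    "feature_A": [1000.0, 1020.0, 980.0, 1010.0],  # stable across injections
    "feature_B": [500.0, 900.0, 200.0, 650.0],     # poorly reproducible
}
kept = filter_by_qc_precision(qc_table)
print(sorted(kept))
```

Features that vary widely across replicate injections of the same pooled sample carry mostly analytical noise, so removing them before multivariate analysis is one way to reduce analytical variation.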

Finally, we showed how the normalization of the analytical profiles is essential to the proper profiling of metabolic data; we compared three different approaches to investigate which one is the most suitable for our particular case.
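The interview does not name the three normalization approaches that were compared. Purely as an illustration of what profile normalization does, the following Python sketch shows two widely used options, total-sum normalization and probabilistic quotient normalization (PQN). Whether either was among the approaches tested is an assumption, and the function names and data are hypothetical.

```python
from statistics import median


def total_sum_normalize(profile):
    """Scale one sample's feature intensities so that they sum to 1."""
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must have a positive total intensity")
    return [x / total for x in profile]


def pqn_normalize(profiles):
    """Probabilistic quotient normalization against a median reference profile."""
    scaled = [total_sum_normalize(p) for p in profiles]
    # Reference profile: feature-wise median across all samples.
    reference = [median(column) for column in zip(*scaled)]
    normalized = []
    for p in scaled:
        # Quotient of each feature to the reference; the median quotient
        # estimates the overall dilution factor of this sample.
        quotients = [x / r for x, r in zip(p, reference) if r > 0]
        factor = median(quotients)
        normalized.append([x / factor for x in p])
    return normalized


# Hypothetical profiles: sample 2 is a twofold concentrated copy of sample 1,
# and sample 3 has one strongly elevated feature.
profiles = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 60.0],
    [5.0, 10.0, 45.0],
]
tsn = [total_sum_normalize(p) for p in profiles]
pqn = pqn_normalize(profiles)
print([round(x, 3) for x in tsn[2]])
print([round(x, 3) for x in pqn[2]])
```

In the toy data, total-sum normalization makes the first two features of sample 3 look depleted because one large feature dominates the total, whereas PQN restores them to the level seen in the other samples. Differences of this kind are why the choice of normalization needs to be checked for each data set.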

 

Are there any specific problems encountered when developing an untargeted GC–MS method?

The main problem in untargeted GC–MS metabolomics is achieving a well-resolved chromatographic separation of a large number of metabolite species with different chemical and physical properties in a single analytical run before performing the data preprocessing. Also, the very heterogeneity of the data made the optimization and validation of the parameters for the data preprocessing especially difficult, yet essential to the whole method. And, again, it was only by using the QC samples that we achieved our results. The comparison of the XCMS results against manual integration of selected peaks belonging to different classes of compounds showed an excellent correlation. It is evident that, with an optimal choice of parameters, XCMS provides accurate feature extraction that is comparable with manual peak integration.
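A comparison of this kind can be expressed as a correlation between automatically extracted and manually integrated peak areas. The sketch below computes a Pearson correlation coefficient in plain Python. The interview does not state which statistic was used, so the choice of Pearson's r is an assumption, and the peak areas are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical peak areas for one compound across four samples.
automated_areas = [1050.0, 1980.0, 3100.0, 3950.0]  # from automated extraction
manual_areas = [1000.0, 2000.0, 3000.0, 4000.0]     # from manual integration
r = pearson_r(automated_areas, manual_areas)
print(round(r, 4))
```

A high correlation shows that the two sets of areas rise and fall together. It does not by itself rule out a constant bias, so plotting automated against manual areas is a useful complement.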

 

Is there any general advice you would offer to chromatographers attempting to develop untargeted methods?

One of the biggest issues in metabolomics is reproducibility. Often the technical variation is high, despite the fact that no biological variation could be observed in this study. We managed to show the need to include a large number of QC samples in each batch or sequence. Even though many have pointed this out, I believe it cannot be stressed enough! Another important point is not to underestimate the power and the effect of proper data preprocessing. In general, it can also be extremely time-consuming. For space reasons, it often gets little room in published works, and it often looks simpler and easier than it actually is: it is not "point-and-click"! It requires knowledge about your samples, well-reasoned choices, and awareness of the consequences it may have on the data analysis. In this respect, I think the third and most important point is this: always go back to the original data and double-check that what you get from the data analysis shows in the raw data.

 

Are you developing this project further?

I believe that the method we developed can be used in real-life studies. This study was limited by the experimental design, which did not allow us to distinguish biological variation. We have additional data where this aspect was taken into account; these data seem to show that we could separate marine polychaetes exposed to different levels of contamination. The data analysis, though, is not complete and we need to thoroughly validate the results.

The plan is to write these results up in one or two additional papers, but my current involvement in other projects in microbial metabolomics at Chr Hansen A/S (Denmark) makes it difficult to do so right now.

 

Anything else you would like to add?

Metabolomics is a multidisciplinary field: it requires a combination of experimental design, analytical chemistry, chemical and biological aspects of the metabolome, ecotoxicological aspects, data preprocessing, and multivariate analysis techniques. It can be fascinating and challenging at the same time, even overwhelming, yet it is a very promising field that can shed considerable light on many biological and environmental processes and help us unravel them.

 

 

Raquel Fernández Varela has an M.Sc. and a Ph.D. degree in Analytical Chemistry, both from University of A Corunna (Spain). She moved to Denmark in 2011 with a Marie Curie Intra European Fellowship to perform postdoctoral studies at the Department of Plant and Environmental Sciences, University of Copenhagen (Denmark). Her main research at that point was to examine how pollution and other environmental stressors control metabolite response in selected organisms at the molecular level. In September 2014 she moved to Chr Hansen A/S (Denmark), where she is currently working as a research scientist in microbial metabolomics. She has extensive experience in gas chromatography, mass spectrometry, applied multivariate methods, and environmental and microbial metabolomics.