Metabolomics 2011 In Press, DOI 10.1007/s11306-011-0341-0

Separating the wheat from the chaff: a prioritisation pipeline for the analysis of metabolomics datasets


Andris Jankevics1,2, M. Elena Merlo1,3, Marcel de Vries4, Roel J. Vonk4, Eriko Takano3 and Rainer Breitling1,2*

1Groningen Bioinformatics Centre, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands.
2Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Joseph Black Building B3.10, G11 8QQ Glasgow, United Kingdom.
3Microbial Physiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands.
4Centre for Medical Biomics, University Medical Centre Groningen, 9713 AV Groningen, The Netherlands.
*Address correspondence to: Rainer Breitling at Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Joseph Black Building B3.10, G11 8QQ Glasgow, United Kingdom, e-mail: rainer.breitling@glasgow.ac.uk

Supplementary materials

Data files

Design of experiment (assigned sample names, taxonomy etc.). ASCII file (Tab-separated values)

RAW data in mzXML format:

Standards mixture: C18, HILIC
Biological sample: C18, HILIC

R code

R file illustrating the complete data processing pipeline. It includes the settings and parameters for centWave and mzMatch, as well as all other post-processing steps. Download

Peak tables

ASCII data tables (including signal intensity for all samples) and peakml data files for the annotated peaks. The PeakML Viewer can be used to explore the contents of the peakml files; it is available for download from the files menu, inside one of the folders. A description of the tool can be found here.

Identifiers within the ScoCyc database

Standards mixture, C18: tsv file, peakml file
Standards mixture, HILIC: tsv file, peakml file
Biological sample, C18: tsv file, peakml file
Biological sample, HILIC: tsv file, peakml file

Identifiers within the KEGG database

Standards mixture, C18: tsv file, peakml file
Standards mixture, HILIC: tsv file, peakml file
Biological sample, C18: tsv file, peakml file
Biological sample, HILIC: tsv file, peakml file

Legend for column names in the ASCII files:

Mass - measured accurate mass
RT - retention time
identification - putative identifications (comma separated for multiple values)
ppm - deviation of the measured mass from the mass of the predicted chemical formula, in parts per million
corr.l - correlation value obtained from the dilution filter
relation.ship - annotations for related peaks (e.g., isotope peaks, adducts, fragments, multiply charged molecules)
BPS - peaks annotated as base peaks
Number.of.isobaric.masses - number of peaks with the same mass but different retention times
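
As an illustration of how the columns described above might be consumed, the following is a minimal Python sketch, not part of the published R pipeline. It assumes the tsv files carry a header row with the column names from the legend; the function names `read_peak_table` and `ppm_delta`, the sign convention of the ppm value, and the example rows are hypothetical. Splitting on commas would also break any compound name that itself contains a comma, so treat the parsing as a starting point only.

```python
import csv
import io


def ppm_delta(measured_mass, theoretical_mass):
    """Mass deviation in parts per million.

    Assumption: measured minus theoretical, relative to the theoretical mass.
    The sign convention used in the supplementary tables is not stated.
    """
    return (measured_mass - theoretical_mass) / theoretical_mass * 1e6


def read_peak_table(handle):
    """Read a tab-separated peak table into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["Mass"] = float(row["Mass"])
        row["RT"] = float(row["RT"])
        # Multiple putative identifications are comma separated.
        row["identification"] = [
            name.strip()
            for name in row["identification"].split(",")
            if name.strip()
        ]
        rows.append(row)
    return rows


# Made-up example rows, not taken from the supplementary files.
example = (
    "Mass\tRT\tidentification\n"
    "100.0\t60.5\tcompound A, compound B\n"
    "200.0\t120.0\t\n"
)
peaks = read_peak_table(io.StringIO(example))
```

To read one of the downloaded tables instead of the in-memory example, an open file handle can be passed to `read_peak_table` in place of the `io.StringIO` object; columns not converted here (such as corr.l or BPS) are kept as strings.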
