R: mzmatch.ipeak.sort.MetAssign

mzmatch.ipeak.sort.MetAssign {mzmatch.R}

R Documentation

mzmatch.ipeak.sort.MetAssign

Description

Usage

mzmatch.ipeak.sort.MetAssign(JHeapSize=1425, i, o, basepeaks, ppm, p1, p0, alpha, alpha0, alpha1, numDraws, burnIn, databases, adducts, minDistributionValue, maxValues, rho, retentionTimeSD, retentionTimePredSD, identificationPeaks, filterPPM, rtClustering, seed, debug, debugOut, dbIdentOut, test, h, v)

Arguments

`JHeapSize`	Amount of memory, in megabytes, to reserve for the Java virtual machine. The default value is 1425 megabytes.
`i`	Option for the input file. The only allowed format is PeakML; when it is not set, the input is read from standard input. The contents of the file are required to be peaksets (the result of Combine), as this tool uses the full information of peaksets to identify related peaks.
`o`	Optional filename where the output is written. If this is not set the output is written to the standard output.
`basepeaks`	Optional filename where the basepeaks are written. If this is not set the file with the basepeaks is not written.
`ppm`	This should be set to parts-per-million accuracy of the MS equipment. The probability distributions over the theoretical peaks have been defined so that 95
`p1`	Parameter for mixture model clustering.
`p0`	Parameter for mixture model clustering.
`alpha`	Parameter for mixture model clustering.
`alpha0`	Parameter for BetaBeta mixture model clustering.
`alpha1`	Parameter for BetaBeta mixture model clustering.
`numDraws`	This parameter says how many posterior samples to take. Generally 200 samples should be sufficient for most uses. For data with very large numbers of peaks, more samples might be needed, perhaps up to 500.
`burnIn`	This parameter says how many initial samples should be discarded before saving posterior samples. For larger datasets, it is recommended to set this to 200.
`databases`	Parameter for optional compound databases to match against during mixture model clustering.
`adducts`	This is a comma separated list of adduct types that will be used in the generation of theoretical peaks. Whilst an exhaustive list can be provided, it is better to stick with those adducts that are known to be generated, as spurious adducts can generate more false positives.
`minDistributionValue`	The minimum probability mass that a mass needs in order to be kept in the distribution of the spectrum
`maxValues`	The maximum number of entries in a compound's spectrum
`rho`	The ratio of the maximum peak distribution to the baseline distribution
`retentionTimeSD`	This parameter describes the spread of LC/MS peaks that are generated by a single chromatographic peak and should be set so that the retention time of most of the peaks would be within 2 times this value, of the chromatographic retention time. This can vary widely because of difficulties in detecting accurate retention times from noisy peaks.
`retentionTimePredSD`	The spread of possible retention time values from a theoretical value
`identificationPeaks`	This parameter says to output compound identifications at support levels 1 to this number
`filterPPM`	This is an optimisation measure to speed up processing by ignoring peaks that are not closer than this value to a theoretical peak. Generally a value between 1.1 * ppm and 1.5 * ppm should be appropriate.
`rtClustering`	A flag to specify whether clustering using retention time should be used
`seed`	Random number generator seed
`debug`	Whether debugging information should be printed.
`debugOut`	Where debugging information is output to.
`dbIdentOut`	Where compound identification results are stored.
`test`	Options for testing purposes
`h`	When this is set, the help is shown.
`v`	When this is set, the progress is shown on the standard output.
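
The numeric guidance in the argument descriptions above (`filterPPM` between 1.1 * ppm and 1.5 * ppm, 200 posterior samples for most uses and up to 500 for very large data, a burn-in of 200 for larger datasets) can be collected in a small helper. The following is only a sketch: `metassign_settings`, `large_dataset` and `filter_factor` are hypothetical names that do not exist in mzmatch.R, and the file names in the commented call are placeholders.

```r
# Hypothetical helper, not part of mzmatch.R: derive a few MetAssign
# settings from the guidance given in the argument descriptions.
metassign_settings <- function(ppm, large_dataset = FALSE, filter_factor = 1.25) {
  stopifnot(is.numeric(ppm), length(ppm) == 1, ppm > 0)
  # filterPPM is suggested to lie between 1.1 * ppm and 1.5 * ppm.
  stopifnot(filter_factor >= 1.1, filter_factor <= 1.5)

  settings <- list(
    ppm       = ppm,
    filterPPM = filter_factor * ppm,
    # 200 samples for most uses, up to 500 for very large numbers of peaks.
    numDraws  = if (large_dataset) 500 else 200
  )
  # A burn-in of 200 is only recommended explicitly for larger datasets,
  # so it is left unset otherwise.
  if (large_dataset) {
    settings$burnIn <- 200
  }
  settings
}

settings <- metassign_settings(ppm = 3)
str(settings)

# Untested sketch of passing the settings on; the file names are placeholders,
# and further arguments (for example databases and adducts) would also be needed.
# library(mzmatch.R)
# do.call(mzmatch.ipeak.sort.MetAssign,
#         c(list(i = "combined.peakml", o = "identified.peakml"), settings))
```

Building the argument list first and passing it with `do.call` keeps the derived values (such as `filterPPM`) visibly tied to `ppm`, which may make it easier to change the instrument accuracy in one place.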

Details

LC-MS experiments yield large numbers of peaks, many of which correspond to derivatives of peaks of interest (e.g., isotope peaks, adducts, fragments, multiply charged molecules), termed here related peaks. This tool identifies peaks by grouping them together and assigning these groups to compounds that have been specified in a database. At the same time, the probability of the existence of each of the database entries is given. The results of the peak identification step are given as an annotation filteredIdentification on each peak of the output file. The results of the compound identification step are output

Value

This function returns no value.

Author(s)

Ronan Daly (Ronan.Daly@glasgow.ac.uk)

References

0. PeakML/mzMatch - a file format, Java library, R library and tool-chain for mass spectrometry data analysis. In preparation.