MS-GF+

Usage: java -Xmx3500M -jar MSGFPlus.jar

-s SpectrumFile (*.mzML, *.mzXML, *.mgf, *.ms2, *.pkl or *_dta.txt)
   Spectra should be centroided (see below for MSConvert example). Profile spectra will be ignored.

-d DatabaseFile (*.fasta or *.fa or *.faa)

[-conf ConfigurationFile] (Configuration file path)
   Example parameter file is at https://github.com/MSGFPlus/msgfplus/blob/master/docs/examples/MSGFPlus_Params.txt
   Additional parameter files can be found at https://github.com/MSGFPlus/msgfplus/tree/master/docs/ParameterFiles

[-decoy DecoyPrefix] (Prefix for decoy protein names; Default: XXX)

[-o OutputFile (*.mzid)] (Default: [SpectrumFileName].mzid)

[-t PrecursorMassTolerance] (e.g. 2.5Da, 20ppm or 0.5Da,2.5Da; Default: 20ppm)
   Use a comma to define asymmetric values. 
   E.g. "-t 0.5Da,2.5Da" will set 0.5Da to the left (ObservedPepMass < TheoreticalPepMass) 
                              and 2.5Da to the right (ObservedPepMass > TheoreticalPepMass)

[-ti IsotopeErrorRange] (Range of allowed isotope peak errors; Default: 0,1)
   Takes into account the error introduced by choosing a non-monoisotopic peak for fragmentation.
   The combination of -t and -ti determines the precursor mass tolerance.
   E.g. "-t 20ppm -ti -1,2" tests abs(ObservedPepMass - TheoreticalPepMass - n * 1.00335Da) < 20ppm for n = -1, 0, 1, 2.
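The interaction between -t and -ti can be sketched in a few lines of Python. This is only an illustration of the test described above, not the MS-GF+ implementation: the function name and arguments are hypothetical, the ppm error is assumed to be taken relative to the theoretical mass, and the first matching isotope offset is returned.

```python
ISOTOPE_SPACING_DA = 1.00335  # mass step per isotope peak, as quoted in the -ti description


def within_precursor_tolerance(observed, theoretical, left_tol, right_tol,
                               unit="ppm", isotope_range=(0, 1)):
    """Return the first isotope offset n that satisfies the tolerance, else None."""
    lo, hi = isotope_range
    for n in range(lo, hi + 1):
        error = observed - theoretical - n * ISOTOPE_SPACING_DA
        if unit == "ppm":
            # Assumption: ppm is computed relative to the theoretical mass.
            error = error / theoretical * 1e6
        # The left tolerance applies when observed < theoretical, the right one otherwise.
        limit = left_tol if error < 0 else right_tol
        if abs(error) < limit:
            return n
    return None


# "-t 20ppm -ti -1,2": a precursor picked one isotope peak too high still matches (n = 1).
print(within_precursor_tolerance(2000.0 + 1.00335 + 0.01, 2000.0, 20, 20,
                                 unit="ppm", isotope_range=(-1, 2)))
# "-t 0.5Da,2.5Da": 2 Da too light is rejected, 2 Da too heavy is accepted.
print(within_precursor_tolerance(1998.0, 2000.0, 0.5, 2.5, unit="Da", isotope_range=(0, 0)))
print(within_precursor_tolerance(2002.0, 2000.0, 0.5, 2.5, unit="Da", isotope_range=(0, 0)))
```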

[-thread NumThreads] (Number of concurrent threads to be executed; Default: Number of available cores)

[-tasks NumTasks] (Override the number of tasks to use on the threads; Default: (internally calculated based on inputs))
   More tasks than threads will reduce the memory requirements of the search, but will be slower (how much depends on the inputs).
   1 <= tasks <= numThreads: will create one task per thread, which is the original behavior.
   tasks = 0: use default calculation - minimum of: (threads*3) and (numSpectra/250).
   tasks < 0: multiply number of threads by abs(tasks) to determine number of tasks (i.e., -2 means "2 * numThreads" tasks).
   One task per thread will use the most memory, but will usually finish the fastest.
   2-3 tasks per thread will use comparatively less memory, but may cause the search to take 1.5 to 2 times as long.
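The rules for -tasks can be written out as a small Python function. This is a sketch that follows the text above literally; the function name is hypothetical, integer division is assumed for numSpectra/250, and a value larger than the thread count is assumed to be used as given. The real implementation may clamp or round differently.

```python
def resolve_num_tasks(tasks, num_threads, num_spectra):
    """Translate a -tasks value into a task count, following the documented rules."""
    if tasks < 0:
        # Negative values are a multiplier on the thread count.
        return abs(tasks) * num_threads
    if tasks == 0:
        # Default calculation: minimum of (threads*3) and (numSpectra/250).
        return min(num_threads * 3, num_spectra // 250)
    if tasks <= num_threads:
        # 1 <= tasks <= numThreads: one task per thread.
        return num_threads
    # Assumption: more tasks than threads is taken at face value.
    return tasks


print(resolve_num_tasks(0, 8, 10000))   # min(24, 40)
print(resolve_num_tasks(-2, 8, 10000))  # 2 * 8
```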

[-verbose 0/1] (0: Report total progress only (Default), 1: Report total and per-thread progress/status)

[-tda 0/1] (0: Don't search decoy database (Default), 1: Search decoy database)

[-m FragmentMethodID] (0: As written in the spectrum or CID if no info (Default), 1: CID, 2: ETD, 3: HCD, 4: UVPD)

[-inst InstrumentID] (0: Low-res LCQ/LTQ (Default), 1: Orbitrap/FTICR/Lumos, 2: TOF, 3: Q-Exactive)

[-e EnzymeID] (0: Unspecific cleavage, 1: Trypsin (Default), 2: Chymotrypsin, 3: Lys-C, 4: Lys-N, 5: glutamyl endopeptidase, 6: Arg-C, 7: Asp-N, 8: alphaLP, 9: no cleavage)

[-protocol ProtocolID] (0: Automatic (Default), 1: Phosphorylation, 2: iTRAQ, 3: iTRAQPhospho, 4: TMT, 5: Standard)

[-ntt 0/1/2] (Number of Tolerable Termini; Default: 2)
   E.g. For trypsin, 0: non-tryptic, 1: semi-tryptic, 2: fully-tryptic peptides only.
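As an illustration of what -ntt counts, the following Python sketch tallies the tryptic termini of a peptide written in the flanking-residue notation used in the output example (e.g. K.PEPTIDER.A). The helper names are hypothetical, and the sketch makes simplifying assumptions: trypsin cleaves after K or R with no proline exception, and "-" marks a protein terminus. A peptide is then eligible for a search when its count is at least the -ntt value.

```python
TRYPSIN_RESIDUES = {"K", "R"}  # assumption: cleavage after K or R, no proline rule


def count_tryptic_termini(annotated_peptide):
    """Count enzymatic termini for a peptide written as 'X.PEPTIDE.Y'."""
    prev_res, rest = annotated_peptide.split(".", 1)
    peptide, next_res = rest.rsplit(".", 1)
    # Ignore modification masses such as +15.995 when finding the last residue.
    last_res = [c for c in peptide if c.isalpha()][-1]
    n_term_ok = prev_res == "-" or prev_res in TRYPSIN_RESIDUES
    c_term_ok = next_res == "-" or last_res in TRYPSIN_RESIDUES
    return int(n_term_ok) + int(c_term_ok)


def passes_ntt(annotated_peptide, ntt):
    """True if the peptide has at least `ntt` tryptic termini."""
    return count_tryptic_termini(annotated_peptide) >= ntt


print(count_tryptic_termini("K.NLANPTSVILASIQM+15.995LEYLGMADK.A"))  # fully tryptic
print(passes_ntt("A.PEPTIDEK.T", 2))  # semi-tryptic, rejected when -ntt 2
```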

[-mod ModificationFileName] (Modification file; Default when -mod is not specified: standard amino acids with fixed C+57)

[-minLength MinPepLength] (Minimum peptide length to consider; Default: 6)

[-maxLength MaxPepLength] (Maximum peptide length to consider; Default: 40)

[-minCharge MinCharge] (Minimum precursor charge to consider if charges are not specified in the spectrum file; Default: 2)

[-maxCharge MaxCharge] (Maximum precursor charge to consider if charges are not specified in the spectrum file; Default: 3)

[-n NumMatchesPerSpec] (Number of matches per spectrum to be reported; Default: 1)

[-addFeatures 0/1] (0: Output basic scores only (Default), 1: Output additional features)

[-ccm ChargeCarrierMass] (Mass of charge carrier; Default: mass of proton (1.00727649))
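To show where the charge carrier mass enters, here is a minimal Python sketch of the standard conversion from precursor m/z to neutral mass. The function name is hypothetical, and the formula is the usual textbook relation rather than code taken from MS-GF+.

```python
PROTON_MASS = 1.00727649  # default charge carrier mass listed for -ccm


def neutral_mass(precursor_mz, charge, charge_carrier_mass=PROTON_MASS):
    """Neutral mass implied by an observed m/z and charge state."""
    return (precursor_mz - charge_carrier_mass) * charge


# Precursor m/z 1285.3457 at charge 3, with the default proton mass.
print(round(neutral_mass(1285.3457, 3), 4))
```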

[-ignoreMetCleavage 0/1] (N-terminal methionine cleavage behavior; Default: 0)

[-maxMissedCleavages Count] (Exclude peptides with more than this number of missed cleavages from the search; Default: -1 (no limit))

[-numMods Count] (Maximum number of dynamic (variable) modifications per peptide; Default: 3)

[-allowDenseCentroidedPeaks 0/1] (Default: 0 (disabled); 1: (for mzML/mzXML input only) allows inclusion of spectra with high-density centroid data in the search)
   MS-GF+ checks the distance between consecutive peaks in each spectrum; if the median distance is less than 50 ppm, the spectrum is treated as a profile spectrum regardless of the value provided in the mzML or mzXML file.
   This parameter allows overriding this check when the mzML/mzXML file says the spectrum is centroided.
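The peak-density check can be approximated as follows. This Python sketch is an assumption-laden illustration, not the MS-GF+ code: the function name is hypothetical, and each gap is expressed in ppm relative to the lower of the two neighboring peaks.

```python
from statistics import median


def looks_like_profile(mz_values, threshold_ppm=50.0):
    """True if the median gap between consecutive peaks is below the threshold."""
    mzs = sorted(mz_values)
    if len(mzs) < 2:
        return False  # too few peaks to measure a spacing
    # Assumption: gap in ppm is measured relative to the lower m/z of each pair.
    gaps_ppm = [(b - a) / a * 1e6 for a, b in zip(mzs, mzs[1:])]
    return median(gaps_ppm) < threshold_ppm


print(looks_like_profile([500.00, 500.01, 500.02, 500.03]))  # about 20 ppm spacing
print(looks_like_profile([200.0, 350.5, 500.2]))             # widely spaced peaks
```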
      

Examples:

Example command (using a parameter file):

java -Xmx3500M -jar MSGFPlus.jar -s Dataset.mzML -d ProteinList.fasta -conf MSGFPlus_PartTryp_MetOx_20ppmParTol.txt

Example command (high-precision spectra, using arguments):

java -Xmx3500M -jar MSGFPlus.jar -s Dataset.mzML -d IPI_human_3.79.fasta -inst 1 -t 20ppm -ti -1,2 -ntt 2 -tda 1 -o PSMs.mzid

Example command (low-precision spectra):

java -Xmx3500M -jar MSGFPlus.jar -s Dataset.mzML -d IPI_human_3.79.fasta -inst 0 -t 0.5Da,2.5Da -ntt 2 -tda 1 -o PSMs.mzid
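When searches are launched from a script, the same command lines can be assembled programmatically. The Python sketch below uses a hypothetical helper, build_msgfplus_command, that only concatenates the flags documented above; it does not validate them, and the jar path and heap size are placeholders taken from the examples.

```python
import shlex


def build_msgfplus_command(spectrum_file, database_file,
                           jar="MSGFPlus.jar", max_heap="3500M", **options):
    """Build an argument list; each keyword becomes '-name value'."""
    cmd = ["java", f"-Xmx{max_heap}", "-jar", jar,
           "-s", spectrum_file, "-d", database_file]
    for flag, value in options.items():
        cmd += [f"-{flag}", str(value)]
    return cmd


# Reproduces the high-precision example command above.
cmd = build_msgfplus_command("Dataset.mzML", "IPI_human_3.79.fasta",
                             inst=1, t="20ppm", ti="-1,2", ntt=2, tda=1,
                             o="PSMs.mzid")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to start the search; passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.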

MS-GF+ output

MS-GF+ outputs results as an mzIdentML (version 1.1) file. See http://www.psidev.info/mzidentml/ for details on the mzIdentML format. For every PSM, MS-GF+ reports the following scores:

MS-GF+ output example

Shown below is a sample of the MS-GF+ output in table form, as extracted from a simple mzIdentML file: test.mzid

There are two options for converting an MS-GF+ output file (.mzid) into a tab-separated file (.tsv).

  1. The MzIDToTsv utility built into MSGFPlus.jar (see the MzIDToTsv page)
  2. The Mzid-To-Tsv-Converter standalone application, available on GitHub

#SpecFile  SpecID   ScanNum  FragMethod  Precursor  IsotopeError  PrecursorError(ppm)  Charge  Peptide                                    Protein       DeNovoScore  MSGFScore  SpecEValue     EValue         QValue  PepQValue
test.mgf   index=0  26559    CID         1285.3457  1             -5.049801            3       K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T  test          299          244        1.4807088E-31  3.2871733E-29  0.0     0.0
test.mgf   index=0  26559    CID         1285.3457  1             -5.049801            3       K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T  test_isoform  299          244        1.4807088E-31  3.2871733E-29  0.0     0.0
test.mgf   index=1  -1       CID         870.11743  0             0.14029178           3       K.NLANPTSVILASIQM+15.995LEYLGMADK.A        test2         156          136        2.2559852E-22  4.4217308E-20  0.0     0.0
(Text file of this table: test_Unrolled.tsv)
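Once the results are in tab-separated form, they can be read with any TSV-aware tool. The Python sketch below uses only the standard csv module; the helper names and the 0.01 QValue cutoff are illustrative assumptions, and the sample text is rebuilt from two rows of the table above (a real .tsv file is already tab-separated and can be read directly).

```python
import csv
import io

# Two rows from the table above; fields contain no spaces, so they can be re-joined with tabs.
SAMPLE_LINES = [
    "#SpecFile SpecID ScanNum FragMethod Precursor IsotopeError PrecursorError(ppm) "
    "Charge Peptide Protein DeNovoScore MSGFScore SpecEValue EValue QValue PepQValue",
    "test.mgf index=0 26559 CID 1285.3457 1 -5.049801 3 "
    "K.IGAYLFVDMAHVAGLIAAGVYPNPVPHAHVVTSTTHK.T test 299 244 1.4807088E-31 3.2871733E-29 0.0 0.0",
    "test.mgf index=1 -1 CID 870.11743 0 0.14029178 3 "
    "K.NLANPTSVILASIQM+15.995LEYLGMADK.A test2 156 136 2.2559852E-22 4.4217308E-20 0.0 0.0",
]
SAMPLE_TSV = "\n".join("\t".join(line.split()) for line in SAMPLE_LINES)


def read_psms(tsv_text):
    """Parse tab-separated PSM rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def confident_psms(rows, max_qvalue=0.01):
    """Keep rows whose QValue is at or below the cutoff (cutoff is an assumption)."""
    return [row for row in rows if float(row["QValue"]) <= max_qvalue]


rows = read_psms(SAMPLE_TSV)
for row in confident_psms(rows):
    print(row["SpecID"], row["Peptide"], row["SpecEValue"])
```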