OpenSwathWorkflow

Complete workflow to run OpenSWATH

This implements the OpenSwath workflow as described in Roest and Rosenberger et al. (2013) and provides a complete, integrated analysis tool without the need to run multiple tools consecutively.

It executes the following steps in order:

See below or have a look at the INI file (via "OpenSwathWorkflow -write_ini myini.ini") for available parameters and more functionality.

Input: SWATH maps and transition list

SWATH maps can be provided as mzML files in one of two ways. The first is a single file taken directly from the machine. This assumes that the SWATH method acquires 1 MS1 spectrum followed by n MS2 spectra, and that the MS2 spectra are ordered the same way in every cycle. For example, a valid method would be MS1, MS2 [400-425], MS2 [425-450], MS1, MS2 [400-425], MS2 [425-450], while an invalid method would be MS1, MS2 [400-425], MS2 [425-450], MS1, MS2 [425-450], MS2 [400-425], where MS2 [xx-yy] indicates an MS2 scan with an isolation window starting at xx and ending at yy. OpenSwathWorkflow will try to read the SWATH windows from the data; if this is not possible, please provide a tab-separated list with the correct windows using the -swath_windows_file parameter.
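The ordering requirement can be illustrated with a small check. This is a minimal sketch and not part of OpenSwathWorkflow: the `Scan` tuples are a simplified, hypothetical stand-in for real mzML spectra, and the function names are made up for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical simplified scan: ("MS1", None) or ("MS2", (lower, upper)).
Window = Tuple[float, float]
Scan = Tuple[str, Optional[Window]]


def split_cycles(scans: List[Scan]) -> List[List[Window]]:
    """Group the MS2 isolation windows by cycle; every MS1 scan starts a new cycle."""
    cycles: List[List[Window]] = []
    for level, window in scans:
        if level == "MS1":
            cycles.append([])
        elif level == "MS2":
            if not cycles or window is None:
                raise ValueError("MS2 scan without preceding MS1 scan or without window")
            cycles[-1].append(window)
    return cycles


def has_consistent_cycles(scans: List[Scan]) -> bool:
    """True if every cycle lists the same MS2 windows in the same order."""
    cycles = split_cycles(scans)
    return bool(cycles) and all(cycle == cycles[0] for cycle in cycles)


# The valid and invalid methods from the text above.
valid = [("MS1", None), ("MS2", (400, 425)), ("MS2", (425, 450)),
         ("MS1", None), ("MS2", (400, 425)), ("MS2", (425, 450))]
invalid = [("MS1", None), ("MS2", (400, 425)), ("MS2", (425, 450)),
           ("MS1", None), ("MS2", (425, 450)), ("MS2", (400, 425))]

print(has_consistent_cycles(valid))    # True
print(has_consistent_cycles(invalid))  # False
```

A real acquisition may end with a truncated final cycle, which this strict comparison would reject; handling that case is left out of the sketch.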

Alternatively, a set of split files (n+1 mzML files) can be provided, each containing one SWATH map (or MS1 map).

Since the files can become rather large, it is recommended not to load the whole file into memory but to cache it on disk in a fast-access data format instead. To do so, set -readOptions cache. When using this option, also set -tempDirectory to the directory that should hold the cached files.
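As an illustration, a cached run could be assembled as follows. This is a sketch only: the file names are hypothetical placeholders, the helper `build_command` is not part of OpenMS, and the command is printed rather than executed. It uses only options listed on this page.

```python
import shlex
from typing import List, Optional, Sequence


def build_command(mzml_files: Sequence[str], transitions: str, out_tsv: str,
                  irt_transitions: Optional[str] = None,
                  temp_dir: str = "/tmp/", threads: int = 1) -> List[str]:
    """Assemble an OpenSwathWorkflow call that caches the input data on disk."""
    cmd = ["OpenSwathWorkflow", "-in", *mzml_files, "-tr", transitions]
    if irt_transitions is not None:
        cmd += ["-tr_irt", irt_transitions]
    cmd += ["-out_tsv", out_tsv,
            "-readOptions", "cache",      # cache to disk instead of loading into memory
            "-tempDirectory", temp_dir,   # where the cached files are stored
            "-threads", str(threads)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command(["run1.mzML"], "library.traML", "run1_features.tsv",
                    irt_transitions="irt.traML", threads=4)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` on a system where OpenSwathWorkflow is installed.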

Output: Feature list and chromatograms

The output of OpenSwathWorkflow is a feature list, written either as featureXML (-out_features) or as TSV (-out_tsv); the latter is more memory-friendly. If you analyze large datasets, it is recommended to use only -out_tsv and not -out_features. The -out_tsv format is also recommended for downstream analysis (e.g. using mProphet).

In addition, the extracted chromatograms can be written out using the -out_chrom parameter.

The command line parameters of this tool are:

OpenSwathWorkflow -- Complete workflow to run OpenSWATH
Version: 2.0.0 May 16 2015, 09:22:21, Revision: GIT-NOTFOUND

Usage:
  OpenSwathWorkflow <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option.

Options (mandatory options marked with '*'):
  -in <files>*                    Input files separated by blank (valid formats: 'mzML', 'mzXML')
  -tr <file>*                     Transition file ('TraML','tsv' or 'csv') (valid formats: 'traML', 'tsv', 
                                  'csv')
  -tr_type <type>                 Input file type -- default: determined from file extension or content
                                  (valid: 'traML', 'tsv', 'csv')
  -tr_irt <file>                  Transition file ('TraML') (valid formats: 'traML')
  -out_features <file>            Output file (valid formats: 'featureXML')
  -out_tsv <file>                 TSV output file (mProphet compatible)
  -rt_extraction_window <double>  Only extract RT around this value (-1 means extract over the whole range, 
                                  a value of 600 means to extract around +/- 300 s of the expected elution).
                                  (default: '600')
  -mz_extraction_window <double>  Extraction window used (in Thomson, to use ppm see -ppm flag) (default: 
                                  '0.05' min: '0')
  -ppm                            M/z extraction_window is in ppm
                                  
Common UTIL options:
  -ini <file>                     Use the given TOPP INI file
  -threads <n>                    Sets the number of threads allowed to be used by the TOPP tool (default: 
                                  '1')
  -write_ini <file>               Writes the default configuration file
  --help                          Shows options
  --helphelp                      Shows all options (including advanced)

The following configuration subsections are valid:
 - Scoring   Scoring parameters section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
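
The two extraction-window options listed above amount to simple interval arithmetic, sketched below. The RT behaviour (600 means +/- 300 s, -1 means the whole range) is taken from the option description. Treating the m/z window as a total width centred on the target m/z, in the same way as the RT window, is an assumption of this sketch, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def rt_bounds(expected_rt: float,
              rt_extraction_window: float) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds, or None if the whole RT range is extracted."""
    if rt_extraction_window < 0:  # -1 means extract over the whole range
        return None
    half = rt_extraction_window / 2.0
    return (expected_rt - half, expected_rt + half)


def mz_bounds(target_mz: float, mz_extraction_window: float,
              ppm: bool = False) -> Tuple[float, float]:
    """Return (lower, upper) m/z bounds around target_mz.

    Assumption: the window is a total width centred on the target.
    With ppm=True the width is given in parts per million of target_mz,
    otherwise in Thomson.
    """
    width = target_mz * mz_extraction_window * 1e-6 if ppm else mz_extraction_window
    return (target_mz - width / 2.0, target_mz + width / 2.0)


print(rt_bounds(1200.0, 600.0))  # (900.0, 1500.0)
print(rt_bounds(1200.0, -1))     # None
print(mz_bounds(500.0, 0.05))            # 0.05 Th wide window around 500
print(mz_bounds(500.0, 50.0, ppm=True))  # 50 ppm wide window around 500
```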

INI file documentation of this tool:

+OpenSwathWorkflow  Complete workflow to run OpenSWATH
  version  Version of the tool that generated this parameters file. (default: '2.0.0')
++1  Instance '1' section for 'OpenSwathWorkflow'
  in  Input files separated by blank (input file, default: '[]' valid formats: 'mzML', 'mzXML')
  tr  Transition file ('TraML','tsv' or 'csv') (input file, valid formats: 'traML', 'tsv', 'csv')
  tr_type  Input file type -- default: determined from file extension or content (valid: 'traML', 'tsv', 'csv')
  tr_irt  Transition file ('TraML') (input file, valid formats: 'traML')
  rt_norm  RT normalization file (how to map the RTs of this run to the ones stored in the library). If set, tr_irt may be omitted. (input file, valid formats: 'trafoXML')
  swath_windows_file  Optional, tab-separated file containing the SWATH windows, one window per line: a first line "lower_offset upper_offset", then "400 425", and so on. Note that the first line is a header and will be skipped. (input file)
  sort_swath_maps  Sort the input SWATH files when matching them to the SWATH windows from swath_windows_file (default: 'false' valid: 'true', 'false')
  use_ms1_traces  Extract the precursor ion trace(s) and use them for scoring (default: 'false' valid: 'true', 'false')
  out_features  Output file (output file, valid formats: 'featureXML')
  out_tsv  TSV output file (mProphet compatible)
  out_chrom  Also output all computed chromatograms (chrom.mzML) (output file, valid formats: 'mzML')
  min_upper_edge_dist  Minimal distance to the edge to still consider a precursor, in Thomson (default: '0')
  rt_extraction_window  Only extract RT around this value (-1 means extract over the whole range, a value of 600 means to extract around +/- 300 s of the expected elution). (default: '600')
  extra_rt_extraction_window  Output an XIC with an RT window that is larger by this much (e.g. to visually inspect a larger area of the chromatogram) (default: '0' min: '0')
  mz_extraction_window  Extraction window used (in Thomson, to use ppm see -ppm flag) (default: '0.05' min: '0')
  ppm  m/z extraction_window is in ppm (default: 'false' valid: 'true', 'false')
  min_rsq  Minimum r-squared of RT peptides regression (default: '0.95')
  min_coverage  Minimum relative amount of RT peptides to keep (default: '0.6')
  split_file_input  The input files each contain one single SWATH (alternatively: all SWATH are in separate files) (default: 'false' valid: 'true', 'false')
  use_elution_model_score  Turn on elution model score (EMG fit to peak) (default: 'false' valid: 'true', 'false')
  readOptions  Whether to run OpenSWATH directly on the input data, to cache data to disk first, or to perform a data reduction step first. If you choose cache, make sure to also set tempDirectory (default: 'normal' valid: 'normal', 'cache')
  tempDirectory  Temporary directory to store cached files for example (default: '/tmp/')
  extraction_function  Function used to extract the signal (default: 'tophat' valid: 'tophat', 'bartlett')
  batchSize  The batch size of chromatograms to process (0 means to only have one batch, sensible values are around 500-1000) (default: '0' min: '0')
  log  Name of log file (created only when specified)
  debug  Sets the debug level (default: '0')
  threads  Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  no_progress  Disables progress logging to command line (default: 'false' valid: 'true', 'false')
  force  Overwrite tool specific checks. (default: 'false' valid: 'true', 'false')
  test  Enables the test mode (needed for internal use only) (default: 'false' valid: 'true', 'false')
+++Scoring  Scoring parameters section
  stop_report_after_feature  Stop reporting after feature (ordered by quality; -1 means do not stop). (default: '-1')
  rt_normalization_factor  The normalized RT is expected to be between 0 and 1. If your normalized RT has a different range, pass this here (e.g. it goes from 0 to 100, set this value to 100) (default: '100')
  quantification_cutoff  Cutoff in m/z below which peaks should not be used for quantification any more (default: '0' min: '0')
  write_convex_hull  Whether to write out all points of all features into the featureXML (default: 'false' valid: 'true', 'false')
++++TransitionGroupPicker
  stop_after_feature  Stop finding after feature (ordered by intensity; -1 means do not stop). (default: '-1')
  min_peak_width  Minimal peak width (s), discard all peaks below this value (-1 means no action). (default: '14')
  recalculate_peaks  Tries to get better peak picking by looking at peak consistency of all picked peaks. Tries to use the consensus (median) peak border if the variation within the picked peaks is too large. (default: 'true')
  recalculate_peaks_max_z  Determines the maximal Z-Score (difference measured in standard deviations) that is considered too large for peak boundaries. If the Z-Score is above this value, the median is used for peak boundaries (default value 1.0). (default: '0.75')
  minimal_quality  Only if compute_peak_quality is set, this parameter will not consider peaks below this quality threshold (default: '-1.5')
  compute_peak_quality  Tries to compute a quality value for each peakgroup and detect outlier transitions. The resulting score is centered around zero and values above 0 are generally good and below -1 or -2 are usually bad. (default: 'true')
+++++PeakPickerMRM
  sgolay_frame_length  The number of subsequent data points used for smoothing. This number has to be odd. If it is not, 1 will be added. (default: '11')
  sgolay_polynomial_order  Order of the polynomial that is fitted. (default: '3')
  gauss_width  Gaussian width in seconds, estimated peak size. (default: '30')
  use_gauss  Use Gaussian filter for smoothing (alternative is Savitzky-Golay filter) (default: 'false')
  peak_width  Force a certain minimal peak_width on the data (e.g. extend the peak at least by this amount on both sides) in seconds. -1 turns this feature off. (default: '-1')
  signal_to_noise  Signal-to-noise threshold at which a peak will not be extended any more. Note that setting this too high (e.g. 1.0) can lead to peaks whose flanks are not fully captured. (default: '0.1' min: '0')
  remove_overlapping_peaks  Try to remove overlapping peaks during peak picking (default: 'true' valid: 'false', 'true')
  method  Which method to choose for chromatographic peak-picking (OpenSWATH legacy, corrected picking or Crawdad). (default: 'corrected' valid: 'legacy', 'corrected', 'crawdad')
++++DIAScoring
  dia_extraction_window  DIA extraction window in Th. (default: '0.05' min: '0')
  dia_centroided  Use centroided DIA data. (default: 'false' valid: 'true', 'false')
  dia_byseries_intensity_min  DIA b/y series minimum intensity to consider. (default: '300' min: '0')
  dia_byseries_ppm_diff  DIA b/y series minimal difference in ppm to consider. (default: '10' min: '0')
  dia_nr_isotopes  DIA nr of isotopes to consider. (default: '4' min: '0')
  dia_nr_charges  DIA nr of charges to consider. (default: '4' min: '0')
  peak_before_mono_max_ppm_diff  DIA maximal difference in ppm to count a peak at lower m/z when searching for evidence that a peak might not be monoisotopic. (default: '20' min: '0')
++++EMGScoring
  max_iteration  Maximum number of iterations used by the Levenberg-Marquardt algorithm. (default: '10')
  deltaRelError  (default: '0.1')
++++Scores
  use_shape_score  Use the shape score (this score measures the similarity in shape of the transitions using a cross-correlation) (default: 'true' valid: 'true', 'false')
  use_coelution_score  Use the coelution score (this score measures the similarity in coelution of the transitions using a cross-correlation) (default: 'true' valid: 'true', 'false')
  use_rt_score  Use the retention time score (this score measures the difference in retention time) (default: 'true' valid: 'true', 'false')
  use_library_score  Use the library score (default: 'true' valid: 'true', 'false')
  use_intensity_score  Use the intensity score (default: 'true' valid: 'true', 'false')
  use_nr_peaks_score  Use the number of peaks score (default: 'true' valid: 'true', 'false')
  use_total_xic_score  Use the total XIC score (default: 'true' valid: 'true', 'false')
  use_sn_score  Use the SN (signal to noise) score (default: 'true' valid: 'true', 'false')
  use_dia_scores  Use the DIA (SWATH) scores (default: 'true' valid: 'true', 'false')
  use_ms1_correlation  Use the correlation scores with the MS1 elution profiles (default: 'false' valid: 'true', 'false')
  use_ms1_fullscan  Use the full MS1 scan at the peak apex for scoring (ppm accuracy of precursor and isotopic pattern) (default: 'false' valid: 'true', 'false')
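
The swath_windows_file format described in the parameter list above (tab-separated, one window per line, first line a skipped header) can be read with a few lines of code. This is a minimal sketch and not the parser used by OpenSwathWorkflow. The function name is hypothetical, and tolerating blank lines and rejecting windows whose start is not below their end are assumptions of this example.

```python
import io
from typing import Iterable, List, Tuple


def read_swath_windows(lines: Iterable[str]) -> List[Tuple[float, float]]:
    """Parse SWATH windows: skip the header line, then read 'lower<TAB>upper' rows."""
    windows: List[Tuple[float, float]] = []
    rows = iter(lines)
    next(rows, None)  # the first line is a header and is skipped
    for row in rows:
        if not row.strip():
            continue  # tolerate blank lines (an assumption of this sketch)
        fields = row.rstrip("\r\n").split("\t")
        lower, upper = float(fields[0]), float(fields[1])
        if lower >= upper:
            raise ValueError(f"window start {lower} is not below window end {upper}")
        windows.append((lower, upper))
    return windows


# In-memory example file; a real file would be opened with open(path).
example = io.StringIO("lower_offset\tupper_offset\n400\t425\n425\t450\n")
windows = read_swath_windows(example)
print(windows)  # [(400.0, 425.0), (425.0, 450.0)]
```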

OpenMS / TOPP release 2.0.0 Documentation generated on Sat May 16 2015 16:13:42 using doxygen 1.8.9.1