PepXMLFile Class Reference

Used to load and store PepXML files.

#include <OpenMS/FORMAT/PepXMLFile.h>

Inherits XMLHandler and XMLFile.

Classes

struct  AminoAcidModification
 

Public Member Functions

 PepXMLFile ()
 Constructor.

virtual ~PepXMLFile ()
 Destructor.

void load (const String &filename, std::vector< ProteinIdentification > &proteins, std::vector< PeptideIdentification > &peptides, const String &experiment_name, const MSExperiment<> &experiment, bool use_precursor_data=false)
 Loads peptide sequences with modifications out of a PepXML file.

void load (const String &filename, std::vector< ProteinIdentification > &proteins, std::vector< PeptideIdentification > &peptides, const String &experiment_name="")
 Load function with empty defaults for some parameters (see above).

void store (const String &filename, std::vector< ProteinIdentification > &protein_ids, std::vector< PeptideIdentification > &peptide_ids)
 Stores idXML as PepXML file.

- Public Member Functions inherited from XMLFile
 XMLFile ()
 Default constructor.

 XMLFile (const String &schema_location, const String &version)
 Constructor that sets the schema location.

virtual ~XMLFile ()
 Destructor.

bool isValid (const String &filename, std::ostream &os=std::cerr)
 Checks if a file validates against the XML schema.

const String & getVersion () const
 Returns the version of the schema.
 

Protected Member Functions

virtual void endElement (const XMLCh *const , const XMLCh *const , const XMLCh *const qname)
 Documented in base class.

virtual void startElement (const XMLCh *const , const XMLCh *const , const XMLCh *const qname, const xercesc::Attributes &attributes)
 Documented in base class.
 
- Protected Member Functions inherited from XMLHandler
bool equal_ (const XMLCh *a, const XMLCh *b)
 Returns whether two Xerces strings are equal.

void writeUserParam_ (const String &tag_name, std::ostream &os, const MetaInfoInterface &meta, UInt indent) const
 Writes the content of MetaInfoInterface to the file.

Int asInt_ (const String &in)
 Conversion of a String to an integer value.

Int asInt_ (const XMLCh *in)
 Conversion of a Xerces string to an integer value.

UInt asUInt_ (const String &in)
 Conversion of a String to an unsigned integer value.

double asDouble_ (const String &in)
 Conversion of a String to a double value.

float asFloat_ (const String &in)
 Conversion of a String to a float value.

bool asBool_ (const String &in)
 Conversion of a String to a boolean value.

DateTime asDateTime_ (String date_string)
 Conversion of an xs:datetime string to a DateTime value.

char * attributeAsString_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a String.

Int attributeAsInt_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to an Int.

DoubleReal attributeAsDouble_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a DoubleReal.

DoubleList attributeAsDoubleList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a DoubleList.

IntList attributeAsIntList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to an IntList.

StringList attributeAsStringList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a StringList.
 
bool optionalAttributeAsString_ (String &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the String value if the attribute is present.

bool optionalAttributeAsInt_ (Int &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the Int value if the attribute is present.

bool optionalAttributeAsUInt_ (UInt &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the UInt value if the attribute is present.

bool optionalAttributeAsDouble_ (DoubleReal &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the DoubleReal value if the attribute is present.

bool optionalAttributeAsDoubleList_ (DoubleList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the DoubleList value if the attribute is present.

bool optionalAttributeAsStringList_ (StringList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the StringList value if the attribute is present.

bool optionalAttributeAsIntList_ (IntList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the IntList value if the attribute is present.

char * attributeAsString_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a String.

Int attributeAsInt_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to an Int.

DoubleReal attributeAsDouble_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a DoubleReal.

DoubleList attributeAsDoubleList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a DoubleList.

IntList attributeAsIntList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to an IntList.

StringList attributeAsStringList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a StringList.

bool optionalAttributeAsString_ (String &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the String value if the attribute is present.

bool optionalAttributeAsInt_ (Int &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the Int value if the attribute is present.

bool optionalAttributeAsUInt_ (UInt &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the UInt value if the attribute is present.

bool optionalAttributeAsDouble_ (DoubleReal &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the DoubleReal value if the attribute is present.

bool optionalAttributeAsDoubleList_ (DoubleList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the DoubleList value if the attribute is present.

bool optionalAttributeAsIntList_ (IntList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the IntList value if the attribute is present.

bool optionalAttributeAsStringList_ (StringList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the StringList value if the attribute is present.
 
SignedSize cvStringToEnum_ (const Size section, const String &term, const char *message, const SignedSize result_on_error=0)
 
 XMLHandler (const String &filename, const String &version)
 Default constructor.

virtual ~XMLHandler ()
 Destructor.

void reset ()
 Releases internal memory used for parsing.

void fatalError (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Fatal error handler. Throws a ParseError exception.

void error (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Error handler for recoverable errors.

void warning (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Warning handler.

virtual void characters (const XMLCh *const chars, const XMLSize_t length)
 Parsing method for character data.

virtual void writeTo (std::ostream &)
 Writes the contents to a stream.

String errorString ()
 Returns the last error description.
 
void fatalError (const xercesc::SAXParseException &exception)
 
void error (const xercesc::SAXParseException &exception)
 
void warning (const xercesc::SAXParseException &exception)
 
- Protected Member Functions inherited from XMLFile
void parse_ (const String &filename, XMLHandler *handler)
 Parses the XML file given by filename using the handler given by handler.

void save_ (const String &filename, XMLHandler *handler) const
 Stores the contents of the XML handler given by handler in the file given by filename.
 
void enforceEncoding_ (const String &encoding)
 

Private Member Functions

void makeScanMap_ ()
 Fills scan_map_.

void readRTMZCharge_ (const xercesc::Attributes &attributes)
 Reads RT, m/z, and charge information from attributes of "spectrum_query".

void matchModification_ (const DoubleReal mass, const String &origin, String &modification_description)
 Finds the modification name given a modified AA mass.
 

Private Attributes

std::vector< ProteinIdentification > * proteins_
 Pointer to the list of identified proteins.

std::vector< PeptideIdentification > * peptides_
 Pointer to the list of identified peptides.

const MSExperiment * experiment_
 Pointer to the experiment from which the pepXML file was generated.

String exp_name_
 Name of the associated experiment (filename of the data file, extension will be removed).

String search_engine_
 Set name of search engine.

bool use_precursor_data_
 Get RT and m/z for peptide ID from precursor scan (should only matter for RT)?

std::map< Size, Size > scan_map_
 Mapping between scan number in the pepXML file and index in the corresponding MSExperiment.

DoubleReal rt_tol_
 Retention time and mass-to-charge tolerance.

DoubleReal mz_tol_

Element hydrogen_
 Hydrogen data (for mass types).

bool wrong_experiment_
 Do current entries belong to the experiment of interest (for pepXML files that bundle results from different experiments)?

bool seen_experiment_
 Have we seen the experiment of interest at all?

bool checked_base_name_
 Have we checked the "base_name" attribute in the "msms_run_summary" element?

std::vector< std::vector< ProteinIdentification >::iterator > current_proteins_
 References to currently active ProteinIdentifications.

ProteinIdentification::SearchParameters params_
 Search parameters of the current identification run.

ProteinIdentification::DigestionEnzyme enzyme_
 Enzyme associated with the current identification run.

PeptideIdentification current_peptide_
 PeptideIdentification instance currently being processed.

PeptideHit peptide_hit_
 PeptideHit instance currently being processed.

String current_sequence_
 Sequence of the current peptide hit.

DoubleReal rt_
 RT and m/z of current PeptideIdentification.

DoubleReal mz_

Int charge_
 Precursor ion charge.

UInt search_id_
 ID of current search result.

String prot_id_
 Identifier linking PeptideIdentifications and ProteinIdentifications.

DateTime date_
 Date the pepXML file was generated.

DoubleReal hydrogen_mass_
 Mass of a hydrogen atom (monoisotopic/average depending on case).

std::vector< std::pair< String, Size > > current_modifications_
 The modifications of the current peptide hit (position is 1-based).

std::vector< AminoAcidModification > fixed_modifications_
 Fixed amino acid modifications.

std::vector< AminoAcidModification > variable_modifications_
 Variable amino acid modifications.


Additional Inherited Members

- Protected Types inherited from XMLHandler
enum  ActionMode { LOAD, STORE }
 Action to set the current mode (for error messages).

- Protected Attributes inherited from XMLHandler
String error_message_
 Error message of the last error.

String file_
 File name.

String version_
 Schema version.

StringManager sm_
 Helper class for string conversion.

std::vector< String > open_tags_
 Stack of open XML tags.

std::vector< std::vector< String > > cv_terms_
 Array of CV term lists (one sublist denotes one term and its children).

- Protected Attributes inherited from XMLFile
String schema_location_
 XML schema file location.

String schema_version_
 Version string.

String enforced_encoding_
 Encoding string that replaces the encoding (system dependent or specified in the XML). Disabled if empty. Used as a workaround for XTandem output XML.
 

Detailed Description

Used to load and store PepXML files.

This class is used to load and store documents that implement the schema of PepXML files.

Constructor & Destructor Documentation

PepXMLFile ( )

Constructor.

virtual ~PepXMLFile ( )
virtual

Destructor.

Member Function Documentation

virtual void endElement ( const XMLCh *  const,
const XMLCh *  const,
const XMLCh *const  qname 
)
protected virtual

Documented in base class.

Reimplemented from XMLHandler.

void load ( const String &  filename,
std::vector< ProteinIdentification > &  proteins,
std::vector< PeptideIdentification > &  peptides,
const String &  experiment_name,
const MSExperiment<> &  experiment,
bool  use_precursor_data = false 
)

Loads peptide sequences with modifications out of a PepXML file.

Parameters
filename            PepXML file to load
proteins            Protein identification output
peptides            Peptide identification output
experiment_name     Experiment file name, which is used to extract the corresponding search results from the PepXML file.
experiment          MS run to extract the retention times from (PepXML may contain only scan numbers).
use_precursor_data  Use m/z and RT of the precursor (instead of the RT of the MS2 spectrum) for the peptide?
Exceptions
Exception::FileNotFound  is thrown if the file could not be opened
Exception::ParseError    is thrown if an error occurs during parsing
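
The effect of use_precursor_data can be illustrated with a small stand-alone sketch. It does not use OpenMS types: SpectrumInfo, findPrecursorIndex and selectRT are hypothetical names introduced here, and the backward search for the nearest spectrum with a lower MS level is an assumption about how a precursor scan might be located, not the documented behaviour of this class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a spectrum; not an OpenMS type.
struct SpectrumInfo
{
  int ms_level; // 1 = survey scan, 2 = fragment scan
  double rt;    // retention time
};

// Walks backwards from `index` to the nearest spectrum with a lower MS level.
// Returns `index` itself when no such spectrum exists.
std::size_t findPrecursorIndex(const std::vector<SpectrumInfo>& run, std::size_t index)
{
  assert(index < run.size());
  for (std::size_t i = index; i > 0; --i)
  {
    if (run[i - 1].ms_level < run[index].ms_level) return i - 1;
  }
  return index;
}

// Picks the RT to report for the identification belonging to spectrum `index`:
// the spectrum's own RT, or the RT of its precursor scan if requested.
double selectRT(const std::vector<SpectrumInfo>& run, std::size_t index, bool use_precursor_data)
{
  assert(index < run.size());
  if (!use_precursor_data) return run[index].rt;
  return run[findPrecursorIndex(run, index)].rt;
}
```

With this sketch, an MS2 spectrum that follows a survey scan reports the survey scan's RT when use_precursor_data is true, and its own RT otherwise.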
void load ( const String &  filename,
std::vector< ProteinIdentification > &  proteins,
std::vector< PeptideIdentification > &  peptides,
const String &  experiment_name = "" 
)

Load function with empty defaults for some parameters (see above).

Exceptions
Exception::FileNotFound  is thrown if the file could not be opened
Exception::ParseError    is thrown if an error occurs during parsing
void makeScanMap_ ( )
private

Fill scan_map_.

void matchModification_ ( const DoubleReal  mass,
const String &  origin,
String &  modification_description 
)
private

Finds the modification name given a modified AA mass.

Matches the mass of a modified AA to a modification in our modification database. For ambiguous modifications, the first (arbitrary) match is returned. If no modification is found, an error is issued and the returned string is empty.

Note
A duplicate of this function is also used in ProtXMLFile.
Parameters
mass                      Modified AA's mass
origin                    AA one-letter code
modification_description  [out] Name of the modification, e.g. 'Carboxymethyl (C)'
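
The matching rule described above (first match wins, empty string when nothing matches) can be sketched without OpenMS. KnownModification and matchByMass are hypothetical names, the lookup table stands in for the modification database, and the explicit tolerance parameter is an assumption; the documentation does not say how close a mass must be to count as a match.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical table entry; the real class consults the OpenMS modification database.
struct KnownModification
{
  std::string name;     // e.g. "Carboxymethyl (C)"
  char origin;          // one-letter code of the residue the modification applies to
  double modified_mass; // mass of the residue including the modification
};

// Returns the name of the first entry for `origin` whose mass lies within
// `tolerance` of `mass`, or an empty string if nothing matches.
std::string matchByMass(const std::vector<KnownModification>& table,
                        double mass, char origin, double tolerance)
{
  assert(tolerance >= 0.0);
  for (const KnownModification& mod : table)
  {
    if (mod.origin == origin && std::fabs(mod.modified_mass - mass) <= tolerance)
    {
      return mod.name; // first match wins, even if later entries also fit
    }
  }
  return std::string();
}
```

A caller would treat an empty result as "no modification found" and report an error, mirroring the behaviour described for modification_description.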
void readRTMZCharge_ ( const xercesc::Attributes &  attributes)
private

Read RT, m/z, charge information from attributes of "spectrum_query".

virtual void startElement ( const XMLCh *  const,
const XMLCh *  const,
const XMLCh *const  qname,
const xercesc::Attributes &  attributes 
)
protected virtual

Documented in base class.

Reimplemented from XMLHandler.

void store ( const String &  filename,
std::vector< ProteinIdentification > &  protein_ids,
std::vector< PeptideIdentification > &  peptide_ids 
)

Stores idXML as PepXML file.

Exceptions
Exception::UnableToCreateFile  is thrown if the file could not be opened for writing

Member Data Documentation

Int charge_
private

Precursor ion charge.

bool checked_base_name_
private

Have we checked the "base_name" attribute in the "msms_run_summary" element?

std::vector<std::pair<String, Size> > current_modifications_
private

The modifications of the current peptide hit (position is 1-based)

PeptideIdentification current_peptide_
private

PeptideIdentification instance currently being processed.

std::vector<std::vector<ProteinIdentification>::iterator> current_proteins_
private

References to currently active ProteinIdentifications.

String current_sequence_
private

Sequence of the current peptide hit.

DateTime date_
private

Date the pepXML file was generated.

ProteinIdentification::DigestionEnzyme enzyme_
private

Enzyme associated with the current identification run.

String exp_name_
private

Name of the associated experiment (filename of the data file, extension will be removed)
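
As a rough illustration of "extension will be removed", the following stand-alone sketch derives a base name from a data file name. experimentBaseName is a hypothetical helper, not part of OpenMS, and stripping a leading directory is an additional assumption that the description above does not state.

```cpp
#include <cassert>
#include <string>

// Strips any leading directory (assumption) and the last extension from a file name.
std::string experimentBaseName(const std::string& path)
{
  assert(!path.empty());
  const std::string::size_type slash = path.find_last_of("/\\");
  std::string name = (slash == std::string::npos) ? path : path.substr(slash + 1);
  const std::string::size_type dot = name.find_last_of('.');
  // Keep names that start with a dot or have no extension unchanged.
  if (dot != std::string::npos && dot != 0) name.erase(dot);
  return name;
}
```

Only the last extension is removed, so a name containing several dots keeps everything before the final one.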

const MSExperiment* experiment_
private

Pointer to the experiment from which the pepXML file was generated.

std::vector<AminoAcidModification> fixed_modifications_
private

Fixed aminoacid modifications.

Element hydrogen_
private

Hydrogen data (for mass types)

DoubleReal hydrogen_mass_
private

Mass of a hydrogen atom (monoisotopic/average depending on case)

DoubleReal mz_
private

DoubleReal mz_tol_
private

ProteinIdentification::SearchParameters params_
private

Search parameters of the current identification run.

PeptideHit peptide_hit_
private

PeptideHit instance currently being processed.

std::vector<PeptideIdentification>* peptides_
private

Pointer to the list of identified peptides.

String prot_id_
private

Identifier linking PeptideIdentifications and ProteinIdentifications.

std::vector<ProteinIdentification>* proteins_
private

Pointer to the list of identified proteins.

DoubleReal rt_
private

RT and m/z of current PeptideIdentification.

DoubleReal rt_tol_
private

Retention time and mass-to-charge tolerance.

std::map<Size, Size> scan_map_
private

Mapping between scan number in the pepXML file and index in the corresponding MSExperiment.
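
A map of this shape could be filled as in the sketch below. It is self-contained and does not use MSExperiment: buildScanMap and spectrumIndex are hypothetical names, and both the "scan=<number>" native ID pattern and the fallback to the 1-based position are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Builds scan number -> spectrum index. Assumes (hypothetically) that a native
// ID may contain "scan=<number>"; IDs without that pattern fall back to the
// 1-based position of the spectrum in the run.
std::map<std::size_t, std::size_t> buildScanMap(const std::vector<std::string>& native_ids)
{
  const std::string key = "scan=";
  std::map<std::size_t, std::size_t> scan_map;
  for (std::size_t index = 0; index < native_ids.size(); ++index)
  {
    std::size_t scan = index + 1;
    const std::string::size_type pos = native_ids[index].find(key);
    if (pos != std::string::npos)
    {
      scan = std::strtoul(native_ids[index].c_str() + pos + key.size(), nullptr, 10);
    }
    scan_map[scan] = index;
  }
  return scan_map;
}

// Returns the spectrum index for a scan number that must be present in the map.
std::size_t spectrumIndex(const std::map<std::size_t, std::size_t>& scan_map, std::size_t scan)
{
  const std::map<std::size_t, std::size_t>::const_iterator it = scan_map.find(scan);
  assert(it != scan_map.end());
  return it->second;
}
```

Keeping the lookup in a std::map<Size, Size>-like structure means a scan number read from the pepXML file resolves to a spectrum position without rescanning the whole run.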

String search_engine_
private

Set name of search engine.

UInt search_id_
private

ID of current search result.

bool seen_experiment_
private

Have we seen the experiment of interest at all?

bool use_precursor_data_
private

Get RT and m/z for peptide ID from precursor scan (should only matter for RT)?

std::vector<AminoAcidModification> variable_modifications_
private

Variable aminoacid modifications.

bool wrong_experiment_
private

Do current entries belong to the experiment of interest (for pepXML files that bundle results from different experiments)?


OpenMS / TOPP release 1.11.1 Documentation generated on Thu Nov 14 2013 11:19:31 using doxygen 1.8.5