Class that uses the SeqAn library for a suffix array. It can be used to find peptide candidates for an MS spectrum. More...
#include <OpenMS/DATASTRUCTURES/SuffixArraySeqan.h>
Public Member Functions | |
SuffixArraySeqan (const String &st, const String &filename, const WeightWrapper::WEIGHTMODE weight_mode=WeightWrapper::MONO) | |
constructor More... | |
SuffixArraySeqan (const SuffixArraySeqan &source) | |
copy constructor More... | |
virtual | ~SuffixArraySeqan () |
destructor More... | |
String | toString () |
converts suffix array to a printable string More... | |
void | findSpec (std::vector< std::vector< std::pair< std::pair< SignedSize, SignedSize >, DoubleReal > > > &candidates, const std::vector< DoubleReal > &spec) |
the function that will find all peptide candidates for a given spectrum More... | |
bool | save (const String &filename) |
saves the suffix array to disc More... | |
bool | open (const String &filename) |
opens the suffix array More... | |
void | setTolerance (DoubleReal t) |
setter for tolerance More... | |
DoubleReal | getTolerance () const |
getter for tolerance More... | |
bool | isDigestingEnd (const char aa1, const char aa2) const |
returns whether an enzyme will cut after the first character More... | |
void | setTags (const std::vector< OpenMS::String > &tags) |
setter for tags More... | |
const std::vector < OpenMS::String > & | getTags () |
getter for tags More... | |
void | setUseTags (bool use_tags) |
setter for use_tags More... | |
bool | getUseTags () |
getter for use_tags More... | |
void | setNumberOfModifications (Size number_of_mods) |
setter for number of modifications More... | |
Size | getNumberOfModifications () |
getter for number of modifications More... | |
void | printStatistic () |
output for statistic More... | |
Public Member Functions inherited from SuffixArray | |
SuffixArray (const String &st, const String &filename) | |
constructor taking the string and the filename for writing or reading More... | |
SuffixArray (const SuffixArray &sa) | |
copy constructor More... | |
virtual | ~SuffixArray ()=0 |
destructor More... | |
SuffixArray () | |
constructor More... | |
Public Member Functions inherited from WeightWrapper | |
WeightWrapper () | |
constructor More... | |
WeightWrapper (const WEIGHTMODE weight_mode) | |
constructor More... | |
virtual | ~WeightWrapper () |
destructor More... | |
WeightWrapper (const WeightWrapper &source) | |
copy constructor More... | |
void | setWeightMode (const WEIGHTMODE mode) |
Sets the weight mode (MONO or AVERAGE) More... | |
WEIGHTMODE | getWeightMode () const |
Gets the weight mode (MONO or AVERAGE) More... | |
DoubleReal | getWeight (const AASequence &aa) const |
returns the monoisotopic or average weight, depending on the weight mode More... | |
DoubleReal | getWeight (const EmpiricalFormula &ef) const |
returns the monoisotopic or average weight, depending on the weight mode More... | |
DoubleReal | getWeight (const Residue &r, Residue::ResidueType res_type=Residue::Full) const |
returns the monoisotopic or average weight, depending on the weight mode More... | |
Protected Member Functions | |
void | goNextSubTree_ (TIter &it, DoubleReal &m, std::stack< DoubleReal > &allm, std::stack< std::map< DoubleReal, SignedSize > > &mod_map) |
overwriting goNextSubTree_ from seqan index_esa_stree.h for mass update during suffix array traversal More... | |
void | goNextSubTree_ (TIter &it) |
goes to the next sub tree More... | |
void | goNext_ (TIter &it, DoubleReal &m, std::stack< DoubleReal > &allm, std::stack< std::map< DoubleReal, SignedSize > > &mod_map) |
overwriting goNext from seqan index_esa_stree.h for mass update during suffix array traversal More... | |
void | parseTree_ (TIter &it, std::vector< std::pair< SignedSize, SignedSize > > &out_number, std::vector< std::pair< SignedSize, SignedSize > > &edge_length, std::vector< SignedSize > &leafe_depth) |
SignedSize | findFirst_ (const std::vector< DoubleReal > &spec, DoubleReal &m) |
binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. More... | |
SignedSize | findFirst_ (const std::vector< DoubleReal > &spec, DoubleReal &m, SignedSize start, SignedSize end) |
binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. It searches recursively. More... | |
Protected Attributes | |
TIndex | index_ |
seqan suffix array More... | |
TIter * | it_ |
seqan suffix array iterator More... | |
const String & | s_ |
reference to the string for which the suffix array is built More... | |
DoubleReal | masse_ [255] |
amino acid masses More... | |
SignedSize | number_of_modifications_ |
number of allowed modifications More... | |
std::vector< String > | tags_ |
all tags More... | |
bool | use_tags_ |
if tags are used More... | |
DoubleReal | tol_ |
tolerance More... | |
Private Types | |
typedef seqan::TopDown < seqan::ParentLinks<> > | TIterSpec |
typedef seqan::Index < seqan::String< char > , seqan::IndexEsa< TIterSpec > > | TIndex |
typedef seqan::Iter< TIndex, seqan::VSTree< TIterSpec > > | TIter |
Additional Inherited Members | |
Public Types inherited from WeightWrapper | |
enum | WEIGHTMODE { AVERAGE = 0, MONO, SIZE_OF_WEIGHTMODE } |
Class that uses the SeqAn library for a suffix array. It can be used to find peptide candidates for an MS spectrum.
This class uses the SeqAn suffix array. It can only be used for finding peptide candidates for a given MS spectrum within a certain mass tolerance. The suffix array can be saved to disc for reuse, so it has to be built only once.
|
private |
SuffixArraySeqan | ( | const String & | st, |
const String & | filename, | ||
const WeightWrapper::WEIGHTMODE | weight_mode = WeightWrapper::MONO |
||
) |
constructor
st | const string reference with the string for which the suffix array should be built |
filename | const string reference with the filename for opening or saving the suffix array |
weight_mode | if the monoisotopic weight should not be used, this parameter can be set to AVERAGE |
FileNotFound | is thrown if the given file is not found |
InvalidValue | if the given suffix array string is invalid |
SuffixArraySeqan | ( | const SuffixArraySeqan & | source | ) |
copy constructor
|
virtual |
destructor
|
protected |
binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance.
spec | const reference to spectrum |
m | mass |
|
protected |
binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. It searches recursively.
spec | const reference to spectrum |
m | mass |
start | start index |
end | end index |
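The recursive search described above can be sketched as follows. This is a minimal illustration, not the OpenMS implementation: the name `findFirstWithinTol`, the explicit `tol` argument (the class keeps its tolerance in `tol_`), the `-1` return value for "no match", and the assumption that `spec` is sorted in ascending order are all hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of OpenMS): returns the index of the first
// element of the ascending-sorted `spec` that lies in [m - tol, m + tol],
// or -1 if no element matches. `start` and `end` are inclusive bounds.
long findFirstWithinTol(const std::vector<double>& spec, double m, double tol,
                        long start, long end)
{
  assert(tol >= 0.0);
  if (start > end) return -1;  // empty range: nothing can match
  const long mid = start + (end - start) / 2;
  if (spec[mid] < m - tol)     // too light: a match can only be to the right
    return findFirstWithinTol(spec, m, tol, mid + 1, end);
  if (spec[mid] > m + tol)     // too heavy: a match can only be to the left
    return findFirstWithinTol(spec, m, tol, start, mid - 1);
  // spec[mid] matches, but an earlier match may still exist on the left
  const long left = findFirstWithinTol(spec, m, tol, start, mid - 1);
  return left >= 0 ? left : mid;
}

// Convenience overload that searches the whole spectrum.
long findFirstWithinTol(const std::vector<double>& spec, double m, double tol)
{
  return findFirstWithinTol(spec, m, tol, 0, static_cast<long>(spec.size()) - 1);
}
```

When a match is found, the sketch keeps searching the left half so that the smallest matching index is returned rather than an arbitrary one.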
|
virtual |
the function that will find all peptide candidates for a given spectrum
spec | const reference of DoubleReal vector describing the spectrum |
candidates | output parameter that holds the candidates for the masses given in spec after the call |
For every mass in the spectrum, all candidates are returned, each described by a pair of ints. All masses are searched for at the same time in a single suffix array traversal. To accelerate the traversal, the skip and lcp tables are used. The mass is not recalculated for each entry; instead it is updated during the traversal using a stack data structure.
Implements SuffixArray.
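To make the shape of the `candidates` output easier to picture, here is a naive reference search. It is a sketch under stated assumptions and deliberately does not use a suffix array: it scans every substring and keeps a running mass instead. The function name `findCandidatesNaive`, the `residue_mass` table, the reading of each int pair as (start index, length), and the third value being a modification mass (always 0 here) are assumptions, not confirmed by this page. Only residue masses are summed; terminal groups are ignored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed layout: ((start index, length), modification mass).
using Candidate = std::pair<std::pair<long, long>, double>;

// Hypothetical brute-force stand-in for a suffix-array-based search.
// candidates[i] collects every substring of `text` whose summed residue
// mass lies within `tol` of spec[i].
std::vector<std::vector<Candidate>> findCandidatesNaive(
    const std::string& text, const std::vector<double>& spec,
    const std::map<char, double>& residue_mass, double tol)
{
  assert(tol >= 0.0);
  std::vector<std::vector<Candidate>> candidates(spec.size());
  for (std::size_t start = 0; start < text.size(); ++start)
  {
    double m = 0.0;  // running mass of text[start .. start + len - 1]
    for (std::size_t len = 1; start + len <= text.size(); ++len)
    {
      const auto it = residue_mass.find(text[start + len - 1]);
      if (it == residue_mass.end()) break;  // unknown character, e.g. a separator
      m += it->second;                      // update instead of recomputing
      for (std::size_t i = 0; i < spec.size(); ++i)
      {
        if (std::fabs(spec[i] - m) <= tol)
        {
          candidates[i].push_back(
              {{static_cast<long>(start), static_cast<long>(len)}, 0.0});
        }
      }
    }
  }
  return candidates;
}
```

The incremental update of `m` mirrors the idea stated above that the mass is updated during traversal rather than recalculated for each entry; a suffix array traversal additionally shares that work across common prefixes.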
|
inlineprotected |
overwriting goNext from seqan index_esa_stree.h for mass update during suffix array traversal
The suffix array is treated as a suffix tree. This function goes to the next node that has not been visited yet. During this traversal the mass is updated using the stack of edge masses.
it | reference to the suffix array iterator |
m | reference to the current mass |
allm | reference to the stack with history of traversal |
mod_map | input parameter that specifies the modification masses allowed in the candidates |
|
inlineprotected |
overwriting goNextSubTree_ from seqan index_esa_stree.h for mass update during suffix array traversal
The suffix array is treated as a suffix tree. This function skips the subtree under the current node and goes directly to the next subtree that has not been visited yet. During this traversal the mass is updated using the stack of edge masses.
it | reference to the suffix array iterator |
m | reference to the current mass |
allm | reference to the stack with history of traversal |
mod_map | input parameter that specifies the modification masses allowed in the candidates |
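The bookkeeping behind `m` and `allm` can be sketched in isolation. The struct below is hypothetical and independent of SeqAn iterators: it only shows how a stack of edge masses keeps the current mass consistent when the traversal descends an edge or leaves a subtree. The names `MassTracker`, `descend`, `ascend`, and `nextSibling` are illustrative, and modification handling (`mod_map`) is left out.

```cpp
#include <cassert>
#include <stack>

// Hypothetical illustration of stack-based mass updates during a tree walk.
struct MassTracker
{
  double m = 0.0;           // mass of the path from the root to the current node
  std::stack<double> allm;  // masses of the edges on that path

  // Follow an edge downwards: remember its mass and add it to the total.
  void descend(double edge_mass)
  {
    allm.push(edge_mass);
    m += edge_mass;
  }

  // Go back up one edge: subtract the mass of the most recent edge.
  void ascend()
  {
    assert(!allm.empty());
    m -= allm.top();
    allm.pop();
  }

  // Skip the rest of the current subtree and enter a sibling edge.
  void nextSibling(double edge_mass)
  {
    ascend();
    descend(edge_mass);
  }
};
```

Because each move only adds or subtracts one edge mass, the mass of a node never has to be recomputed from the full path.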
|
inlineprotected |
goes to the next sub tree
it | reference to the suffix array iterator |
|
virtual |
returns whether an enzyme will cut after the first character
aa1 | const char as first amino acid |
aa2 | const char as second amino acid |
Implements SuffixArray.
Reimplemented in SuffixArrayTrypticSeqan.
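As an illustration of what such a check can look like for a specific enzyme, the sketch below encodes the commonly used trypsin rule (cleavage after K or R unless the next residue is P). The free function `isTrypticCleavage` is hypothetical; this page does not state which rule this class or SuffixArrayTrypticSeqan actually applies.

```cpp
// Hypothetical stand-in for an enzyme-specific isDigestingEnd(aa1, aa2).
// Commonly used trypsin rule: cut after lysine (K) or arginine (R),
// except when the following residue is proline (P).
bool isTrypticCleavage(const char aa1, const char aa2)
{
  return (aa1 == 'K' || aa1 == 'R') && aa2 != 'P';
}
```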
|
virtual |
opens the suffix array
filename | const reference string describing the filename |
FileNotFound | is thrown if the given file could not be found |
Implements SuffixArray.
|
virtual |
output for statistic
Implements SuffixArray.
|
virtual |
saves the suffix array to disc
filename | const reference string describing the filename |
UnableToCreateFile | is thrown if the output files could not be created |
Implements SuffixArray.
|
virtual |
setter for tags
tags | reference to vector of strings with tags |
Implements SuffixArray.
|
virtual |
setter for tolerance
t | DoubleReal with tolerance, only 0 or greater is allowed |
InvalidValue | is thrown if given tolerance is negative |
Implements SuffixArray.
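The validation contract of the setter (only 0 or greater is accepted) can be sketched as follows. The class `ToleranceHolder` is hypothetical, `std::invalid_argument` stands in for the InvalidValue exception named above, and the initial value of 0 is an assumption rather than the documented default.

```cpp
#include <stdexcept>

// Hypothetical minimal class showing a validating tolerance setter.
class ToleranceHolder
{
public:
  // Rejects negative tolerances and leaves the stored value unchanged.
  void setTolerance(double t)
  {
    if (t < 0.0)
    {
      throw std::invalid_argument("tolerance must be 0 or greater");
    }
    tol_ = t;
  }

  double getTolerance() const { return tol_; }

private:
  double tol_ = 0.0;  // assumed initial value for this sketch
};
```

Checking before assigning means a failed call cannot leave the object with an invalid tolerance.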
|
virtual |
setter for use_tags
use_tags | indicating whether tags should be used or not |
Implements SuffixArray.
|
virtual |
converts suffix array to a printable string
Implements SuffixArray.
|
protected |
seqan suffix array
|
protected |
seqan suffix array iterator
|
protected |
amino acid masses
|
protected |
number of allowed modifications
|
protected |
reference to the string for which the suffix array is built
|
protected |
all tags
|
protected |
tolerance
|
protected |
if tags are used
OpenMS / TOPP release 1.11.1 | Documentation generated on Thu Nov 14 2013 11:19:29 using doxygen 1.8.5 |