PosteriorErrorProbabilityModel Class Reference

Implements a mixture model of the inverse gumbel and the gauss distribution or a gaussian mixture. More...

#include <OpenMS/MATH/STATISTICS/PosteriorErrorProbabilityModel.h>

Inheritance diagram for PosteriorErrorProbabilityModel:
DefaultParamHandler

Public Member Functions

 PosteriorErrorProbabilityModel ()
 default constructor More...
 
virtual ~PosteriorErrorProbabilityModel ()
 Destructor. More...
 
bool fit (std::vector< double > &search_engine_scores)
 fits the distributions to the data points (search_engine_scores). Estimated parameters for the distributions are saved in member variables. computeProbability can be used afterwards. More...
 
bool fit (std::vector< double > &search_engine_scores, std::vector< double > &probabilities)
 fits the distributions to the data points (search_engine_scores) and writes the computed probabilities into the given vector (the second one). More...
 
void fillDensities (std::vector< double > &x_scores, std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density)
 Writes the densities of both distributions into the two vectors for a set of scores. incorrect_density represents the incorrectly assigned sequences. More...
 
DoubleReal computeMaxLikelihood (std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density)
 computes the maximum likelihood with a log-likelihood function. More...
 
DoubleReal one_minus_sum_post (std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density)
 sums (1 - posterior probabilities) More...
 
DoubleReal sum_post (std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density)
 sums posterior probabilities More...
 
DoubleReal sum_pos_x0 (std::vector< double > &x_scores, std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density)
 helper function for the EM algorithm (for fitting) More...
 
DoubleReal sum_neg_x0 (std::vector< double > &x_scores, std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density)
 helper function for the EM algorithm (for fitting) More...
 
DoubleReal sum_pos_sigma (std::vector< double > &x_scores, std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density, DoubleReal positive_mean)
 helper function for the EM algorithm (for fitting) More...
 
DoubleReal sum_neg_sigma (std::vector< double > &x_scores, std::vector< DoubleReal > &incorrect_density, std::vector< DoubleReal > &correct_density, DoubleReal positive_mean)
 helper function for the EM algorithm (for fitting) More...
 
GaussFitter::GaussFitResult getCorrectlyAssignedFitResult () const
 returns estimated parameters for correctly assigned sequences. Fit should be used before. More...
 
GaussFitter::GaussFitResult getIncorrectlyAssignedFitResult () const
 returns estimated parameters for incorrectly assigned sequences. Fit should be used before. More...
 
DoubleReal getNegativePrior () const
 returns the estimated negative prior probability. More...
 
DoubleReal getGauss (DoubleReal x, const GaussFitter::GaussFitResult &params)
 computes the gaussian density at position x with parameters params. More...
 
DoubleReal getGumbel (DoubleReal x, const GaussFitter::GaussFitResult &params)
 computes the gumbel density at position x with parameters params. More...
 
DoubleReal computeProbability (DoubleReal score)
 
TextFile * InitPlots (std::vector< double > &x_scores)
 initializes the plots More...
 
const String getGumbelGnuplotFormula (const GaussFitter::GaussFitResult &params) const
 returns the gnuplot formula of the fitted Gumbel distribution. Only x0 and sigma are used, as the location parameter alpha and the scale parameter beta, respectively. More...
 
const String getGaussGnuplotFormula (const GaussFitter::GaussFitResult &params) const
 returns the gnuplot formula of the fitted gauss distribution. More...
 
const String getBothGnuplotFormula (const GaussFitter::GaussFitResult &incorrect, const GaussFitter::GaussFitResult &correct) const
 returns the gnuplot formula of the fitted mixture distribution. More...
 
void plotTargetDecoyEstimation (std::vector< double > &target, std::vector< double > &decoy)
 plots the estimated distribution against target and decoy hits More...
 
DoubleReal getSmallestScore ()
 returns the smallest score used in the last fit More...
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages. More...
 
 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor. More...
 
virtual ~DefaultParamHandler ()
 Destructor. More...
 
virtual DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator. More...
 
virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator. More...
 
void setParameters (const Param &param)
 Sets the parameters. More...
 
const Param & getParameters () const
 Non-mutable access to the parameters. More...
 
const Param & getDefaults () const
 Non-mutable access to the default parameters. More...
 
const String & getName () const
 Non-mutable access to the name. More...
 
void setName (const String &name)
 Mutable access to the name. More...
 
const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections. More...
 

Private Member Functions

PosteriorErrorProbabilityModel & operator= (const PosteriorErrorProbabilityModel &rhs)
 assignment operator (not implemented) More...
 
 PosteriorErrorProbabilityModel (const PosteriorErrorProbabilityModel &rhs)
 Copy constructor (not implemented) More...
 

Private Attributes

GaussFitter::GaussFitResult incorrectly_assigned_fit_param_
 stores parameters for incorrectly assigned sequences. If the Gumbel fit was used, A can be ignored. Furthermore, in this case, x0 and sigma are the location parameter alpha and the scale parameter beta, respectively. More...
 
GaussFitter::GaussFitResult correctly_assigned_fit_param_
 stores gauss parameters More...
 
DoubleReal negative_prior_
 stores final prior probability for negative peptides More...
 
DoubleReal max_incorrectly_
 peak of the incorrectly assigned sequences distribution More...
 
DoubleReal max_correctly_
 peak of the gauss distribution (correctly assigned sequences) More...
 
DoubleReal smallest_score_
 smallest score which was used for fitting the model More...
 
DoubleReal(PosteriorErrorProbabilityModel::* calc_incorrect_ )(DoubleReal x, const GaussFitter::GaussFitResult &params)
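 points either to getGumbel or getGauss, depending on whether one uses the Gumbel or the Gaussian distribution for incorrectly assigned sequences. More...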
 
DoubleReal(PosteriorErrorProbabilityModel::* calc_correct_ )(DoubleReal x, const GaussFitter::GaussFitResult &params)
 points to getGauss More...
 
const String(PosteriorErrorProbabilityModel::* getNegativeGnuplotFormula_ )(const GaussFitter::GaussFitResult &params) const
 points either to getGumbelGnuplotFormula or getGaussGnuplotFormula, depending on whether one uses the Gumbel or the Gaussian distribution for incorrectly assigned sequences. More...
 
const String(PosteriorErrorProbabilityModel::* getPositiveGnuplotFormula_ )(const GaussFitter::GaussFitResult &params) const
 points to getGaussGnuplotFormula More...
 

Additional Inherited Members

- Protected Member Functions inherited from DefaultParamHandler
virtual void updateMembers_ ()
 This method is used to update extra member variables at the end of the setParameters() method. More...
 
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor. More...
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters. More...
 
Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes! More...
 
std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes! More...
 
String error_name_
 Name that is displayed in error messages during the parameter checking. More...
 
bool check_defaults_
 If this member is set to false, no parameter checking is done. More...
 
bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty. More...
 

Detailed Description

Implements a mixture model of the inverse gumbel and the gauss distribution or a gaussian mixture.

This class uses the EM algorithm to fit either a Gumbel and a Gauss distribution, or two Gaussian distributions, to a set of data points. After fitting, the fit can be output as a gnuplot formula using getGumbelGnuplotFormula() and getGaussGnuplotFormula().
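To make the fitting idea concrete, here is a minimal textbook sketch of EM for the two-Gaussian case. It is not the OpenMS implementation: the names `MixtureSketch`, `normalPdf` and `fitTwoGaussians`, the starting values, the fixed iteration count and the `min_sigma` floor are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result container for this sketch; not an OpenMS type.
struct MixtureSketch
{
  double mean_incorrect, sigma_incorrect; // component covering the lower scores
  double mean_correct, sigma_correct;     // component covering the higher scores
  double negative_prior;                  // mixing weight of the incorrect component
};

// Normalized Gaussian density.
inline double normalPdf(double x, double mean, double sigma)
{
  const double pi = 3.14159265358979323846;
  const double z = (x - mean) / sigma;
  return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Textbook EM for a two-component, one-dimensional Gaussian mixture.
inline MixtureSketch fitTwoGaussians(const std::vector<double>& scores, int iterations = 50)
{
  assert(scores.size() >= 2);
  const auto mm = std::minmax_element(scores.begin(), scores.end());
  // Crude starting values (assumption): means at the extremes, unit widths.
  MixtureSketch m{*mm.first, 1.0, *mm.second, 1.0, 0.5};
  const double min_sigma = 1e-6;           // keeps a component from collapsing
  const double n = static_cast<double>(scores.size());
  std::vector<double> resp(scores.size()); // P(incorrect | score)

  for (int it = 0; it < iterations; ++it)
  {
    // E-step: posterior probability that each score belongs to the incorrect component.
    double sum_neg = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
      const double neg = m.negative_prior * normalPdf(scores[i], m.mean_incorrect, m.sigma_incorrect);
      const double pos = (1.0 - m.negative_prior) * normalPdf(scores[i], m.mean_correct, m.sigma_correct);
      resp[i] = (neg + pos > 0.0) ? neg / (neg + pos) : 0.5;
      sum_neg += resp[i];
    }
    const double sum_pos = n - sum_neg;
    if (sum_neg <= 0.0 || sum_pos <= 0.0) break; // one component vanished

    // M-step: responsibility-weighted means, then standard deviations, then the prior.
    double mean_neg = 0.0, mean_pos = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
      mean_neg += resp[i] * scores[i];
      mean_pos += (1.0 - resp[i]) * scores[i];
    }
    mean_neg /= sum_neg;
    mean_pos /= sum_pos;

    double var_neg = 0.0, var_pos = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
      var_neg += resp[i] * (scores[i] - mean_neg) * (scores[i] - mean_neg);
      var_pos += (1.0 - resp[i]) * (scores[i] - mean_pos) * (scores[i] - mean_pos);
    }
    m.mean_incorrect = mean_neg;
    m.mean_correct = mean_pos;
    m.sigma_incorrect = std::max(std::sqrt(var_neg / sum_neg), min_sigma);
    m.sigma_correct = std::max(std::sqrt(var_pos / sum_pos), min_sigma);
    m.negative_prior = sum_neg / n;
  }
  return m;
}
```

The weighted sums in the M-step play the role that the sum_post, sum_pos_x0, sum_neg_x0, sum_pos_sigma and sum_neg_sigma helpers appear to play in this class, although their exact definitions are not given on this page.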

Note
All parameters are stored in GaussFitResult. In the case of the Gumbel distribution, x0 and sigma represent the location parameter alpha and the scale parameter beta, respectively.
Parameters of this class are:

Name | Type | Default | Restrictions | Description
number_of_bins | int | 100 |  | Number of bins used for visualization. Only needed if each iteration step of the EM algorithm is visualized.
output_plots | string | false | true, false | If true, every step of the EM algorithm is written to a file as a gnuplot formula.
output_name | string |  |  | If output_plots is on, the output files are named after output_name: a scores file with the suffix _scores.txt, and a file that contains each step of the EM algorithm. For example, with output_name = /usr/home/OMSSA123, the files /usr/home/OMSSA123_scores.txt and /usr/home/OMSSA123 are written. If no directory is specified (e.g. just OMSSA123 instead of /usr/home/OMSSA123), the files are written into the working directory.
incorrectly_assigned | string | Gumbel | Gumbel, Gauss | For 'Gumbel', the Gumbel distribution is used to plot incorrectly assigned sequences. For 'Gauss', the Gauss distribution is used.


Constructor & Destructor Documentation

PosteriorErrorProbabilityModel ( )

default constructor

virtual ~PosteriorErrorProbabilityModel ( )
virtual

Destructor.

PosteriorErrorProbabilityModel ( const PosteriorErrorProbabilityModel &  rhs)
private

Copy constructor (not implemented)

Member Function Documentation

DoubleReal computeMaxLikelihood ( std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density 
)

computes the maximum likelihood with a log-likelihood function.
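As an illustration only, the log-likelihood of a two-component mixture can be computed from the two density vectors as sketched below. The function name `mixtureLogLikelihood` is hypothetical, and passing `negative_prior` as an argument is an assumption; the class keeps the prior in the member negative_prior_, and its exact formula is not shown on this page.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum over all scores of log( prior * f_incorrect + (1 - prior) * f_correct ).
// Hypothetical helper; negative_prior is passed explicitly for this sketch.
inline double mixtureLogLikelihood(const std::vector<double>& incorrect_density,
                                   const std::vector<double>& correct_density,
                                   double negative_prior)
{
  assert(incorrect_density.size() == correct_density.size());
  double log_likelihood = 0.0;
  for (std::size_t i = 0; i < incorrect_density.size(); ++i)
  {
    log_likelihood += std::log(negative_prior * incorrect_density[i]
                               + (1.0 - negative_prior) * correct_density[i]);
  }
  return log_likelihood;
}
```

In an EM loop, a value like this is typically compared between iterations to decide when the fit has converged.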

DoubleReal computeProbability ( DoubleReal  score)

Returns the computed posterior error probability for a given score.

Note
fit has to be called before using this function; otherwise the returned values are meaningless.
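For orientation, a posterior error probability for a two-component mixture follows from Bayes' rule, as in the sketch below. This is the standard formula rather than a copy of the class's code; the function name `posteriorErrorProbability` and the explicit arguments are assumptions (the class evaluates the fitted densities and the stored negative prior internally).

```cpp
#include <cassert>

// P(incorrect | score) = prior * f_incorrect / (prior * f_incorrect + (1 - prior) * f_correct).
// Hypothetical helper: the densities are assumed to be already evaluated at the score.
inline double posteriorErrorProbability(double incorrect_density,
                                        double correct_density,
                                        double negative_prior)
{
  const double neg = negative_prior * incorrect_density;
  const double pos = (1.0 - negative_prior) * correct_density;
  assert(neg + pos > 0.0); // undefined where both weighted densities vanish
  return neg / (neg + pos);
}
```

With equal priors and equal densities the result is 0.5; it approaches 0 as the correct density dominates at a score.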
void fillDensities ( std::vector< double > &  x_scores,
std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density 
)

Writes the densities of both distributions into the two vectors for a set of scores. incorrect_density represents the incorrectly assigned sequences.

bool fit ( std::vector< double > &  search_engine_scores)

fits the distributions to the data points (search_engine_scores). Estimated parameters for the distributions are saved in member variables. computeProbability can be used afterwards.

Parameters
search_engine_scores  a vector which holds the data points
Returns
true if the algorithm has run through; otherwise false, in which case no plot and no probabilities are calculated.
Note
the vector is sorted from smallest to biggest value!
bool fit ( std::vector< double > &  search_engine_scores,
std::vector< double > &  probabilities 
)

fits the distributions to the data points (search_engine_scores) and writes the computed probabilities into the given vector (the second one).

Parameters
search_engine_scores  a vector which holds the data points
probabilities  a vector which holds the probability for each data point after running this function. Any existing content is overwritten.
Returns
true if the algorithm has run through; otherwise false, in which case no plot and no probabilities are calculated.
Note
the vectors are sorted from smallest to biggest value!
const String getBothGnuplotFormula ( const GaussFitter::GaussFitResult &  incorrect,
const GaussFitter::GaussFitResult &  correct 
) const

returns the gnuplot formula of the fitted mixture distribution.

GaussFitter::GaussFitResult getCorrectlyAssignedFitResult ( ) const
inline

returns estimated parameters for correctly assigned sequences. Fit should be used before.

DoubleReal getGauss ( DoubleReal  x,
const GaussFitter::GaussFitResult &  params 
)
inline

computes the gaussian density at position x with parameters params.

References GaussFitter::GaussFitResult::A, GaussFitter::GaussFitResult::sigma, and GaussFitter::GaussFitResult::x0.
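Since the function references A, x0 and sigma, a plausible form of the density is an amplitude-scaled Gaussian, sketched below. The struct `GaussParamsSketch` is a hypothetical stand-in for GaussFitter::GaussFitResult, and the formula is an assumption based on those three fields, not a copy of the inline implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for GaussFitter::GaussFitResult.
struct GaussParamsSketch
{
  double A;     // amplitude (height of the peak)
  double x0;    // position of the peak
  double sigma; // width
};

// Amplitude-scaled Gaussian: A * exp(-(x - x0)^2 / (2 * sigma^2)).
inline double gaussDensitySketch(double x, const GaussParamsSketch& p)
{
  assert(p.sigma > 0.0);
  const double d = x - p.x0;
  return p.A * std::exp(-(d * d) / (2.0 * p.sigma * p.sigma));
}
```

With this form the value at x0 equals A, and one sigma away from the peak it drops to A * exp(-0.5).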

const String getGaussGnuplotFormula ( const GaussFitter::GaussFitResult &  params) const

returns the gnuplot formula of the fitted gauss distribution.

DoubleReal getGumbel ( DoubleReal  x,
const GaussFitter::GaussFitResult &  params 
)
inline

computes the gumbel density at position x with parameters params.

References GaussFitter::GaussFitResult::sigma, and GaussFitter::GaussFitResult::x0.
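The function references only x0 and sigma, which this page identifies with the Gumbel parameters alpha and beta. A sketch using the standard Gumbel density is given below; whether the class uses exactly this orientation of the distribution is an assumption, and `GumbelParamsSketch` is a hypothetical stand-in for GaussFitter::GaussFitResult.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the two GaussFitResult fields used here.
struct GumbelParamsSketch
{
  double x0;    // alpha
  double sigma; // scale parameter beta
};

// Standard Gumbel density: (1 / beta) * z * exp(-z), with z = exp(-(x - alpha) / beta).
inline double gumbelDensitySketch(double x, const GumbelParamsSketch& p)
{
  assert(p.sigma > 0.0);
  const double z = std::exp((p.x0 - x) / p.sigma);
  return z * std::exp(-z) / p.sigma;
}
```

At x = alpha the exponent term z equals 1, so the density there is exp(-1) / beta.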

const String getGumbelGnuplotFormula ( const GaussFitter::GaussFitResult &  params) const

returns the gnuplot formula of the fitted Gumbel distribution. Only x0 and sigma are used, as the location parameter alpha and the scale parameter beta, respectively.

GaussFitter::GaussFitResult getIncorrectlyAssignedFitResult ( ) const
inline

returns estimated parameters for incorrectly assigned sequences. Fit should be used before.

DoubleReal getNegativePrior ( ) const
inline

returns the estimated negative prior probability.

DoubleReal getSmallestScore ( )
inline

returns the smallest score used in the last fit

TextFile* InitPlots ( std::vector< double > &  x_scores)

initializes the plots

DoubleReal one_minus_sum_post ( std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density 
)

sums (1 - posterior probabilities)

PosteriorErrorProbabilityModel& operator= ( const PosteriorErrorProbabilityModel &  rhs)
private

assignment operator (not implemented)

void plotTargetDecoyEstimation ( std::vector< double > &  target,
std::vector< double > &  decoy 
)

plots the estimated distribution against target and decoy hits

DoubleReal sum_neg_sigma ( std::vector< double > &  x_scores,
std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density,
DoubleReal  positive_mean 
)

helper function for the EM algorithm (for fitting)

DoubleReal sum_neg_x0 ( std::vector< double > &  x_scores,
std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density 
)

helper function for the EM algorithm (for fitting)

DoubleReal sum_pos_sigma ( std::vector< double > &  x_scores,
std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density,
DoubleReal  positive_mean 
)

helper function for the EM algorithm (for fitting)

DoubleReal sum_pos_x0 ( std::vector< double > &  x_scores,
std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density 
)

helper function for the EM algorithm (for fitting)

DoubleReal sum_post ( std::vector< DoubleReal > &  incorrect_density,
std::vector< DoubleReal > &  correct_density 
)

sums posterior probabilities

Member Data Documentation

DoubleReal(PosteriorErrorProbabilityModel::* calc_correct_)(DoubleReal x, const GaussFitter::GaussFitResult &params)
private

points to getGauss

DoubleReal(PosteriorErrorProbabilityModel::* calc_incorrect_)(DoubleReal x, const GaussFitter::GaussFitResult &params)
private

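points either to getGumbel or getGauss, depending on whether one uses the Gumbel or the Gaussian distribution for incorrectly assigned sequences.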

GaussFitter::GaussFitResult correctly_assigned_fit_param_
private

stores gauss parameters

const String(PosteriorErrorProbabilityModel::* getNegativeGnuplotFormula_)(const GaussFitter::GaussFitResult &params) const
private

points either to getGumbelGnuplotFormula or getGaussGnuplotFormula, depending on whether one uses the Gumbel or the Gaussian distribution for incorrectly assigned sequences.

const String(PosteriorErrorProbabilityModel::* getPositiveGnuplotFormula_)(const GaussFitter::GaussFitResult &params) const
private

points to getGaussGnuplotFormula

GaussFitter::GaussFitResult incorrectly_assigned_fit_param_
private

stores parameters for incorrectly assigned sequences. If the Gumbel fit was used, A can be ignored. Furthermore, in this case, x0 and sigma are the location parameter alpha and the scale parameter beta, respectively.

DoubleReal max_correctly_
private

peak of the gauss distribution (correctly assigned sequences)

DoubleReal max_incorrectly_
private

peak of the incorrectly assigned sequences distribution

DoubleReal negative_prior_
private

stores final prior probability for negative peptides

DoubleReal smallest_score_
private

smallest score which was used for fitting the model


OpenMS / TOPP release 1.11.1 Documentation generated on Thu Nov 14 2013 11:19:38 using doxygen 1.8.5