public class RacedIncrementalLogitBoost extends RandomizableSingleClassifierEnhancer implements UpdateableClassifier, TechnicalInformationHandler
@inproceedings{Frank2002,
   author = {Eibe Frank and Geoffrey Holmes and Richard Kirkby and Mark Hall},
   booktitle = {Proceedings of the 5th International Conference on Discovery Science},
   pages = {153-164},
   publisher = {Springer},
   title = {Racing committees for large datasets},
   year = {2002}
}
Valid options are:
-C <num> Minimum size of chunks. (default 500)
-M <num> Maximum size of chunks. (default 2000)
-V <num> Size of validation set. (default 1000)
-P <pruning type> Committee pruning to perform. 0=none, 1=log likelihood (default)
-Q Use resampling for boosting.
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.DecisionStump)
Options specific to classifier weka.classifiers.trees.DecisionStump:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated learner.
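The flags listed above can be assembled into an option array programmatically. The following is a minimal sketch in plain Java; the class `RacedOptionsSketch` and its `buildOptions` helper are hypothetical names introduced for illustration and are not part of Weka. Only the flag names and the `--` separator come from the option list above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Weka. */
final class RacedOptionsSketch {

    /**
     * Assembles the documented flags into an option array.
     * Any baseLearnerOptions are placed after "--" so that they are
     * passed on to the designated base learner.
     */
    static String[] buildOptions(int minChunk, int maxChunk, int validationSize,
                                 int pruningType, boolean useResampling, int seed,
                                 String baseClassifier, String... baseLearnerOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-C"); opts.add(Integer.toString(minChunk));        // minimum chunk size
        opts.add("-M"); opts.add(Integer.toString(maxChunk));        // maximum chunk size
        opts.add("-V"); opts.add(Integer.toString(validationSize));  // validation set size
        opts.add("-P"); opts.add(Integer.toString(pruningType));     // 0=none, 1=log likelihood
        if (useResampling) {
            opts.add("-Q");                                          // flag without a value
        }
        opts.add("-S"); opts.add(Integer.toString(seed));            // random number seed
        opts.add("-W"); opts.add(baseClassifier);                    // full base classifier name
        if (baseLearnerOptions.length > 0) {
            opts.add("--");                                          // separator before learner options
            for (String o : baseLearnerOptions) {
                opts.add(o);
            }
        }
        return opts.toArray(new String[0]);
    }
}
```

An array built this way has the shape expected by the `setOptions(String[] options)` method documented on this page.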
Modifier and Type | Class | Description |
---|---|---|
protected class | RacedIncrementalLogitBoost.Committee | Class representing a committee of LogitBoosted models |
Modifier and Type | Field | Description |
---|---|---|
protected RacedIncrementalLogitBoost.Committee | m_bestCommittee | The current best committee |
protected Attribute | m_ClassAttribute | The actual class attribute (for getting class names) |
protected FastVector | m_committees | The committees |
protected Instances | m_currentSet | The instances currently in memory for training |
protected int | m_maxBatchSizeRequired | The maximum number of instances required for processing |
protected int | m_maxChunkSize | The maximum chunk size used for training |
protected int | m_minChunkSize | The minimum chunk size used for training |
protected int | m_NumClasses | The number of classes |
protected Instances | m_NumericClassData | Dummy dataset with a numeric class |
protected int | m_numInstancesConsumed | The number of instances consumed |
protected int | m_PruningType | The pruning type used |
protected Random | m_RandomInstance | The random number generator used |
protected boolean | m_UseResampling | Whether to use resampling |
protected int | m_validationChunkSize | The size of the validation set |
protected Instances | m_validationSet | The instances used for validation |
protected boolean | m_validationSetChanged | Whether the validation set has recently been changed |
protected ZeroR | m_zeroR | The default scheme used when committees aren't ready |
static int | PRUNETYPE_LOGLIKELIHOOD | log likelihood pruning |
static int | PRUNETYPE_NONE | no pruning |
static Tag[] | TAGS_PRUNETYPE | The pruning types |
protected static double | Z_MAX | A threshold for responses (Friedman suggests between 2 and 4) |
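`Z_MAX` is described above only as a threshold for responses. In the standard LogitBoost formulation, the working response for a class is z = (y - p) / (p (1 - p)), which grows without bound as p approaches 0 or 1, so it is commonly truncated to a fixed range. The sketch below illustrates that truncation under the assumption that `Z_MAX` plays this role here; the class and method names are hypothetical, and the threshold is passed as a parameter because this page does not state the constant's value.

```java
/** Hypothetical illustration of response clipping; not the Weka implementation. */
final class ResponseClippingSketch {

    /**
     * Computes the LogitBoost-style working response
     * z = (y - p) / (p * (1 - p)) and truncates it to [-zMax, zMax].
     *
     * @param y    1.0 if the instance belongs to the class, otherwise 0.0
     * @param p    current probability estimate, strictly between 0 and 1
     * @param zMax clipping threshold (values between 2 and 4 are suggested)
     */
    static double clippedResponse(double y, double p, double zMax) {
        double z = (y - p) / (p * (1.0 - p));
        // Truncate extreme responses so that single instances cannot dominate the fit.
        return Math.max(-zMax, Math.min(zMax, z));
    }
}
```

With a threshold of 4, a positive instance with p = 0.01 would have an unclipped response of 100, which the truncation reduces to 4.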
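Fields inherited from class RandomizableSingleClassifierEnhancer: m_Seed
Fields inherited from class SingleClassifierEnhancer: m_Classifier
Fields inherited from class Classifier: m_Debug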
Constructor | Description |
---|---|
RacedIncrementalLogitBoost() | Constructor. |
Modifier and Type | Method | Description |
---|---|---|
void | buildClassifier(Instances data) | Builds the classifier. |
protected String | defaultClassifierString() | String describing default classifier. |
double[] | distributionForInstance(Instance instance) | Computes class distribution of an instance using the best committee. |
int | getBestCommitteeChunkSize() | Get the best committee chunk size |
double | getBestCommitteeErrorEstimate() | Get the best committee's error on the validation data |
double | getBestCommitteeLLEstimate() | Get the best committee's log likelihood on the validation data |
int | getBestCommitteeSize() | Get the number of members in the best committee |
Capabilities | getCapabilities() | Returns default capabilities of the classifier. |
int | getMaxChunkSize() | Get the maximum chunk size |
int | getMinChunkSize() | Get the minimum chunk size |
String[] | getOptions() | Gets the current settings of the Classifier. |
SelectedTag | getPruningType() | Get the pruning type |
String | getRevision() | Returns the revision string. |
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
boolean | getUseResampling() | Get whether resampling is turned on |
int | getValidationChunkSize() | Get the validation chunk size |
String | globalInfo() | |
Enumeration | listOptions() | Returns an enumeration describing the available options |
static void | main(String[] argv) | Main method for this class. |
String | maxChunkSizeTipText() | |
String | minChunkSizeTipText() | |
String | pruningTypeTipText() | |
protected static double | RtoP(double[] Fs, int j) | Convert from function responses to probabilities |
void | setClassifier(Classifier newClassifier) | Set the base learner. |
void | setMaxChunkSize(int chunkSize) | Set the maximum chunk size |
void | setMinChunkSize(int chunkSize) | Set the minimum chunk size |
void | setOptions(String[] options) | Parses a given list of options. |
void | setPruningType(SelectedTag pruneType) | Set the pruning type |
void | setUseResampling(boolean r) | Set resampling mode |
void | setValidationChunkSize(int chunkSize) | Set the validation chunk size |
String | toString() | Returns description of the boosted classifier. |
void | updateClassifier(Instance instance) | Updates the classifier. |
String | useResamplingTipText() | |
String | validationChunkSizeTipText() | |
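The `RtoP(double[] Fs, int j)` entry in the method summary converts function responses to probabilities. A common way to do this for additive logistic models is a softmax over the responses. The sketch below assumes that interpretation; it is not taken from the Weka source, and the class name `RtoPSketch` is hypothetical. The failure case mirrors the documented "if can't normalize" condition, using an unchecked exception for brevity.

```java
/** Hypothetical softmax-style conversion; an assumption about what RtoP computes. */
final class RtoPSketch {

    /**
     * Returns the probability of class j given one response per class.
     *
     * @param fs responses from each function, one per class
     * @param j  index of the class of interest
     * @throws IllegalStateException if the responses cannot be normalized
     */
    static double rToP(double[] fs, int j) {
        // Subtract the largest response before exponentiating to avoid overflow.
        double maxF = Double.NEGATIVE_INFINITY;
        for (double f : fs) {
            maxF = Math.max(maxF, f);
        }
        double[] expF = new double[fs.length];
        double sum = 0.0;
        for (int i = 0; i < fs.length; i++) {
            expF[i] = Math.exp(fs[i] - maxF);
            sum += expF[i];
        }
        if (sum == 0.0 || Double.isNaN(sum)) {
            throw new IllegalStateException("Can't normalize");
        }
        return expF[j] / sum;
    }
}
```

Subtracting the maximum response leaves the result unchanged mathematically but keeps large responses, such as 1000, from overflowing `Math.exp`.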
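Methods inherited from class RandomizableSingleClassifierEnhancer: getSeed, seedTipText, setSeed
Methods inherited from class SingleClassifierEnhancer: classifierTipText, getClassifier, getClassifierSpec
Methods inherited from class Classifier: classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug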
public static final int PRUNETYPE_NONE
public static final int PRUNETYPE_LOGLIKELIHOOD
public static final Tag[] TAGS_PRUNETYPE
protected FastVector m_committees
protected int m_PruningType
protected boolean m_UseResampling
protected int m_NumClasses
protected static final double Z_MAX
protected Instances m_NumericClassData
protected Attribute m_ClassAttribute
protected int m_minChunkSize
protected int m_maxChunkSize
protected int m_validationChunkSize
protected int m_numInstancesConsumed
protected Instances m_validationSet
protected Instances m_currentSet
protected RacedIncrementalLogitBoost.Committee m_bestCommittee
protected ZeroR m_zeroR
protected boolean m_validationSetChanged
protected int m_maxBatchSizeRequired
protected Random m_RandomInstance
protected String defaultClassifierString()
Overrides: defaultClassifierString in class SingleClassifierEnhancer
public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class SingleClassifierEnhancer
See Also: Capabilities
public void buildClassifier(Instances data) throws Exception
Specified by: buildClassifier in class Classifier
Parameters: data - the instances to train the classifier with
Throws: Exception - if something goes wrong
public void updateClassifier(Instance instance) throws Exception
Specified by: updateClassifier in interface UpdateableClassifier
Parameters: instance - the next instance in the stream of training data
Throws: Exception - if something goes wrong
protected static double RtoP(double[] Fs, int j) throws Exception
Parameters: Fs - an array containing the responses from each function; j - the class value of interest
Throws: Exception - if can't normalize
public double[] distributionForInstance(Instance instance) throws Exception
Overrides: distributionForInstance in class Classifier
Parameters: instance - the instance to get the distribution for
Throws: Exception - if anything goes wrong
public Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableSingleClassifierEnhancer
public void setOptions(String[] options) throws Exception
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableSingleClassifierEnhancer
Parameters: options - the list of options as an array of strings
Throws: Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableSingleClassifierEnhancer
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public void setClassifier(Classifier newClassifier)
Overrides: setClassifier in class SingleClassifierEnhancer
Parameters: newClassifier - the classifier to use.
Throws: IllegalArgumentException - if base classifier cannot handle numeric class
public String minChunkSizeTipText()
public void setMinChunkSize(int chunkSize)
Parameters: chunkSize - the minimum chunk size
public int getMinChunkSize()
public String maxChunkSizeTipText()
public void setMaxChunkSize(int chunkSize)
Parameters: chunkSize - the maximum chunk size
public int getMaxChunkSize()
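The minimum and maximum chunk sizes bound the training chunk sizes used by the raced committees. This page does not say how intermediate sizes are chosen. The sketch below assumes one committee per chunk size, with sizes doubling from the minimum until the maximum is exceeded; that doubling rule is an assumption made for illustration, and `ChunkSizeSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a chunk-size schedule; the doubling rule is an assumption. */
final class ChunkSizeSketch {

    /** Lists chunk sizes minChunk, 2*minChunk, 4*minChunk, ... that do not exceed maxChunk. */
    static List<Integer> committeeChunkSizes(int minChunk, int maxChunk) {
        if (minChunk <= 0 || maxChunk < minChunk) {
            throw new IllegalArgumentException("require 0 < minChunk <= maxChunk");
        }
        List<Integer> sizes = new ArrayList<>();
        // Use long for the loop variable so that doubling cannot overflow int.
        for (long size = minChunk; size <= maxChunk; size *= 2) {
            sizes.add((int) size);
        }
        return sizes;
    }
}
```

Under this assumption, the default settings (`-C 500`, `-M 2000`) would yield three committees with chunk sizes 500, 1000 and 2000.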
public String validationChunkSizeTipText()
public void setValidationChunkSize(int chunkSize)
Parameters: chunkSize - the validation chunk size
public int getValidationChunkSize()
public String pruningTypeTipText()
public void setPruningType(SelectedTag pruneType)
Parameters: pruneType - the pruning type
public SelectedTag getPruningType()
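The pruning type selects between no pruning and log likelihood pruning, but this page does not describe the pruning rule itself. One plausible reading is that a newly added committee member is kept only if it improves the committee's log likelihood on the validation set. The sketch below illustrates that reading; the class, the method names, the probability floor and the strict-improvement rule are all assumptions rather than the documented behavior.

```java
/** Hypothetical illustration of log likelihood pruning; the rule shown is an assumption. */
final class LogLikelihoodPruningSketch {

    /** Floor applied to probabilities so that log(0) is never evaluated (assumed value). */
    static final double MIN_PROB = 1e-75;

    /**
     * Mean log likelihood of the true classes on a validation set.
     *
     * @param probs     probs[i][k] is the predicted probability of class k for instance i
     * @param trueClass trueClass[i] is the index of the actual class of instance i
     */
    static double meanLogLikelihood(double[][] probs, int[] trueClass) {
        if (probs.length == 0 || probs.length != trueClass.length) {
            throw new IllegalArgumentException("need matching, non-empty inputs");
        }
        double sum = 0.0;
        for (int i = 0; i < probs.length; i++) {
            sum += Math.log(Math.max(probs[i][trueClass[i]], MIN_PROB));
        }
        return sum / probs.length;
    }

    /** Keeps the newest member only if the validation log likelihood strictly improves. */
    static boolean keepNewMember(double llBefore, double llAfter) {
        return llAfter > llBefore;
    }
}
```

A caller would compute the validation log likelihood before and after adding a member, then discard the member when `keepNewMember` returns false.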
public String useResamplingTipText()
public void setUseResampling(boolean r)
Parameters: r - true if resampling should be done
public boolean getUseResampling()
public int getBestCommitteeChunkSize()
public int getBestCommitteeSize()
public double getBestCommitteeErrorEstimate()
public double getBestCommitteeLLEstimate()
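The four `getBestCommittee...` accessors report statistics of the current best committee, including its error on the validation data. A simple way to picture the race is to pick the committee with the lowest validation error. The sketch below shows that selection with a hypothetical `CommitteeStats` holder; the selection criterion and all names are assumptions, since this page does not state how the best committee is chosen.

```java
import java.util.List;

/** Hypothetical illustration of choosing a best committee; the criterion is an assumption. */
final class BestCommitteeSketch {

    /** Minimal stand-in for per-committee statistics. */
    static final class CommitteeStats {
        final int chunkSize;          // training chunk size of the committee
        final int size;               // number of members in the committee
        final double validationError; // error on the validation data

        CommitteeStats(int chunkSize, int size, double validationError) {
            this.chunkSize = chunkSize;
            this.size = size;
            this.validationError = validationError;
        }
    }

    /**
     * Returns the committee with the lowest validation error, or null if
     * no committee is available yet. Ties go to the earliest committee.
     */
    static CommitteeStats best(List<CommitteeStats> committees) {
        CommitteeStats best = null;
        for (CommitteeStats c : committees) {
            if (best == null || c.validationError < best.validationError) {
                best = c;
            }
        }
        return best;
    }
}
```

Returning null for an empty list corresponds to the situation in which no committee is ready, where the field summary notes that a default scheme (`m_zeroR`) is used instead.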
public String toString()
public String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Classifier
public static void main(String[] argv)
Parameters: argv - the command-line parameters
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.