public class Decorate extends RandomizableIteratedSingleClassifierEnhancer implements TechnicalInformationHandler
@inproceedings{Melville2003,
  author = {P. Melville and R. J. Mooney},
  booktitle = {Eighteenth International Joint Conference on Artificial Intelligence},
  pages = {505-510},
  title = {Constructing Diverse Classifier Ensembles Using Artificial Training Examples},
  year = {2003}
}
@article{Melville2004,
  author = {P. Melville and R. J. Mooney},
  journal = {Information Fusion: Special Issue on Diversity in Multiclassifier Systems},
  note = {submitted},
  title = {Creating Diversity in Ensembles Using Artificial Data},
  year = {2004}
}
Valid options are:
-E Desired size of ensemble. (default 15)
-R Factor that determines number of artificial examples to generate. Specified proportional to training set size. (default 1.0)
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 50)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.J48)
Options specific to classifier weka.classifiers.trees.J48:
-U Use unpruned tree.
-C <pruning confidence> Set confidence threshold for pruning. (default 0.25)
-M <minimum number of instances> Set minimum number of instances per leaf. (default 2)
-R Use reduced error pruning.
-N <number of folds> Set number of folds for reduced error pruning. One fold is used as pruning set. (default 3)
-B Use binary splits only.
-S Don't perform subtree raising.
-L Do not clean up after the tree has been built.
-A Laplace smoothing for predicted probabilities.
-Q <seed> Seed for random data shuffling (default 1).
Options after -- are passed to the designated classifier.
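To illustrate how the options above are laid out, with the Decorate flags first and the base-classifier flags after the `--` separator, the following sketch assembles such an option array in plain Java. The class `DecorateOptionsSketch` and its `buildOptions` helper are hypothetical and not part of Weka; only the flags themselves are taken from the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: builds an option array in the
// layout described above (Decorate flags, then "--", then base-classifier flags).
final class DecorateOptionsSketch {

    static String[] buildOptions(int ensembleSize, double artificialFactor,
                                 int seed, int iterations,
                                 String baseClassifier, String... baseOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-E"); opts.add(Integer.toString(ensembleSize));
        opts.add("-R"); opts.add(Double.toString(artificialFactor));
        opts.add("-S"); opts.add(Integer.toString(seed));
        opts.add("-I"); opts.add(Integer.toString(iterations));
        opts.add("-W"); opts.add(baseClassifier);
        // Everything after "--" is meant for the base classifier, not Decorate.
        if (baseOptions.length > 0) {
            opts.add("--");
            opts.addAll(Arrays.asList(baseOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(15, 1.0, 1, 50,
                "weka.classifiers.trees.J48", "-C", "0.25", "-M", "2");
        System.out.println(String.join(" ", options));
    }
}
```

An array built this way has the shape expected by the setOptions(String[] options) method documented on this page. Note that -R and -S mean different things before and after the separator.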
Modifier and Type | Field and Description |
---|---|
protected double | m_ArtSize: Amount of artificial/random instances to use, specified as a fraction of the training data size. |
protected Vector | m_AttributeStats: Attribute statistics, used for generating artificial examples. |
protected Vector | m_Committee: Vector of classifiers that make up the committee/ensemble. |
protected int | m_DesiredSize: The desired ensemble size. |
protected Random | m_Random: The random number generator. |
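Fields inherited from class RandomizableIteratedSingleClassifierEnhancer: m_Seed
Fields inherited from class IteratedSingleClassifierEnhancer: m_Classifiers, m_NumIterations
Fields inherited from class SingleClassifierEnhancer: m_Classifier
Fields inherited from class Classifier: m_Debug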
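Because m_ArtSize is expressed as a fraction of the training data size, the number of artificial instances follows from a single multiplication. The sketch below shows one way that conversion might look. It is an assumption, not the Weka source: the class `ArtificialSizeSketch`, the method `artificialCount`, the truncation toward zero, and the floor of one instance are all illustrative choices.

```java
// Hypothetical sketch, not Weka code: converts the artificial-size factor
// into an absolute number of artificial instances.
final class ArtificialSizeSketch {

    static int artificialCount(double artSizeFactor, int numTrainingInstances) {
        if (numTrainingInstances < 0) {
            throw new IllegalArgumentException("training set size must be non-negative");
        }
        // Assumption: the sign of the factor is ignored and the product is truncated.
        int count = (int) (Math.abs(artSizeFactor) * numTrainingInstances);
        // Assumption: at least one artificial instance is always generated.
        return Math.max(count, 1);
    }

    public static void main(String[] args) {
        System.out.println(artificialCount(1.0, 150)); // prints 150
        System.out.println(artificialCount(0.5, 75));  // prints 37
    }
}
```

With the default factor of 1.0, the artificial set is as large as the training set itself.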
Constructor and Description |
---|
Decorate(): Constructor. |
Modifier and Type | Method and Description |
---|---|
protected void | addInstances(Instances data, Instances newData): Add new instances to the given set of instances. |
String | artificialSizeTipText(): Returns the tip text for this property |
void | buildClassifier(Instances data): Build Decorate classifier |
protected double | computeError(Instances data): Computes the error in classification on the given data. |
protected void | computeStats(Instances data): Compute and store statistics required for generating artificial data. |
protected String | defaultClassifierString(): String describing default classifier. |
String | desiredSizeTipText(): Returns the tip text for this property |
double[] | distributionForInstance(Instance instance): Calculates the class membership probabilities for the given test instance. |
protected Instances | generateArtificialData(int artSize, Instances data): Generate artificial training examples. |
double | getArtificialSize(): Factor that determines number of artificial examples to generate. |
Capabilities | getCapabilities(): Returns default capabilities of the classifier. |
int | getDesiredSize(): Gets the desired size of the committee. |
String[] | getOptions(): Gets the current settings of the Classifier. |
String | getRevision(): Returns the revision string. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
String | globalInfo(): Returns a string describing classifier |
protected int | inverseLabel(double[] probs): Select class label such that the probability of selection is inversely proportional to the ensemble's predictions. |
protected void | labelData(Instances artData): Labels the artificially generated data. |
Enumeration | listOptions(): Returns an enumeration describing the available options |
static void | main(String[] argv): Main method for testing this class. |
String | numIterationsTipText(): Returns the tip text for this property |
protected void | removeInstances(Instances data, int numRemove): Removes a specified number of instances from the given set of instances. |
protected int | selectIndexProbabilistically(double[] cdf): Given cumulative probabilities select a nominal attribute value index |
void | setArtificialSize(double newArtSize): Sets factor that determines number of artificial examples to generate. |
void | setDesiredSize(int newDesiredSize): Sets the desired size of the committee. |
void | setOptions(String[] options): Parses a given list of options. |
String | toString(): Returns description of the Decorate classifier. |
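Methods inherited from class RandomizableIteratedSingleClassifierEnhancer: getSeed, seedTipText, setSeed
Methods inherited from class IteratedSingleClassifierEnhancer: getNumIterations, setNumIterations
Methods inherited from class SingleClassifierEnhancer: classifierTipText, getClassifier, getClassifierSpec, setClassifier
Methods inherited from class Classifier: classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug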
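The inverseLabel and selectIndexProbabilistically entries in the method summary describe drawing a label with probability inversely proportional to the ensemble's predicted class probabilities, using an array of cumulative probabilities. A minimal, self-contained sketch of that idea follows. It is not the Weka source: the class `InverseLabelSketch`, the static methods that take an explicit Random, and the epsilon floor used for zero probabilities are assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical sketch, not Weka code: inverse-probability label selection.
final class InverseLabelSketch {

    // Assumption: floor applied so a predicted probability of 0 cannot divide by zero.
    private static final double EPSILON = 1e-9;

    /** Returns weights proportional to 1/p, normalized to sum to 1. */
    static double[] inverseDistribution(double[] probs) {
        double[] inv = new double[probs.length];
        double sum = 0.0;
        for (int i = 0; i < probs.length; i++) {
            inv[i] = 1.0 / Math.max(probs[i], EPSILON);
            sum += inv[i];
        }
        for (int i = 0; i < inv.length; i++) {
            inv[i] /= sum;
        }
        return inv;
    }

    /** Turns a probability distribution into cumulative probabilities. */
    static double[] cumulative(double[] dist) {
        double[] cdf = new double[dist.length];
        double running = 0.0;
        for (int i = 0; i < dist.length; i++) {
            running += dist[i];
            cdf[i] = running;
        }
        return cdf;
    }

    /** Picks the first index whose cumulative probability reaches a uniform draw. */
    static int selectIndexProbabilistically(double[] cdf, Random random) {
        double draw = random.nextDouble();
        for (int i = 0; i < cdf.length; i++) {
            if (draw <= cdf[i]) {
                return i;
            }
        }
        return cdf.length - 1; // guards against rounding error in the last entry
    }

    /** Draws a label that the ensemble considers unlikely more often than a likely one. */
    static int inverseLabel(double[] probs, Random random) {
        return selectIndexProbabilistically(cumulative(inverseDistribution(probs)), random);
    }

    public static void main(String[] args) {
        Random random = new Random(1);
        double[] ensemblePrediction = {0.8, 0.2};
        int[] counts = new int[ensemblePrediction.length];
        for (int i = 0; i < 10000; i++) {
            counts[inverseLabel(ensemblePrediction, random)]++;
        }
        // The second class (predicted 0.2) should be drawn roughly four times as often.
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

For a prediction of {0.8, 0.2} the inverse weights are 1.25 and 5, which normalize to {0.2, 0.8}, so the label the ensemble favors least is the one chosen most often.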
protected Vector m_Committee
protected int m_DesiredSize
protected double m_ArtSize
protected Random m_Random
protected Vector m_AttributeStats
protected String defaultClassifierString()
Overrides: defaultClassifierString in class SingleClassifierEnhancer
public Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableIteratedSingleClassifierEnhancer
public void setOptions(String[] options) throws Exception
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableIteratedSingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableIteratedSingleClassifierEnhancer
public String desiredSizeTipText()
public String numIterationsTipText()
Overrides: numIterationsTipText in class IteratedSingleClassifierEnhancer
public String artificialSizeTipText()
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public double getArtificialSize()
public void setArtificialSize(double newArtSize)
Parameters:
newArtSize - factor that determines number of artificial examples to generate
public int getDesiredSize()
public void setDesiredSize(int newDesiredSize)
Parameters:
newDesiredSize - the desired size of the committee
public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class SingleClassifierEnhancer
See Also: Capabilities
public void buildClassifier(Instances data) throws Exception
Overrides: buildClassifier in class IteratedSingleClassifierEnhancer
Parameters:
data - the training data to be used for generating the classifier
Throws:
Exception - if the classifier could not be built successfully
protected void computeStats(Instances data) throws Exception
Parameters:
data - training instances
Throws:
Exception - if statistics could not be calculated successfully
protected Instances generateArtificialData(int artSize, Instances data)
Parameters:
artSize - size of examples set to create
data - training data
protected void labelData(Instances artData) throws Exception
Parameters:
artData - the artificially generated instances
Throws:
Exception - if instances cannot be labeled successfully
protected int inverseLabel(double[] probs) throws Exception
Parameters:
probs - class membership probabilities of instance
Throws:
Exception - if instances cannot be labeled successfully
protected int selectIndexProbabilistically(double[] cdf)
Parameters:
cdf - array of cumulative probabilities
protected void removeInstances(Instances data, int numRemove)
Parameters:
data - given instances
numRemove - number of instances to delete from the given instances
protected void addInstances(Instances data, Instances newData)
Parameters:
data - given instances
newData - set of instances to add to given instances
protected double computeError(Instances data) throws Exception
Parameters:
data - the instances to be classified
Throws:
Exception - if error can not be computed successfully
public double[] distributionForInstance(Instance instance) throws Exception
Overrides: distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Throws:
Exception - if distribution can't be computed successfully
public String toString()
public String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Classifier
public static void main(String[] argv)
Parameters:
argv - the options
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.