public class CheckClusterer extends CheckScheme
java weka.clusterers.CheckClusterer -W clusterer_name -- clusterer_options
CheckClusterer reports on the following:
weka.clusterers.AbstractClustererTest uses this class to test all the clusterers. Any changes made here must also be checked in that abstract test class.
Valid options are:
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 20).
-nominal <num> The number of nominal attributes (default 2).
-nominal-values <num> The number of values for nominal attributes (default 1).
-numeric <num> The number of numeric attributes (default 1).
-string <num> The number of string attributes (default 1).
-date <num> The number of date attributes (default 1).
-relational <num> The number of relational attributes (default 1).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-W Full name of the clusterer analyzed. eg: weka.clusterers.SimpleKMeans (default weka.clusterers.SimpleKMeans)
Options specific to clusterer weka.clusterers.SimpleKMeans:
-N <num> number of clusters. (default 2).
-V Display std. deviations for centroids.
-M Replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
Options after -- are passed to the designated clusterer.
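The role of the -- separator in the option list above can be illustrated with a minimal sketch in plain Java. The class OptionSplitSketch and its two methods are hypothetical names introduced for illustration only; this is not Weka's own option-parsing code. It shows why -N can mean "number of instances" before the separator and "number of clusters" after it.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Weka: splits a command line at the first "--". */
final class OptionSplitSketch {

    /** Returns the arguments before the first "--" (the checker's own options). */
    static String[] checkerOptions(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        return sep < 0 ? args.clone() : Arrays.copyOfRange(args, 0, sep);
    }

    /** Returns the arguments after the first "--" (handed on to the clusterer). */
    static String[] clustererOptions(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        return sep < 0 ? new String[0] : Arrays.copyOfRange(args, sep + 1, args.length);
    }

    public static void main(String[] unused) {
        String[] args = {
            "-W", "weka.clusterers.SimpleKMeans", "-N", "50", // checker: 50 instances
            "--", "-N", "3", "-S", "42"                       // clusterer: 3 clusters, seed 42
        };
        System.out.println("checker:   " + Arrays.toString(checkerOptions(args)));
        System.out.println("clusterer: " + Arrays.toString(clustererOptions(args)));
    }
}
```

If no -- is present, the sketch treats every argument as a checker option and passes nothing on.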
TestInstances
CheckScheme.PostProcessor
Modifier and Type | Field and Description |
---|---|
protected Clusterer | m_Clusterer: The clusterer to be examined |
m_ClasspathProblems, m_NumDate, m_NumInstances, m_NumInstancesRelational, m_NumNominal, m_NumNumeric, m_NumRelational, m_NumString, m_PostProcessor, m_Words, m_WordSeparators
Constructor and Description |
---|
CheckClusterer(): Default constructor |
Modifier and Type | Method and Description |
---|---|
protected void | addMissing(Instances data, int level, boolean predictorMissing): Add missing values to a dataset. |
protected boolean[] | canHandleMissing(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, boolean predictorMissing, int missingLevel): Checks basic missing value handling of the scheme. |
protected boolean[] | canHandleZeroTraining(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance): Checks whether the scheme can handle zero training instances. |
protected boolean[] | canPredict(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance): Checks basic prediction of the scheme, for simple non-troublesome datasets. |
protected boolean[] | canTakeOptions(): Checks whether the scheme can take command line options. |
protected boolean[] | correctBuildInitialisation(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance): Checks whether the scheme correctly initialises models when buildClusterer is called. |
protected boolean[] | datasetIntegrity(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, boolean predictorMissing): Checks whether the scheme alters the training dataset during training. |
protected boolean[] | declaresSerialVersionUID(): Tests for a serialVersionUID. |
void | doTests(): Begin the tests, reporting results to System.out. |
Clusterer | getClusterer(): Gets the clusterer being tested. |
String[] | getOptions(): Gets the current settings of the CheckClusterer. |
String | getRevision(): Returns the revision string. |
protected boolean[] | instanceWeights(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance): Checks whether the clusterer can handle instance weights. |
Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(String[] args): Test method for this class. |
protected Instances | makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, boolean multiInstance): Make a simple set of instances with variable position of the class attribute, which can later be modified for use in specific tests. |
protected boolean[] | multiInstanceHandler(): Checks whether the scheme handles multi-instance data. |
protected void | printAttributeSummary(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance): Print out a short summary string for the dataset characteristics. |
protected boolean[] | runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int missingLevel, boolean predictorMissing, int numTrain, FastVector accepts): Runs a test on the datasets with the given characteristics. |
protected void | runTests(boolean weighted, boolean multiInstance, boolean updateable): Run a battery of tests. |
void | setClusterer(Clusterer newClusterer): Set the clusterer for testing. |
void | setOptions(String[] options): Parses a given list of options. |
protected boolean[] | updateableClusterer(): Checks whether the scheme can build models incrementally. |
protected boolean[] | updatingEquality(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance): Checks whether an updateable scheme produces the same model when trained incrementally as when batch trained. |
protected boolean[] | weightedInstancesHandler(): Checks whether the scheme says it can handle instance weights. |
addMissing, arrayToList, attributeTypeToString, compareDatasets, getNumDate, getNumInstances, getNumInstancesRelational, getNumNominal, getNumNumeric, getNumRelational, getNumString, getPostProcessor, getWords, getWordSeparators, hasClasspathProblems, listToArray, process, setNumDate, setNumInstances, setNumInstancesRelational, setNumNominal, setNumNumeric, setNumRelational, setNumString, setPostProcessor, setWords, setWordSeparators
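The idea behind the datasetIntegrity check summarised above (does training alter the training data?) can be sketched in plain Java as copy, train, compare. Everything named here is an assumption made for illustration: DatasetIntegritySketch, its methods, the use of a double[][] as the dataset, and a Consumer standing in for a clusterer's training step. It is not how Weka implements the check.

```java
import java.util.Arrays;
import java.util.function.Consumer;

/** Hypothetical sketch, not Weka code: detects whether training mutates its input. */
final class DatasetIntegritySketch {

    /** Deep-copies a table of numeric values. */
    static double[][] deepCopy(double[][] data) {
        double[][] copy = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            copy[i] = data[i].clone();
        }
        return copy;
    }

    /** Returns true if the trainer left the dataset unchanged. */
    static boolean leavesDataUntouched(double[][] data, Consumer<double[][]> trainer) {
        double[][] before = deepCopy(data); // snapshot taken before training
        trainer.accept(data);               // training step under test
        return Arrays.deepEquals(before, data);
    }

    public static void main(String[] args) {
        double[][] data = {{1.0, 2.0}, {3.0, 4.0}};

        // A well-behaved trainer only reads the data.
        Consumer<double[][]> readOnly = d -> {
            double sum = 0;
            for (double[] row : d) {
                sum += row[0];
            }
        };
        // A misbehaving trainer overwrites a value in place.
        Consumer<double[][]> mutating = d -> d[0][0] = 0.0;

        System.out.println("read-only trainer ok: " + leavesDataUntouched(data, readOnly));
        System.out.println("mutating trainer ok:  " + leavesDataUntouched(data, mutating));
    }
}
```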
protected Clusterer m_Clusterer
public Enumeration listOptions()
listOptions
in interface OptionHandler
listOptions
in class CheckScheme
public void setOptions(String[] options) throws Exception
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 20).
-nominal <num> The number of nominal attributes (default 2).
-nominal-values <num> The number of values for nominal attributes (default 1).
-numeric <num> The number of numeric attributes (default 1).
-string <num> The number of string attributes (default 1).
-date <num> The number of date attributes (default 1).
-relational <num> The number of relational attributes (default 1).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-W Full name of the clusterer analyzed. eg: weka.clusterers.SimpleKMeans (default weka.clusterers.SimpleKMeans)
Options specific to clusterer weka.clusterers.SimpleKMeans:
-N <num> number of clusters. (default 2).
-V Display std. deviations for centroids.
-M Replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
setOptions
in interface OptionHandler
setOptions
in class CheckScheme
options
- the list of options as an array of strings
Exception
- if an option is not supported
public String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class CheckScheme
public void doTests()
doTests
in class CheckScheme
public void setClusterer(Clusterer newClusterer)
newClusterer
- the Clusterer to use.
public Clusterer getClusterer()
protected void runTests(boolean weighted, boolean multiInstance, boolean updateable)
weighted
- true if the clusterer says it handles weights
multiInstance
- true if the clusterer is a multi-instance clusterer
updateable
- true if the clusterer is updateable
protected boolean[] canTakeOptions()
protected boolean[] updateableClusterer()
protected boolean[] weightedInstancesHandler()
protected boolean[] multiInstanceHandler()
protected boolean[] declaresSerialVersionUID()
protected boolean[] canPredict(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
protected boolean[] canHandleZeroTraining(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
protected boolean[] correctBuildInitialisation(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
protected boolean[] canHandleMissing(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, boolean predictorMissing, int missingLevel)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
predictorMissing
- true if the missing values may be in the predictors
missingLevel
- the percentage of missing values
protected boolean[] instanceWeights(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
protected boolean[] datasetIntegrity(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, boolean predictorMissing)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
predictorMissing
- true if we know the clusterer can handle (at least) moderate missing predictor values
protected boolean[] updatingEquality(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
protected boolean[] runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int missingLevel, boolean predictorMissing, int numTrain, FastVector accepts)
nominalPredictor
- if true use nominal predictor attributes
numericPredictor
- if true use numeric predictor attributes
stringPredictor
- if true use string predictor attributes
datePredictor
- if true use date predictor attributes
relationalPredictor
- if true use relational predictor attributes
multiInstance
- whether multi-instance is needed
missingLevel
- the percentage of missing values
predictorMissing
- true if the missing values may be in the predictors
numTrain
- the number of instances in the training set
accepts
- the acceptable string in an exception
protected void addMissing(Instances data, int level, boolean predictorMissing)
data
- the instances to add missing values to
level
- the level of missing values to add (if positive, this is the probability that a value will be set to missing; if negative, all but one value will be set to missing (not yet implemented))
predictorMissing
- if true, predictor attributes will be modified
protected Instances makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, boolean multiInstance) throws Exception
seed
- the random number seed
numInstances
- the number of instances to generate
numNominal
- the number of nominal attributes
numNumeric
- the number of numeric attributes
numString
- the number of string attributes
numDate
- the number of date attributes
numRelational
- the number of relational attributes
multiInstance
- whether the dataset should be a multi-instance dataset
Exception
- if the dataset couldn't be generated
TestInstances.CLASS_IS_LAST
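The seed parameter of makeTestDataset and the level parameter of addMissing can be illustrated together with a small plain-Java sketch. All names here (MissingValueSketch, makeData, countMissing, the extra seed argument on addMissing) are hypothetical, the dataset is reduced to numeric values in a double[][], NaN stands in for a missing value, and reading a positive level as a percentage chance per value is an assumption based on the parameter descriptions above. It is not the Weka implementation.

```java
import java.util.Random;

/** Hypothetical sketch, not Weka code: seeded data generation plus missing-value injection. */
final class MissingValueSketch {

    /** Generates numInstances rows of numNumeric pseudo-random values from a fixed seed. */
    static double[][] makeData(int seed, int numInstances, int numNumeric) {
        Random random = new Random(seed);
        double[][] data = new double[numInstances][numNumeric];
        for (double[] row : data) {
            for (int j = 0; j < row.length; j++) {
                row[j] = random.nextDouble();
            }
        }
        return data;
    }

    /** Sets each value to NaN with a chance of level percent (assumed reading of "level"). */
    static void addMissing(double[][] data, int level, long seed) {
        Random random = new Random(seed);
        for (double[] row : data) {
            for (int j = 0; j < row.length; j++) {
                if (random.nextInt(100) < level) {
                    row[j] = Double.NaN;
                }
            }
        }
    }

    /** Counts the values currently marked as missing. */
    static int countMissing(double[][] data) {
        int count = 0;
        for (double[] row : data) {
            for (double value : row) {
                if (Double.isNaN(value)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[][] data = makeData(42, 20, 3); // same seed, same dataset on every run
        addMissing(data, 20, 1L);              // roughly one value in five becomes missing
        System.out.println("missing values: " + countMissing(data) + " of " + (20 * 3));
    }
}
```

Fixing the seed is what makes a failing check reproducible: the same arguments yield the same test data on every run.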
protected void printAttributeSummary(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance)
nominalPredictor
- true if nominal predictor attributes are present
numericPredictor
- true if numeric predictor attributes are present
stringPredictor
- true if string predictor attributes are present
datePredictor
- true if date predictor attributes are present
relationalPredictor
- true if relational predictor attributes are present
multiInstance
- whether multi-instance is needed
public String getRevision()
public static void main(String[] args)
args
- the command-line options
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.