public class CheckAssociator extends CheckScheme implements RevisionHandler
java weka.associations.CheckAssociator -W associator_name -- associator_options
CheckAssociator reports on the following:
weka.associations.AbstractAssociatorTest uses this class to test all the associators. Any changes here have to be checked in that abstract test class, too.
Valid options are:
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 20).
-nominal <num> The number of nominal attributes (default 2).
-nominal-values <num> The number of values for nominal attributes (default 1).
-numeric <num> The number of numeric attributes (default 1).
-string <num> The number of string attributes (default 1).
-date <num> The number of date attributes (default 1).
-relational <num> The number of relational attributes (default 1).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-W Full name of the associator analysed. eg: weka.associations.Apriori (default weka.associations.Apriori)
Options specific to associator weka.associations.Apriori:
-N <required number of rules output> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric type by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum confidence of a rule. (default = 0.9)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-S <significance level> If used, rules are tested for significance at the given level. Slower. (default = no significance testing)
-I If set the itemsets found are also output. (default = no)
-R Remove columns that contain all missing values (default = no)
-V Report progress iteratively. (default = no)
-A If set class association rules are mined. (default = no)
-c <the class index> The class index. (default = last)
Options after -- are passed to the designated associator.
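The `--` convention can be pictured with a small, self-contained sketch. It is not Weka's own option parser; the class and method names (`OptionSplitSketch`, `splitAtDoubleDash`) are hypothetical and only illustrate how arguments before `--` would stay with the checker while the rest go to the associator.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: splits a command line at the first "--".
class OptionSplitSketch {

    /** Returns {checkerOptions, associatorOptions}; the "--" marker itself is dropped. */
    static String[][] splitAtDoubleDash(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(args, 0, i),
                    Arrays.copyOfRange(args, i + 1, args.length)
                };
            }
        }
        // No "--" present: everything belongs to the checker.
        return new String[][] { args.clone(), new String[0] };
    }

    public static void main(String[] argv) {
        String[] args = {
            "-W", "weka.associations.Apriori", "-N", "50", "--", "-N", "5", "-C", "0.8"
        };
        String[][] parts = splitAtDoubleDash(args);
        System.out.println("checker:    " + String.join(" ", parts[0]));
        System.out.println("associator: " + String.join(" ", parts[1]));
    }
}
```

The split matters because several flags (`-N`, `-D`, `-S`) appear in both option lists with different meanings, so their position relative to `--` decides which component reads them.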
See Also: TestInstances
Nested classes/interfaces inherited from class CheckScheme: CheckScheme.PostProcessor
Modifier and Type | Field | Description |
---|---|---|
protected Associator | m_Associator | The associator to be examined. |
static int | NO_CLASS | A "dummy" class type. |
Fields inherited from class CheckScheme: m_ClasspathProblems, m_NumDate, m_NumInstances, m_NumInstancesRelational, m_NumNominal, m_NumNumeric, m_NumRelational, m_NumString, m_PostProcessor, m_Words, m_WordSeparators
Constructor and Description |
---|
CheckAssociator() |
Modifier and Type | Method | Description |
---|---|---|
protected boolean[] | canHandleClassAsNthAttribute(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex) | Checks whether the scheme can handle class attributes as Nth attribute. |
protected boolean[] | canHandleMissing(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing, int missingLevel) | Checks basic missing value handling of the scheme. |
protected boolean[] | canHandleNClasses(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int numClasses) | Checks whether nominal schemes can handle more than two classes. |
protected boolean[] | canHandleZeroTraining(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) | Checks whether the scheme can handle zero training instances. |
protected boolean[] | canPredict(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) | Checks basic prediction of the scheme, for simple non-troublesome datasets. |
protected boolean[] | canTakeOptions() | Checks whether the scheme can take command line options. |
protected boolean[] | correctBuildInitialisation(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) | Checks whether the scheme correctly initialises models when buildAssociations is called. |
protected boolean[] | datasetIntegrity(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing) | Checks whether the scheme alters the training dataset during building. |
protected boolean[] | declaresSerialVersionUID() | Tests for a serialVersionUID. |
void | doTests() | Begin the tests, reporting results to System.out. |
Associator | getAssociator() | Get the associator being tested. |
String[] | getOptions() | Gets the current settings of the CheckAssociator. |
String | getRevision() | Returns the revision string. |
protected boolean[] | instanceWeights(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) | Checks whether the associator can handle instance weights. |
Enumeration | listOptions() | Returns an enumeration describing the available options. |
static void | main(String[] args) | Test method for this class. |
protected Instances | makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, boolean multiInstance) | Make a simple set of instances, which can later be modified for use in specific tests. |
protected Instances | makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, int classIndex, boolean multiInstance) | Make a simple set of instances with variable position of the class attribute, which can later be modified for use in specific tests. |
protected boolean[] | multiInstanceHandler() | Checks whether the scheme handles multi-instance data. |
protected void | printAttributeSummary(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType) | Print out a short summary string for the dataset characteristics. |
protected boolean[] | runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts) | Runs a test on the datasets with the given characteristics. |
protected boolean[] | runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts) | Runs a test on the datasets with the given characteristics. |
void | setAssociator(Associator newAssociator) | Set the associator to test. |
void | setOptions(String[] options) | Parses a given list of options. |
protected void | testsPerClassType(int classType, boolean weighted, boolean multiInstance) | Run a battery of tests for a given class attribute type. |
protected void | testsWithoutClass(boolean weighted, boolean multiInstance) | Run a battery of tests without a class. |
protected boolean[] | weightedInstancesHandler() | Checks whether the scheme says it can handle instance weights. |
Methods inherited from class CheckScheme: addMissing, arrayToList, attributeTypeToString, compareDatasets, getNumDate, getNumInstances, getNumInstancesRelational, getNumNominal, getNumNumeric, getNumRelational, getNumString, getPostProcessor, getWords, getWordSeparators, hasClasspathProblems, listToArray, process, setNumDate, setNumInstances, setNumInstancesRelational, setNumNominal, setNumNumeric, setNumRelational, setNumString, setPostProcessor, setWords, setWordSeparators
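The `datasetIntegrity` check is described as testing whether the scheme alters the training dataset during building. As a rough illustration of that idea (a sketch under assumptions, not the Weka implementation), the snippet below snapshots a plain `double[][]` dataset, runs an arbitrary build step, and compares the data afterwards. The class name `DatasetIntegritySketch`, the method `leavesDataUnchanged`, and the use of `double[][]` in place of `Instances` are all hypothetical.

```java
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical sketch: detects whether a "build" step modified its training data in place.
class DatasetIntegritySketch {

    /** Returns true if the build step left the training data untouched. */
    static boolean leavesDataUnchanged(double[][] train, Consumer<double[][]> build) {
        // Deep copy taken before the build, so in-place edits remain detectable.
        double[][] snapshot = new double[train.length][];
        for (int i = 0; i < train.length; i++) {
            snapshot[i] = train[i].clone();
        }
        build.accept(train);
        // deepEquals compares row by row; NaN cells compare equal to NaN cells.
        return Arrays.deepEquals(snapshot, train);
    }

    public static void main(String[] args) {
        double[][] data = { {1.0, 2.0}, {3.0, Double.NaN} };
        System.out.println(leavesDataUnchanged(data, d -> { }));            // read-only step
        System.out.println(leavesDataUnchanged(data, d -> d[0][0] = 99.0)); // mutating step
    }
}
```

Taking the copy before the build rather than regenerating the data afterwards keeps the comparison independent of how the dataset was produced.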
public static final int NO_CLASS
protected Associator m_Associator
public Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class CheckScheme
public void setOptions(String[] options) throws Exception
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 20).
-nominal <num> The number of nominal attributes (default 2).
-nominal-values <num> The number of values for nominal attributes (default 1).
-numeric <num> The number of numeric attributes (default 1).
-string <num> The number of string attributes (default 1).
-date <num> The number of date attributes (default 1).
-relational <num> The number of relational attributes (default 1).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-W Full name of the associator analysed. eg: weka.associations.Apriori (default weka.associations.Apriori)
Options specific to associator weka.associations.Apriori:
-N <required number of rules output> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric type by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum confidence of a rule. (default = 0.9)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-S <significance level> If used, rules are tested for significance at the given level. Slower. (default = no significance testing)
-I If set the itemsets found are also output. (default = no)
-R Remove columns that contain all missing values (default = no)
-V Report progress iteratively. (default = no)
-A If set class association rules are mined. (default = no)
-c <the class index> The class index. (default = last)
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class CheckScheme
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class CheckScheme
public void doTests()
Overrides: doTests in class CheckScheme
public void setAssociator(Associator newAssociator)
Parameters:
newAssociator - the Associator to use
public Associator getAssociator()
protected void testsPerClassType(int classType, boolean weighted, boolean multiInstance)
Parameters:
classType - the class type (NUMERIC, NOMINAL, etc.)
weighted - true if the associator says it handles weights
multiInstance - true if the associator is a multi-instance associator
protected void testsWithoutClass(boolean weighted, boolean multiInstance)
Parameters:
weighted - true if the associator says it handles weights
multiInstance - true if the associator is a multi-instance associator
protected boolean[] canTakeOptions()
protected boolean[] weightedInstancesHandler()
protected boolean[] multiInstanceHandler()
protected boolean[] declaresSerialVersionUID()
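To make the `declaresSerialVersionUID` check concrete, here is a minimal reflection-based sketch of what "testing for a serialVersionUID" can look like in plain Java. It is an illustration only; Weka's own check may be implemented differently, and the class names used here (`SerialVersionUidSketch`, `WithUid`, `WithoutUid`) are hypothetical.

```java
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical sketch: checks whether a class itself declares a serialVersionUID constant.
class SerialVersionUidSketch {

    static boolean declaresSerialVersionUID(Class<?> cls) {
        try {
            // getDeclaredField only looks at the class itself, not at superclasses.
            Field field = cls.getDeclaredField("serialVersionUID");
            int mods = field.getModifiers();
            return Modifier.isStatic(mods)
                && Modifier.isFinal(mods)
                && field.getType() == long.class;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    // Two small example classes used to exercise the check.
    static class WithUid implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static class WithoutUid implements Serializable {
    }

    public static void main(String[] args) {
        System.out.println(declaresSerialVersionUID(WithUid.class));
        System.out.println(declaresSerialVersionUID(WithoutUid.class));
    }
}
```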
protected boolean[] canPredict(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NOMINAL, NUMERIC, etc.)
protected boolean[] canHandleNClasses(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int numClasses)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
numClasses - the number of classes to test
protected boolean[] canHandleClassAsNthAttribute(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the index of the class attribute (0-based, -1 means last attribute)
See Also: TestInstances.CLASS_IS_LAST
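The `classIndex` convention (0-based, with -1 meaning the last attribute) can be sketched as a small resolver. This is an illustrative helper, not Weka code: the class `ClassIndexSketch`, the method `resolveClassIndex`, and the local constant `CLASS_IS_LAST` (assumed here to be -1, following the parameter description) are hypothetical.

```java
// Hypothetical sketch: maps a classIndex argument to a concrete 0-based attribute position.
class ClassIndexSketch {

    // Assumed to mirror the "-1 means last attribute" convention described above.
    static final int CLASS_IS_LAST = -1;

    static int resolveClassIndex(int classIndex, int numAttributes) {
        if (classIndex == CLASS_IS_LAST) {
            return numAttributes - 1;
        }
        if (classIndex < 0 || classIndex >= numAttributes) {
            throw new IllegalArgumentException(
                "classIndex " + classIndex + " out of range for " + numAttributes + " attributes");
        }
        return classIndex;
    }

    public static void main(String[] args) {
        System.out.println(resolveClassIndex(CLASS_IS_LAST, 5)); // last of five attributes
        System.out.println(resolveClassIndex(0, 5));             // first attribute
    }
}
```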
protected boolean[] canHandleZeroTraining(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] correctBuildInitialisation(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] canHandleMissing(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing, int missingLevel)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
predictorMissing - true if the missing values may be in the predictors
classMissing - true if the missing values may be in the class
missingLevel - the percentage of missing values
protected boolean[] instanceWeights(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] datasetIntegrity(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, boolean predictorMissing, boolean classMissing)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
predictorMissing - true if we know the associator can handle (at least) moderate missing predictor values
classMissing - true if we know the associator can handle (at least) moderate missing class values
protected boolean[] runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
missingLevel - the percentage of missing values
predictorMissing - true if the missing values may be in the predictors
classMissing - true if the missing values may be in the class
numTrain - the number of instances in the training set
numClasses - the number of classes
accepts - the acceptable string in an exception
protected boolean[] runBasicTest(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType, int classIndex, int missingLevel, boolean predictorMissing, boolean classMissing, int numTrain, int numClasses, FastVector accepts)
Parameters:
nominalPredictor - if true use nominal predictor attributes
numericPredictor - if true use numeric predictor attributes
stringPredictor - if true use string predictor attributes
datePredictor - if true use date predictor attributes
relationalPredictor - if true use relational predictor attributes
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the attribute index of the class
missingLevel - the percentage of missing values
predictorMissing - true if the missing values may be in the predictors
classMissing - true if the missing values may be in the class
numTrain - the number of instances in the training set
numClasses - the number of classes
accepts - the acceptable string in an exception
protected Instances makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, boolean multiInstance) throws Exception
Parameters:
seed - the random number seed
numInstances - the number of instances to generate
numNominal - the number of nominal attributes
numNumeric - the number of numeric attributes
numString - the number of string attributes
numDate - the number of date attributes
numRelational - the number of relational attributes
numClasses - the number of classes (if nominal class)
classType - the class type (NUMERIC, NOMINAL, etc.)
multiInstance - whether the dataset should be a multi-instance dataset
Throws:
Exception - if the dataset couldn't be generated
See Also: CheckScheme.process(Instances)
protected Instances makeTestDataset(int seed, int numInstances, int numNominal, int numNumeric, int numString, int numDate, int numRelational, int numClasses, int classType, int classIndex, boolean multiInstance) throws Exception
Parameters:
seed - the random number seed
numInstances - the number of instances to generate
numNominal - the number of nominal attributes
numNumeric - the number of numeric attributes
numString - the number of string attributes
numDate - the number of date attributes
numRelational - the number of relational attributes
numClasses - the number of classes (if nominal class)
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the index of the class (0-based, -1 as last)
multiInstance - whether the dataset should be a multi-instance dataset
Throws:
Exception - if the dataset couldn't be generated
See Also: TestInstances.CLASS_IS_LAST, CheckScheme.process(Instances)
protected void printAttributeSummary(boolean nominalPredictor, boolean numericPredictor, boolean stringPredictor, boolean datePredictor, boolean relationalPredictor, boolean multiInstance, int classType)
Parameters:
nominalPredictor - true if nominal predictor attributes are present
numericPredictor - true if numeric predictor attributes are present
stringPredictor - true if string predictor attributes are present
datePredictor - true if date predictor attributes are present
relationalPredictor - true if relational predictor attributes are present
multiInstance - whether multi-instance is needed
classType - the class type (NUMERIC, NOMINAL, etc.)
public String getRevision()
Specified by: getRevision in interface RevisionHandler
public static void main(String[] args)
Parameters:
args - the command-line parameters

Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.