CMAR - Classification with multiple association rules
This Ariane graph implements CMAR (Li, Han and Pei, 2001), a well-known supervised classification method based on association rules.
The parameters are the following:
1. the number of validations
2. the absolute minsup threshold
3. the number of accepted exceptions
The input is the well-known Iris database, which has 3 classes (in the first column).
The cross-validation is handled by the for loop, which accumulates the result of each round. That is why there is a touch input, which creates an empty file with the echo -n command.
- the XML may be viewed in a browser
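The accumulation pattern described above (start from an empty file, then append the output of each validation round) can be sketched in Python. This is only an illustration of the idea, not the Ariane graph itself: `run_validations` and `one_round` are hypothetical names, and the per-round output is a placeholder string.

```python
import os
import tempfile

def run_validations(n_validations, one_round):
    """Append the text returned by each round to a file that starts empty."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    os.close(fd)  # the file now exists and is empty, like the touch input
    for i in range(n_validations):
        # Open in append mode so earlier rounds are preserved.
        with open(path, "a", encoding="utf-8") as acc:
            acc.write(one_round(i))
    return path

# Hypothetical round: a real one would emit the score operator's XML.
path = run_validations(3, lambda i: f"<round id='{i}'/>\n")
with open(path, encoding="utf-8") as f:
    accumulated = f.read()
os.remove(path)
```

Starting from an empty file lets every round, including the first, use the same append step.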
In the validation loop:
- the database is first split into training and test sets with the repartition operator
- comments (lines starting with a hash sign, #) are removed
- the number of classes is computed with the purple macro, which cuts the first column, sorts it and counts the distinct values
- the classification decision is computed in the for loop on each class and is accumulated in the touched input
- the classification result is a set of columns, one per class, containing the vote value for each instance. It is pasted to the train set.
- a score operator computes various indicators in an XML file: recall, precision, score, confusion matrix, area under the ROC curve (not meaningful when there are more than two classes)
- the XML is accumulated
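The data preparation steps of the loop (split, comment removal, class counting) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the space separator, the 30% test fraction and the toy rows are all hypothetical, and the actual repartition operator may split differently.

```python
import random

def split_train_test(rows, test_fraction, seed):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def strip_comments(rows):
    """Drop lines that start with a hash sign."""
    return [row for row in rows if not row.startswith("#")]

def count_classes(rows, sep=" "):
    """Cut the first column and count its distinct values."""
    return len({row.split(sep)[0] for row in rows})

# Toy rows with the class label in the first column (assumed layout).
rows = [
    "# toy data",
    "setosa 5.1 3.5",
    "versicolor 7.0 3.2",
    "virginica 6.3 3.3",
    "setosa 4.9 3.0",
    "versicolor 6.4 3.2",
    "virginica 5.8 2.7",
]
train, test = split_train_test(rows, test_fraction=0.3, seed=0)
train, test = strip_comments(train), strip_comments(test)
n_classes = count_classes(strip_comments(rows))
```

Using a set for the count plays the role of the sort-then-count step: duplicates collapse before counting.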
The model is computed as follows:
- mvminer computes the non-redundant rules (a free pattern, also called a minimal pattern or generator, as antecedent, and its closure as consequent)
- the rules whose antecedent is not minimal for a given item are removed
- the rules concluding on the focused class are kept
- they are measured by a chi-square statistic
- the cover of the rules over the training set is kept
- the rules vote for the class
- the result is accumulated.
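The measuring and voting steps can be illustrated with a short Python sketch. It assumes the standard chi-square statistic of a 2x2 contingency table (antecedent present or absent versus focused class or other class), and it uses a plain sum of the scores of the covering rules as the vote. That sum is a simplification chosen for illustration: the CMAR paper uses a weighted chi-square, and the exact vote computed by the graph is not specified here. All names and numbers below are hypothetical.

```python
def chi_square(n_both, n_antecedent, n_class, n_total):
    """Chi-square of the 2x2 table for a rule 'antecedent -> class'."""
    a = n_both                                       # antecedent and class
    b = n_antecedent - n_both                        # antecedent, other class
    c = n_class - n_both                             # class, no antecedent
    d = n_total - n_antecedent - n_class + n_both    # neither
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n_total * (a * d - b * c) ** 2 / denom

def vote(instance_items, rules):
    """Sum the scores of the rules whose antecedent covers the instance."""
    return sum(score for antecedent, score in rules
               if antecedent <= instance_items)

# A rule covering 20 of 100 instances, 15 of them in a class of size 30.
score = chi_square(n_both=15, n_antecedent=20, n_class=30, n_total=100)

# Toy rules for one focused class: (antecedent, score) pairs.
rules = [
    (frozenset({"petal=short"}), 24.0),
    (frozenset({"petal=short", "sepal=wide"}), 5.0),
    (frozenset({"petal=long"}), 30.0),
]
instance_vote = vote({"petal=short", "sepal=narrow"}, rules)
```

Only the first toy rule covers the instance, so it alone contributes to the vote for this class.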
Li, W., Han, J. and Pei, J. (2001). CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. Proc. ICDM 2001, pp. 369-376.