François Rioult, 19/09/2012 19:17

h1. CMAR - Classification with multiple association rules

This Ariane graph implements CMAR[1], a well-known supervised classification method based on association rules.

!https://forge.greyc.fr/attachments/195/cmar1.png!

The parameters are the following:
1. the number of validations
2. the absolute minimum support (minsup) threshold
3. the number of accepted exceptions

The input is the well-known _iris_ database, which has 3 classes (given in the first column).

Cross-validation is handled by the _for_ loop, which accumulates the result of each validation. This is why there is a _touch_ input: it creates an empty file with the _echo -n_ command, and the results are accumulated into it.

* the resulting XML may be viewed in a browser
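
The accumulation pattern described above can be sketched in Python. This is only an illustration, not the Ariane graph itself: the function name _cross_validate_, the _evaluate_ callback, the 50/50 split ratio and the plain-text output are all assumptions made for the example (the graph accumulates XML).

```python
import random
from pathlib import Path


def cross_validate(rows, n_validations, evaluate, out_path, train_ratio=0.5, seed=0):
    """Run `evaluate` once per validation and append each result to out_path.

    `evaluate(train, test)` is a hypothetical callback returning one line of text.
    """
    out = Path(out_path)
    # Same role as the touch input: start from an empty file.
    out.write_text("")
    rng = random.Random(seed)
    for _ in range(n_validations):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)  # assumed split ratio
        train, test = shuffled[:cut], shuffled[cut:]
        # Accumulate: append the result of this validation to the file.
        with out.open("a") as fh:
            fh.write(evaluate(train, test) + "\n")
    return out.read_text().splitlines()
```

Starting from an empty file lets every iteration use the same append step, with no special case for the first validation.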

!https://forge.greyc.fr/attachments/196/cmar2.png!

In the validation loop:
* the database is first split into a training set and a testing set with the _repartition_ operator
* comments (lines starting with a hash sign) are removed
* the number of classes is computed with the purple macro, which cuts the first column, sorts it and counts the distinct values
* the classification decision is computed in the _for_ loop over the classes and is accumulated in the touched input
* the classification result is a set of columns, one per class, containing the vote value for each instance. It is pasted to the training set.
* a score operator computes various indicators in an XML file: recall, precision, score, confusion matrix, area under the ROC curve (ineffective when there are more than two classes)
* the XML is accumulated
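
Some of the steps listed above can be approximated in Python as follows. This is a minimal sketch under stated assumptions: columns are taken to be whitespace-separated with the class in the first column, and all function names are hypothetical. It does not reproduce the _repartition_ or score operators, only the kind of computation they perform.

```python
from collections import Counter


def strip_comments(lines):
    """Drop empty lines and comment lines (those starting with '#')."""
    return [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]


def count_classes(lines):
    """Cut the first column, sort it and count its distinct values."""
    first_column = sorted(ln.split()[0] for ln in lines)
    return len(set(first_column))


def predict(votes_per_class):
    """Pick the class with the highest vote for one instance."""
    return max(votes_per_class, key=votes_per_class.get)


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs."""
    return Counter(zip(actual, predicted))


def recall_precision(matrix, label):
    """Recall and precision of one class from the confusion matrix."""
    true_positive = matrix[(label, label)]
    actual_total = sum(n for (a, _), n in matrix.items() if a == label)
    predicted_total = sum(n for (_, p), n in matrix.items() if p == label)
    recall = true_positive / actual_total if actual_total else 0.0
    precision = true_positive / predicted_total if predicted_total else 0.0
    return recall, precision
```

Here _predict_ takes the per-class vote columns of one instance as a dictionary, which mirrors the "one column per class" layout of the classification result.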

!https://forge.greyc.fr/attachments/197/cmar3.png!

The model is computed as follows:

* _mvminer_ computes the non-redundant rules (a free pattern, also called a minimal or generator pattern, as antecedent; its closure as consequent)
* the rules whose antecedent is not minimal for a given item are removed
* the rules concluding on the focused class are kept
* they are measured by a chi-square
* the cover of the rules over the training set is kept
* the rules vote for the class
* the result is accumulated.
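
To make the chi-square and voting steps above more concrete, here is a small Python sketch. It assumes a standard 2x2 chi-square between "antecedent holds" and "instance belongs to the class", and it assumes that a class vote is the plain sum of the scores of the rules covering the instance. How the graph actually aggregates the votes is not specified here, so both the aggregation and the function names are illustrative only.

```python
def rule_counts(antecedent, target, training):
    """Count supports for the rule antecedent -> target.

    `training` is a list of (items, label) pairs, where items is a set.
    """
    n_antecedent = sum(1 for items, _ in training if antecedent <= items)
    n_class = sum(1 for _, label in training if label == target)
    n_both = sum(
        1 for items, label in training if antecedent <= items and label == target
    )
    return n_both, n_antecedent, n_class, len(training)


def chi_square(n_both, n_antecedent, n_class, n_total):
    """Chi-square of the 2x2 contingency table antecedent x class."""
    a = n_both
    b = n_antecedent - n_both
    c = n_class - n_both
    d = n_total - n_antecedent - n_class + n_both
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n_total * (a * d - b * c) ** 2 / denominator


def vote(instance_items, scored_rules):
    """Sum the scores of the rules whose antecedent covers the instance (assumed)."""
    return sum(
        score for antecedent, score in scored_rules if antecedent <= instance_items
    )
```

Calling _vote_ once per class, with the rules concluding on that class, yields one vote value per class for each instance.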

h3. References

[1] Li, W., Han, J. and Pei, J. (2001). CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. Proc. ICDM 2001, pp. 369-376.