CMAR » History » Version 9
François Rioult, 19/09/2012 19:17
1 | 2 | François Rioult | h1. CMAR - Classification with multiple association rules |
---|---|---|---|
2 | 1 | François Rioult | |
3 | 9 | François Rioult | This Ariane graph implements CMAR[1], a well-known supervised classification method based on association rules. |
4 | 4 | François Rioult | |
5 | !https://forge.greyc.fr/attachments/195/cmar1.png! |
||
6 | 5 | François Rioult | |
7 | The parameters are the following: |
||
8 | 1. the number of validations |
||
9 | 2. the absolute minsup threshold |
||
10 | 3. the number of accepted exceptions |
||
11 | 6 | François Rioult | |
12 | The input is the well-known _iris_ database, which has 3 classes (given in the first column). |
||
13 | |||
14 | 7 | François Rioult | Cross-validation is handled by the _for_ loop, which accumulates the result of each validation run. This is why the loop has a _touch_ input: it creates an empty file with the _echo -n_ command, and the results are accumulated in that file. |
15 | |||
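The accumulation pattern of the _for_ loop can be sketched in Python. This is only an analogy, not a description of how Ariane executes the graph: the names @cross_validate@, @run_fold@ and @out_path@ are hypothetical, and writing an empty string stands in for the _touch_ / _echo -n_ step.

```python
from pathlib import Path


def cross_validate(n_validations, run_fold, out_path):
    """Run `run_fold` once per validation and accumulate its text output.

    `run_fold` is a hypothetical callable taking the fold index and
    returning the text produced by one validation run.
    """
    out = Path(out_path)
    # Equivalent of the touched input: start from an empty accumulator file.
    out.write_text("")
    for fold in range(n_validations):
        # Append the result of this run to what has been accumulated so far.
        with out.open("a") as handle:
            handle.write(run_fold(fold))
    return out
```

Starting from an explicitly emptied file means the first iteration can append exactly like the others, with no special case.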
16 | 8 | François Rioult | * the resulting XML can be viewed in a browser |
17 | |||
18 | |||
19 | 7 | François Rioult | !https://forge.greyc.fr/attachments/196/cmar2.png! |
20 | 8 | François Rioult | |
21 | In the validation loop: |
||
22 | * the database is first split into a training set and a test set with the _repartition_ operator |
||
23 | * comments (lines starting with a hash sign, #) are removed |
||
24 | * the number of classes is computed with the purple macro, which cuts the first column, sorts it and counts the distinct values |
||
25 | * the classification decision is computed for each class in the _for_ loop and accumulated in the touched input |
||
26 | * the classification result is a set of columns, one per class, containing the vote value for each instance. It is pasted to the train set. |
||
27 | * a score operator computes various indicators and writes them to an XML file: recall, precision, score, confusion matrix, area under the ROC curve (not applicable when there are more than two classes) |
||
28 | * the XML is accumulated |
||
29 | |||
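The first steps of the validation loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of the Ariane operators: the function names are hypothetical, rows are assumed to be whitespace-separated with the class in the first column, and the round-robin split is only one possible strategy (the behaviour of the _repartition_ operator is not documented on this page).

```python
def strip_comments(lines):
    """Drop empty lines and comment lines (those starting with '#')."""
    return [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]


def count_classes(rows):
    """Cut the first column, sort it and count the distinct values."""
    first_column = sorted(row.split()[0] for row in rows)
    return len(set(first_column))


def split_train_test(rows, fold, n_folds):
    """Assumed round-robin split: every n_folds-th row goes to the test set."""
    test = [r for i, r in enumerate(rows) if i % n_folds == fold]
    train = [r for i, r in enumerate(rows) if i % n_folds != fold]
    return train, test
```

For example, @split_train_test(rows, 0, 10)@ keeps rows 0, 10, 20, and so on for testing and uses the rest for training.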
30 | !https://forge.greyc.fr/attachments/197/cmar3.png! |
||
31 | |||
32 | 1 | François Rioult | The model is computed as follows: |
33 | 9 | François Rioult | |
34 | * _mvminer_ computes the non-redundant rules (a free pattern, also called minimal or generator pattern, as antecedent, and its closure as consequent) |
||
35 | * the rules whose antecedent is not minimal for a given item are removed |
||
36 | * the rules concluding on the focused class are kept |
||
37 | * they are scored with a chi-square measure |
||
38 | * the cover of the rules over the training set is kept |
||
39 | * the rules vote for the class |
||
40 | * the result is accumulated. |
||
41 | |||
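The scoring and voting steps above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function names and the rule representation (a set of items plus a score) are hypothetical, the chi-square is the plain Pearson statistic of the 2x2 table "antecedent versus class", and the vote is a plain sum over the covering rules. The graph itself, like the CMAR paper, may combine the scores differently, for instance with a weighted chi-square.

```python
def chi_square(n, n_antecedent, n_class, n_both):
    """Pearson chi-square of a rule 'antecedent -> class' on n instances.

    n_antecedent: instances containing the antecedent
    n_class:      instances of the focused class
    n_both:       instances containing the antecedent and of that class
    """
    a = n_both
    b = n_antecedent - n_both
    c = n_class - n_both
    d = n - n_antecedent - n_class + n_both
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # degenerate table: the rule carries no information
    return n * (a * d - b * c) ** 2 / denominator


def vote(instance_items, rules):
    """Sum the scores of the rules whose antecedent is covered by the instance.

    `rules` is a list of (antecedent_items, score) pairs for one class.
    """
    items = set(instance_items)
    return sum(score for antecedent, score in rules if set(antecedent) <= items)
```

Calling @vote@ once per class yields one vote value per class for each instance, which matches the one-column-per-class layout described for the classification result.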
42 | h3. References |
||
43 | |||
44 | [1] Li, W., Han, J. and Pei, J. (2001). CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. Proc. ICDM 2001, pp. 369-376. |