CMAR » History » Version 9
  François Rioult, 19/09/2012 19:17 
  
| 1 | 2 | François Rioult | h1. CMAR - Classification with multiple association rules | 
|---|---|---|---|
| 2 | 1 | François Rioult | |
| 3 | 9 | François Rioult | This Ariane graph implements CMAR[1], a well-known supervised classification method based on association rules. | 
| 4 | 4 | François Rioult | |
| 5 | !https://forge.greyc.fr/attachments/195/cmar1.png! | ||
| 6 | 5 | François Rioult | |
| 7 | The parameters are the following: | ||
| 8 | 1. the number of validations | ||
| 9 | 2. the absolute minsup threshold | ||
| 10 | 3. the number of accepted exceptions | ||
| 11 | 6 | François Rioult | |
| 12 | The input is the well-known _iris_ database, which has 3 classes (given in the first column). | ||
| 13 | |||
| 14 | 7 | François Rioult | Cross-validation is handled by the _for_ loop, which accumulates the result of each iteration. This is why there is a _touch_ input: it creates an empty file with the _echo -n_ command, and the results are accumulated in that file. | 
| 15 | |||
| 16 | 8 | François Rioult | * the resulting XML can be viewed in a browser | 
| 17 | |||
| 18 | |||
| 19 | 7 | François Rioult | !https://forge.greyc.fr/attachments/196/cmar2.png! | 
| 20 | 8 | François Rioult | |
| 21 | In the validation loop: | ||
| 22 | * the database is first split into a training set and a test set with the _repartition_ operator | ||
| 23 | * comments (lines starting with #) are removed | ||
| 24 | * the number of classes is computed with the purple macro, which cuts the first column, sorts it and counts the distinct values | ||
| 25 | * the classification decision is computed in the _for_ loop over the classes and is accumulated in the touched input | ||
| 26 | * the classification result is a set of columns, one per class, containing the vote value for each instance. It is pasted to the train set. | ||
| 27 | * a score operator computes various indicators in an XML file: recall, precision, score, confusion matrix, area under the ROC curve (not meaningful when there are more than two classes) | ||
| 28 | * the XML is accumulated | ||
| 29 | |||
| 30 | !https://forge.greyc.fr/attachments/197/cmar3.png! | ||
| 31 | |||
| 32 | 1 | François Rioult | The model is computed as follows: | 
| 33 | 9 | François Rioult | |
| 34 | * _mvminer_ computes the non-redundant rules (a free pattern, also called a minimal or generator pattern, as antecedent; its closure as consequent) | ||
| 35 | * the rules whose antecedent is not minimal for a given item are removed | ||
| 36 | * only the rules concluding on the class under consideration are kept | ||
| 37 | * they are scored with a chi-square measure | ||
| 38 | * the cover of the rules over the training set is kept | ||
| 39 | * the rules vote for the class | ||
| 40 | * the result is accumulated. | ||
| 41 | |||
| 42 | h3. References | ||
| 43 | |||
| 44 | [1] Li, W., Han, J. and Pei, J. (2001). CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. Proc. ICDM 2001, pp. 369-376. |
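
The chi-square measure and the vote used when computing the model can be illustrated outside Ariane with a minimal Python sketch. It is not the code run by the graph or by _mvminer_: the names @chi_square@, @Rule@ and @votes@ are hypothetical, instances are assumed to be sets of items, and the vote is a plain sum of chi-square values, which is a simplification (the CMAR paper [1] describes a weighted variant).

```python
from dataclasses import dataclass


def chi_square(n_total, n_antecedent, n_class, n_both):
    """Pearson chi-square of the 2x2 table (antecedent matched?) x (target class?).

    n_total      -- number of training instances
    n_antecedent -- instances matching the rule antecedent
    n_class      -- instances belonging to the target class
    n_both       -- instances matching the antecedent AND in the target class
    """
    observed = [
        [n_both, n_antecedent - n_both],
        [n_class - n_both, n_total - n_antecedent - n_class + n_both],
    ]
    row_sums = [n_antecedent, n_total - n_antecedent]
    col_sums = [n_class, n_total - n_class]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n_total
            if expected > 0:  # skip empty margins to avoid dividing by zero
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset  # items that must all be present in the instance
    target: str            # class the rule concludes on
    chi2: float            # chi-square measured on the training set


def votes(instance, rules, classes):
    """Return one vote value per class: the sum of the chi-square of matching rules."""
    score = {c: 0.0 for c in classes}
    for rule in rules:
        if rule.antecedent <= instance:  # antecedent is a subset of the instance
            score[rule.target] += rule.chi2
    return score
```

For one instance, @votes@ returns one value per class, which mirrors the "one column per class" layout of the classification result described in the validation loop; the predicted class would then be the one with the highest vote.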