Version 16, François Rioult, 17/01/2011 13:06

h1. Documentation

KDAriane is a set of operators for data mining and machine learning, together with a set of scenarios (supervised classification, missing value completion, strong emerging pattern mining, etc.). It uses Ariane as a graphical platform for designing the data streams.

h2. Installation

* [[Prerequisite]]
* [[KDAriane]]

h2. Special operators for shell scripting

KDAriane provides basic components for executing shell scripts. Which one to use depends on the number of parameters (p), inputs (i) and outputs (o) you need. The operators are named @"eval" + p + i + o@ and each calls the .sh script of the same name. For example, p = 2, i = 1 and o = 1 would give the operator @eval211@ and the script @eval211.sh@.

When an operator is executed, Ariane launches the script associated with it (for example @script.sh@) and passes the following arguments:

<pre>
script.sh parameter-1 parameter-2 ... parameter-p input-1 input-2 ... input-i output-1 output-2 ... output-o
</pre>

In Ariane, every operator has a return value, even if it has no output.
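The calling convention above (parameters first, then inputs, then outputs) can be sketched for an operator with p = 1, i = 1 and o = 1. This is only an illustration: the script name @eval111.sh@ follows the naming rule, but the function @keep_first_lines@ and the processing it does (copying the first lines of a file) are hypothetical, and treating the script's exit status as the operator's return value is an assumption, not something this page confirms.

```shell
#!/bin/sh
# Hypothetical eval111.sh: 1 parameter, 1 input, 1 output.
# Assumed call, following the argument order described above:
#   eval111.sh parameter-1 input-1 output-1

# keep_first_lines COUNT INPUT OUTPUT
# Illustrative processing only: copy the first COUNT lines of INPUT to OUTPUT.
keep_first_lines() {
    if [ "$#" -ne 3 ]; then
        echo "usage: eval111.sh parameter-1 input-1 output-1" >&2
        return 2
    fi
    count=$1    # parameter-1
    input=$2    # input-1
    output=$3   # output-1
    if [ ! -r "$input" ]; then
        echo "cannot read input: $input" >&2
        return 1
    fi
    head -n "$count" "$input" > "$output"
}

# Run only when arguments are supplied, so the file can also be sourced.
# Assumption: the exit status is what the caller sees as the return value.
if [ "$#" -gt 0 ]; then
    keep_first_lines "$@"
    exit $?
fi
```

Called as @sh eval111.sh 2 data.csv head.csv@, this sketch would write the first two lines of @data.csv@ to @head.csv@ and exit with status 0. An unreadable input gives status 1 and a wrong number of arguments gives status 2.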

The operators are divided into two categories:

* [[KDD operators]] are special components for calling Weka components or RapidMiner processes.
* [[Shell operators]] directly execute the commands entered in Ariane.

h2. Pattern mining prototypes

* [[music-dfs]]: mining patterns under various constraints
* [[mtminer]]: levelwise computation of the minimal transversals of a hypergraph

h2. Scenarios

KDAriane provides some example KDD scenarios built with Ariane:

* [[Data preparation]]: a first scenario for the binarization of CSV data.
* pattern mining and complexity visualization
* supervised classification with association rules
* experiments on perturbing the training and test files with Weka classifiers and RapidMiner processes.