Main Page | Namespace List | Class List | File List | Class Members | File Members

Apriori_Trie Class Reference

Apriori_Trie (or prefix-tree) is a tree-based datastructure. More...

#include <Apriori_Trie.hpp>

Collaboration diagram for Apriori_Trie:

[legend]
List of all members.

Public Member Functions

 Apriori_Trie (const unsigned long counter_of_emptyset)
void insert_frequent_items (const vector< unsigned long > &counters)
 Insert the frequent items and their counters into the trie;.

void candidate_generation (const itemtype &frequent_size, Input_Output_Manager &input_output_manager)
 Generates candidates.

void find_candidate (const vector< itemtype > &basket, const itemtype candidate_size, const unsigned long counter=1)
 Increases the counter of those candidates that are contained by the given basket.

void delete_infrequent (const unsigned long min_occurrence, const itemtype candidate_size)
 Deletes unfrequent itemsets.

bool is_there_any_candidate () const
void statistics () const
 ~Apriori_Trie ()

Protected Member Functions

bool is_all_subset_frequent (const set< itemtype > &maybe_candidate) const
 Decides if all subset of an itemset is contained in the Apriori_Trie.

void candidate_generation_two ()
 Generates candidate of size two.

void candidate_generation_assist (Trie *Trie, const itemtype distance_from_generator, set< itemtype > &maybe_candidate, Input_Output_Manager &input_output_manager)
 Generates candidate of size more than two.

void find_candidate_two (const vector< itemtype > &basket, const unsigned long counter=1)
 Increases the counter for those itempairs that are in the given basket.

void delete_infrequent_two (const unsigned long min_occurrence)
 Deletes the Tries that represent infrequent itemsets of size 2.


Protected Attributes

Trie main_trie
 Trie to store the candidates and the frequent itemsets.

vector< vector< unsigned long > > temp_counter_array
 temp_counter_array stores the occurences of the itempairs

vector< unsigned long > nr_of_freq_itemsets

Detailed Description

Apriori_Trie (or prefix-tree) is a tree-based datastructure.

Apriori_Trie is a special tree designed to provide efficient methods for the apriori algorithm. It mostly uses a regular trie except when there exist faster solution. For example for storing one and two itemset candidate where a simple vector and array gives better performance. Apriori_Trie extends the functions provided by the regular trie with a candidate generation process.

Definition at line 33 of file Apriori_Trie.hpp.


Constructor & Destructor Documentation

Apriori_Trie::Apriori_Trie const unsigned long  counter_of_emptyset  ) 
 

Parameters:
counter_of_emptyset The support of the empty set, i.e. the number of transactions.

Definition at line 19 of file Apriori_Trie.cpp.

References nr_of_freq_itemsets.

Apriori_Trie::~Apriori_Trie  ) 
 

Definition at line 106 of file Apriori_Trie.cpp.


Member Function Documentation

void Apriori_Trie::candidate_generation const itemtype frequent_size,
Input_Output_Manager input_output_manager
 

Generates candidates.

Parameters:
frequent_size Size of the frequent itemsets that generate the candidates.

Definition at line 42 of file Apriori_Trie.cpp.

References candidate_generation_assist(), candidate_generation_two(), itemtype, and main_trie.

Referenced by Apriori::APRIORI_alg().

void Apriori_Trie::candidate_generation_assist Trie Trie,
const itemtype  distance_from_generator,
set< itemtype > &  maybe_candidate,
Input_Output_Manager input_output_manager
[protected]
 

Generates candidate of size more than two.

Definition at line 153 of file Apriori_Trie.cpp.

References Trie::add_empty_state(), Trie::counter, Trie::edgevector, is_all_subset_frequent(), itemtype, and Input_Output_Manager::write_out_basket_and_counter().

Referenced by candidate_generation().

void Apriori_Trie::candidate_generation_two  )  [protected]
 

Generates candidate of size two.

Definition at line 136 of file Apriori_Trie.cpp.

References Trie::edgevector, main_trie, and temp_counter_array.

Referenced by candidate_generation().

void Apriori_Trie::delete_infrequent const unsigned long  min_occurrence,
const itemtype  candidate_size
 

Deletes unfrequent itemsets.

Parameters:
min_occurrence The threshold of absolute support.
candidate_size The size of the candidate itemset.

Definition at line 79 of file Apriori_Trie.cpp.

References Trie::delete_infrequent(), delete_infrequent_two(), itemtype, main_trie, and nr_of_freq_itemsets.

Referenced by Apriori::APRIORI_alg().

void Apriori_Trie::delete_infrequent_two const unsigned long  min_occurrence  )  [protected]
 

Deletes the Tries that represent infrequent itemsets of size 2.

Parameters:
min_occurrence The occurence threshold

temp_counter_array[stateIndex_1] will never be used again!

temp_counter_array will never be used again!

Definition at line 236 of file Apriori_Trie.cpp.

References Trie::edgevector, main_trie, nr_of_freq_itemsets, and temp_counter_array.

Referenced by delete_infrequent().

void Apriori_Trie::find_candidate const vector< itemtype > &  basket,
const itemtype  candidate_size,
const unsigned long  counter_incr = 1
 

Increases the counter of those candidates that are contained by the given basket.

Parameters:
basket The basket that hs to be analyzed.
candidate_size the size of the candidates.
counter_incr The number of time the basket occured. The counters of candidates that occure in the basket has to be incremented by counter_incr.

Definition at line 62 of file Apriori_Trie.cpp.

References Trie::find_candidate(), find_candidate_two(), itemtype, and main_trie.

Referenced by Apriori::support().

void Apriori_Trie::find_candidate_two const vector< itemtype > &  basket,
const unsigned long  counter = 1
[protected]
 

Increases the counter for those itempairs that are in the given basket.

Parameters:
basket the given basket
counter The number the processed basket occures in the transactional database

Definition at line 216 of file Apriori_Trie.cpp.

References temp_counter_array.

Referenced by find_candidate().

void Apriori_Trie::insert_frequent_items const vector< unsigned long > &  counters  ) 
 

Insert the frequent items and their counters into the trie;.

Parameters:
counters It stores the support of the items. counters[i] stores the suport of item i.

Definition at line 29 of file Apriori_Trie.cpp.

References Trie::add_empty_state(), main_trie, and nr_of_freq_itemsets.

Referenced by Apriori::APRIORI_alg().

bool Apriori_Trie::is_all_subset_frequent const set< itemtype > &  maybe_candidate  )  const [protected]
 

Decides if all subset of an itemset is contained in the Apriori_Trie.

Parameters:
maybe_candidate The itemset that has to be checked.

Definition at line 114 of file Apriori_Trie.cpp.

References Trie::is_included(), and main_trie.

Referenced by candidate_generation_assist().

bool Apriori_Trie::is_there_any_candidate  )  const
 

Definition at line 92 of file Apriori_Trie.cpp.

References Trie::edgevector, and main_trie.

Referenced by Apriori::APRIORI_alg().

void Apriori_Trie::statistics  )  const
 

Definition at line 97 of file Apriori_Trie.cpp.

References nr_of_freq_itemsets.

Referenced by Apriori::APRIORI_alg().


Member Data Documentation

Trie Apriori_Trie::main_trie [protected]
 

Trie to store the candidates and the frequent itemsets.

Definition at line 89 of file Apriori_Trie.hpp.

Referenced by candidate_generation(), candidate_generation_two(), delete_infrequent(), delete_infrequent_two(), find_candidate(), insert_frequent_items(), is_all_subset_frequent(), and is_there_any_candidate().

vector<unsigned long> Apriori_Trie::nr_of_freq_itemsets [protected]
 

Definition at line 98 of file Apriori_Trie.hpp.

Referenced by Apriori_Trie(), delete_infrequent(), delete_infrequent_two(), insert_frequent_items(), and statistics().

vector< vector<unsigned long> > Apriori_Trie::temp_counter_array [protected]
 

temp_counter_array stores the occurences of the itempairs

We can use a simple array to determine the support of itemset of size two. This requires less memory than the trie-based supportcount. temp_counter_array[i][j-i] stores the occurence of the itempair (i,j).

Definition at line 97 of file Apriori_Trie.hpp.

Referenced by candidate_generation_two(), delete_infrequent_two(), and find_candidate_two().


The documentation for this class was generated from the following files:
Generated on Fri Sep 3 17:23:52 2004 for APRIORI algorithm by doxygen 1.3.5