17-08-2012, 04:01 PM
The FP-Growth/Apriori Debate
1The FP-Growth.ppt (Size: 660 KB / Downloads: 30)
FP-Growth Algorithm
Association Rule Mining
Generate Frequent Itemsets
Apriori generates candidate sets
FP-Growth uses specialized data structures (no candidate sets)
Find Association Rules
Outside the scope of both FP-Growth & Apriori
Therefore, FP-Growth is a competitor to Apriori
FP-Growth Complexity
Therefore, each path in the tree will be at least partially traversed the number of items existing in that tree path (the depth of the tree path) * the number of items in the header.
Complexity of searching through all paths is then bounded by O(header_count2 * depth of tree)
Creation of a new cFP-Tree occurs also.
Algorithm Results (in English)
Candidate Generation sets exchanged for FP-Trees.
You MUST take into account all paths that contain an item set with a test item.
You CANNOT determine before a conditional FP-Tree is created if new frequent item sets will occur.
Trivial examples hide these assertions, leading to a belief that FP-Tree operates more efficiently.
FP-Growth vs. Apriori
Apriori visits each transaction when generating a new candidate sets; FP-Growth does not
Can use data structures to reduce transaction list
FP-Growth traces the set of concurrent items; Apriori generates candidate sets
FP-Growth uses more complicated data structures & mining techniques
Algorithm Analysis Results
FP-Growth IS NOT inherently faster than Apriori
Intuitively, it appears to condense data
Mining scheme requires some new work to replace candidate set generation
Recursion obscures the additional effort
FP-Growth may run faster than Apriori in circumstances
No guarantee through complexity which algorithm to use for efficiency