11-05-2013, 04:35 PM
FP-Growth algorithm
Introduction
Apriori: uses a generate-and-test approach – generates candidate itemsets and tests if they are frequent
Generation of candidate itemsets is expensive (in both space and time)
Support counting is expensive
Subset checking (computationally expensive)
Multiple Database scans (I/O)
FP-Growth: allows frequent itemset discovery without candidate itemset generation. Two-step approach:
Step 1: Build a compact data structure called the FP-tree
Built using 2 passes over the data-set.
Step 2: Extracts frequent itemsets directly from the FP-tree
FP-Tree Construction
FP-Tree is constructed using 2 passes over the data-set:
Pass 1:
Scan the data and find the support for each item.
Discard infrequent items.
Sort frequent items in decreasing order based on their support.
Use this order when building the FP-Tree, so common prefixes can be shared.
Pass 2:
Read each transaction, keep only its frequent items, and sort them according to the global order from Pass 1.
Insert the sorted transaction into the tree, incrementing the count of every node along its path; nodes for the same item are linked together so they can be reached from a header table.
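The two passes above can be sketched in Python. This is a minimal illustration, not the canonical implementation; the class name, function name, and toy transactions are made up for the sketch:

```python
from collections import defaultdict

class FPNode:
    """A node in the FP-tree: an item label, a count, and child links."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_support):
    # Pass 1: count the support of each item and discard infrequent ones.
    support = defaultdict(int)
    for t in transactions:
        for item in t:
            support[item] += 1
    frequent = {i: s for i, s in support.items() if s >= min_support}

    # Global ordering: decreasing support (ties broken by item label).
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(order)}

    # Pass 2: insert each transaction's frequent items, sorted by rank,
    # so that transactions with common prefixes share nodes.
    root = FPNode(None)
    header = defaultdict(list)   # item -> all nodes carrying that item
    for t in transactions:
        items = sorted((i for i in t if i in frequent), key=rank.get)
        node = root
        for item in items:
            if item not in node.children:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            node = node.children[item]
            node.count += 1
    return root, header, frequent

# Toy data set (illustrative only).
transactions = [
    ['a', 'b'], ['b', 'c', 'd'], ['a', 'c', 'd', 'e'],
    ['a', 'd', 'e'], ['a', 'b', 'c'],
]
root, header, frequent = build_fp_tree(transactions, min_support=2)
```

Here `header` plays the role of the header table: summing the counts of all nodes listed under an item recovers that item's support from the tree alone.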
FP-Tree size
The FP-Tree usually has a smaller size than the uncompressed data - typically many transactions share items (and hence prefixes).
Best case scenario: all transactions contain the same set of items.
1 path in the FP-tree
Worst case scenario: every transaction has a unique set of items (no items in common)
Size of the FP-tree is at least as large as the original data.
Storage requirements for the FP-tree are then higher than for the original data - each node must also store a counter and pointers (to its children, its parent, and the next node for the same item).
The size of the FP-tree depends on how the items are ordered
Ordering by decreasing support is typically used but it does not always lead to the smallest tree (it's a heuristic).
Conditional FP-Tree
The FP-Tree that would be built if we only considered the transactions containing a particular itemset, with that itemset then removed from every transaction (and items that become infrequent discarded).
Example: FP-Tree conditional on e.
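The conditional data for an item can be read off the FP-tree directly: for each node carrying that item, follow parent pointers up to the root to collect its prefix path together with the node's count (the so-called conditional pattern base). The conditional FP-tree is then built from these weighted paths by the same two-pass construction. A self-contained sketch with made-up names and toy data:

```python
class Node:
    """Minimal FP-tree node: item label, count, parent and child links."""
    def __init__(self, item, parent=None):
        self.item, self.count, self.parent, self.children = item, 0, parent, {}

def insert(root, header, path):
    """Insert one transaction (items already in the global order)."""
    node = root
    for item in path:
        if item not in node.children:
            node.children[item] = Node(item, parent=node)
            header.setdefault(item, []).append(node.children[item])
        node = node.children[item]
        node.count += 1

def conditional_pattern_base(item, header):
    """For each occurrence of `item`, walk up to the root and record the
    prefix path together with that occurrence's count."""
    base = []
    for node in header.get(item, []):
        path, parent = [], node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base

# Toy tree: items within each transaction are already in frequency order.
root, header = Node(None), {}
for t in [['a', 'c', 'd', 'e'], ['a', 'd', 'e'], ['a', 'b']]:
    insert(root, header, t)

base = conditional_pattern_base('e', header)
# base holds the prefix paths of e with their counts, e.g. (['a', 'd'], 1)
```

Running the construction from Pass 1 and Pass 2 on these weighted paths (treating each path as `count` identical transactions) yields the FP-tree conditional on e.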