15-02-2013, 12:42 PM
Association Rule Mining using FP-Tree as DAG
ABSTRACT
Association rule mining (ARM) is one of the most important aspects of data mining. Association Rule Mining using FP-Tree as DAG searches for interesting relationships among items in a large data set or database and discovers association rules among a large number of item sets. The importance of ARM is increasing with the demand for finding frequent patterns in large data sources. Researchers have developed many algorithms and techniques for generating association rules. The main problem is the generation of candidate item sets before producing frequent item sets, which results in wasted time and space. Among the existing techniques, the frequent pattern growth (FP growth) method is the most efficient and scalable approach. It mines frequent item sets without candidate item set generation. Its obstacle is that it generates a massive number of conditional FP trees. In this system we propose an improvement to the frequent pattern tree based technique that does not use conditional FP trees. It generates FP trees using a directed acyclic graph (DAG) data structure. For this we propose an algorithm that scans the database and generates the FP tree as a DAG, so that frequent patterns can be generated directly from the DAG without generating conditional FP trees. The association rules are then generated from the frequent patterns. We compare this approach with traditional FP growth and MFI in terms of number of database scans, number of conditional FP trees, time complexity and space complexity.
Existing System
Association rule mining discovers correlations among data items in a transactional database D. Each transaction in D is a set of data items that represents a purchase by a customer. Each transaction is represented in D by a record, and the database is a set of records. An association rule expresses a relationship among items across the records of D and is usually written in the form A->B, where A and B are item sets, i.e. sets of items drawn from the records of D that are frequently involved in customers' transactions.
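As a rough illustration of a rule A->B over a transactional database, the sketch below computes the two measures commonly used to judge such a rule, support and confidence. These measures are not named in the post, and the toy transactions and the function names `support` and `confidence` are made up for the example.

```python
# Hypothetical toy database: each transaction is the set of items in one purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = set(itemset)
    return sum(1 for t in db if itemset <= t) / len(db)

def confidence(antecedent, consequent, db):
    """Support of A together with B, divided by support of A, for the rule A -> B."""
    a = set(antecedent)
    return support(a | set(consequent), db) / support(a, db)

# Rule {bread} -> {milk}: how often milk appears in transactions that contain bread.
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

A rule is usually kept only if both values exceed thresholds chosen by the user; the thresholds themselves are a tuning decision and are not given in the post.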
Researchers have designed and implemented association rule mining using different sets of algorithms. Among these algorithms, the most efficient for generating frequent item sets (patterns) is FP growth. In these algorithms the data is stored in a file or a DBMS; it is extracted from the file or DBMS and item sets are generated from it.
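For readers unfamiliar with the structure FP growth works on, here is a minimal sketch of the usual two-scan FP tree construction: count item frequencies, then insert each transaction with its frequent items ordered by descending frequency. It shows the standard textbook tree only, not the DAG variant or the MFI method discussed in this post. The names `FPNode`, `build_fp_tree` and `min_count`, and the toy data, are assumptions for illustration.

```python
from collections import Counter

class FPNode:
    """One node of the prefix tree: an item, its count, and its child nodes."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(db, min_count):
    # Scan 1: count in how many transactions each item occurs.
    freq = Counter(item for t in db for item in set(t))
    frequent = {i: c for i, c in freq.items() if c >= min_count}

    root = FPNode(None)
    # Scan 2: insert each transaction, frequent items only,
    # ordered by descending frequency (ties broken alphabetically).
    for t in db:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent

# Hypothetical toy database.
db = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter", "eggs"],
]
root, frequent = build_fp_tree(db, min_count=2)
```

Transactions that share a prefix share nodes, which is why the tree is compact; the mining step of traditional FP growth then builds a conditional FP tree per item from this structure.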
Limitations
• FP growth improves the performance of item set generation over Apriori, which is the traditional algorithm, but it also suffers from limitations when accessing large databases. Though FP growth does not generate candidate item sets like Apriori, it generates massive conditional FP trees.
• Sometimes a table can be used instead of a conditional FP tree, but it also consumes space and time.
• This requires considerable memory space when dealing with large databases.
• Generating frequent patterns this way is time consuming.
Proposed System
In our system we propose a new, improved FP growth algorithm that uses a directed acyclic graph (DAG) structure to represent the FP tree data. It is also an improvement over MFI, which uses an FP tree and a table. The new FP growth algorithm using a DAG does not require conditional FP trees; moreover, the conditional FP trees are represented by the DAG itself. Because of this, frequent item sets and association rules can be generated more efficiently than with traditional FP growth and MFI based FP growth.
In the base paper, the authors presented a system using the MFI algorithm that generates frequent item sets without generating conditional FP trees. The authors computed the number of data scans and FP trees and compared them with traditional FP growth. In our system we implement the concept using a DAG and compare it with MFI based and traditional FP growth.