
METHODS OF IMPROVEMENT IN ASSOCIATION MINING ALGORITHM: IAA



Abstract

With the development of information technology, there is a growing need to manage information, and the field has attracted increasing attention. Data Mining (DM) organizes and analyzes huge amounts of information to discover useful data. Applying DM therefore requires an implementation for knowledge discovery. We applied the DM algorithm Apriori, an efficient association rule mining algorithm. The name of the algorithm reflects the fact that it uses prior knowledge of frequent itemset properties. However, the algorithm has some shortcomings. This paper describes methods of improving the efficiency of the Apriori algorithm for a particular database.

Introduction

Data Mining


“Data mining is the process of discovering meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques.” DM is the analysis of observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner. DM is an interdisciplinary field that brings together techniques from machine learning, pattern recognition, statistics, databases, and visualization to address the problem of extracting information from large databases.

Association Rule Mining (ARM)

The methodology for Association Rule Mining in QuenchMiner™ is outlined in Figure 1. Each component is explained below.
• Experimental Data: This is the raw data in the domain representing the input conditions and observed results of experiments.
• Integrated Database: This is the common database into which all the relevant data is extracted for mining.
• Related Literature: This refers to research papers and other relevant documents forming text sources of domain knowledge.
• Structured Text Base: This refers to the integrated repository of structured text extracted from the related literature using the domain-specific tags for relevant entities.
• Conversion: This refers to the preprocessing involved in converting the information into the format required for data mining.

Apriori Algorithm concepts

Apriori is an efficient association rule mining algorithm, developed by Agrawal et al. [1]. The name of the algorithm is based on the fact that the algorithm uses prior knowledge of frequent itemset properties.
The Apriori algorithm is one of the most important algorithms [2,3] for mining the frequent itemsets of association rules. It is a two-stage approach based on itemset frequency: the design of an association rule mining algorithm can be decomposed into two sub-problems:
(1) Find all itemsets whose support is at least the minimum support; these are called frequent itemsets.
(2) Generate all association rules from the itemsets found in (1): for each frequent itemset A, consider every non-empty proper subset a of A, and if support(A)/support(a) >= min confidence, generate the association rule a => (A - a).

The algorithm calls two sub-procedures, Apriori-gen() and Subset(). Apriori-gen() produces the candidate itemsets and then uses the Apriori property to delete every candidate that has a non-frequent subset. Once all the candidates have been generated, the database is scanned, and for each transaction Subset() identifies all the candidates that are subsets of that transaction. All candidates that meet the minimum support then form the frequent itemsets.
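
The two stages described above can be sketched in Python as follows. This is a minimal illustration, not the implementation from the report: the function names (apriori_gen, count_support, frequent_itemsets, association_rules), the sample transactions, and the thresholds are all assumptions made for the example. Support is taken here as the fraction of transactions containing an itemset, and the simple containment test in count_support stands in for the role of Subset().

```python
from itertools import combinations

def apriori_gen(prev_frequent, k):
    """Join frequent (k-1)-itemsets into k-item candidates, then prune.

    Pruning uses the Apriori property: every (k-1)-subset of a
    frequent k-itemset must itself be frequent.
    """
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(
                frozenset(s) in prev for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates

def count_support(candidates, transactions, min_support):
    """Scan the database once and keep candidates meeting min_support."""
    n = len(transactions)
    kept = {}
    for c in candidates:
        # Containment test: plays the role of Subset() in the text (assumption).
        support = sum(1 for t in transactions if c <= t) / n
        if support >= min_support:
            kept[c] = support
    return kept

def frequent_itemsets(transactions, min_support):
    """Stage 1: return {itemset: support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    singles = {frozenset([item]) for t in transactions for item in t}
    current = count_support(singles, transactions, min_support)
    result = {}
    k = 2
    while current:
        result.update(current)
        candidates = apriori_gen(current.keys(), k)
        current = count_support(candidates, transactions, min_support)
        k += 1
    return result

def association_rules(frequent, min_confidence):
    """Stage 2: emit (a, A - a, confidence) for each rule a => (A - a)."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is frequent,
                # so its support is already in the dictionary.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules

# Hypothetical market-basket data, used only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
frequent = frequent_itemsets(transactions, min_support=0.6)
rules = association_rules(frequent, min_confidence=0.9)
```

With these sample transactions, {bread, milk, diaper} survives the pruning step in apriori_gen because all three of its pairs are frequent, but it is then dropped by the database scan because only two of the five transactions contain it. The sketch rescans every transaction for every candidate, which is the kind of cost that improvements to Apriori aim to reduce.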