25-08-2017, 09:32 PM
Data Mining Technique
Abstract
Data mining on large databases has been a major concern in the research community, due to the difficulty of analyzing huge volumes of data using only traditional OLAP tools. This sort of process demands a lot of computational power, memory and disk I/O, which can only be provided by parallel computers. We present a discussion of how database technology can be integrated with data mining techniques. Finally, we also point out several advantages of addressing data-intensive activities through a tight integration of a parallel database server and data mining techniques.
Introduction
Data mining techniques have increasingly been studied, especially in their application to real-world databases. One typical problem is that databases tend to be very large, and these techniques often scan the entire data set repeatedly. Sampling has long been used to reduce this cost, but in a sample subtle differences among sets of objects become less evident.
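To make the sampling trade-off concrete, here is a minimal sketch. The records, the "rare" label and the 2% systematic sample are all invented for illustration; they are assumptions, not data from this work. It only shows how a pattern that a full scan finds can vanish from a small sample.

```python
# Hypothetical data: 1,000 records, 10 of which carry a rare pattern
# (indices 7, 107, ..., 907). Invented for illustration only.
records = ["rare" if i % 100 == 7 else "common" for i in range(1000)]


def count_label(rows, label):
    """Count how many rows carry the given label."""
    return sum(1 for row in rows if row == label)


# Systematic 2% sample: every 50th record, starting at index 0.
sample = records[::50]

full_rare = count_label(records, "rare")    # full scan of the data set
sample_rare = count_label(sample, "rare")   # same question asked of the sample

print(f"full scan: {full_rare} rare of {len(records)} records")
print(f"sample:    {sample_rare} rare of {len(sample)} records")
```

With this particular layout the sample contains none of the rare records, so any technique run on the sample alone would never see that group. A random sample would behave less extremely, but the smaller the sample, the more likely small groups are to be missed.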
This work provides an overview of some important data mining techniques and their applicability to large databases. We also point out several advantages of using a database management system (DBMS) to manage and process information instead of conventional flat files. This approach has been a major concern of several researchers, because it is a very natural solution: DBMSs have been successfully used in business management and may currently store valuable hidden knowledge.
Data Mining Techniques
Data mining is a step in knowledge discovery in databases (KDD) that searches for hidden patterns in data, often involving the iterative application of particular data mining methods. The goal of the whole KDD process is to make patterns understandable to humans in order to facilitate a better interpretation of the underlying data.
Classification
Among classification algorithms, there are two methods that are widely used for data mining purposes:
decision trees
neural networks.
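As an illustration of the first method, the sketch below shows one common way a decision tree learner can pick the attribute to split on: information gain, as used by ID3-style algorithms. The text above does not say which criterion is meant, so this choice is an assumption, and the toy rows and the names `entropy`, `information_gain` and `best_split` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by partitioning rows on one attribute."""
    base = entropy([row[target] for row in rows])
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(
        len(part) / len(rows) * entropy(part) for part in partitions.values()
    )
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain; each candidate scans all rows."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data set, invented for illustration only.
rows = [
    {"income": "high", "student": "no", "buys": "no"},
    {"income": "high", "student": "yes", "buys": "yes"},
    {"income": "low", "student": "no", "buys": "no"},
    {"income": "low", "student": "yes", "buys": "yes"},
]

print(best_split(rows, ["income", "student"], "buys"))
```

Here `student` separates the classes perfectly while `income` does not separate them at all, so `student` is chosen. Every candidate attribute requires a pass over all rows, and a full tree repeats this at each node, which is why repeated scans become costly on large databases.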
Data Mining and DBMSs
Database technology has been successfully used in traditional business data processing. Companies have been gathering large amounts of data and using a DBMS to manage them. It is therefore desirable that database technology be easy to use in other areas, such as data mining.
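One way to picture this integration is to let the DBMS compute the counts a mining algorithm needs, rather than exporting rows to a flat file and scanning them in application code. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a production (possibly parallel) database server. The `customers` table, its columns and the `class_counts` helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a real DBMS (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (income TEXT, student TEXT, buys TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("high", "no", "no"),
        ("high", "yes", "yes"),
        ("low", "no", "no"),
        ("low", "yes", "yes"),
    ],
)


def class_counts(connection, attribute):
    """Per-value class counts for one attribute, computed inside the DBMS."""
    # The column name is interpolated into the SQL text, so only accept
    # columns known to exist in the hypothetical table.
    if attribute not in {"income", "student"}:
        raise ValueError(f"unknown attribute: {attribute}")
    query = (
        f"SELECT {attribute}, buys, COUNT(*) FROM customers "
        f"GROUP BY {attribute}, buys ORDER BY {attribute}, buys"
    )
    return connection.execute(query).fetchall()


student_counts = class_counts(conn, "student")
print(student_counts)
```

Only the aggregated counts leave the database, not the rows themselves. On a server that can run `GROUP BY` over partitioned data, the same query can also benefit from the server's own query processing without changes to the mining code.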
Conclusions
Data mining and its application to large databases have been extensively studied due to the increasing difficulty of analyzing large volumes of data using only OLAP tools. This difficulty points to the need for an automated process to discover interesting and hidden patterns in real-world data sets. The ability to handle large amounts of information has been a major concern in many recent data mining applications. Parallel processing plays an important role in this context, since only parallel machines can provide sufficient computational power, memory and disk I/O.