31-08-2017, 12:45 PM
Data Mining Techniques
Several major data mining techniques have been developed and used in data mining projects in recent years, including association, classification, clustering, prediction, sequential patterns and decision trees. We briefly review these techniques in the following sections.
[b]Association[/b]
Association is one of the best-known data mining techniques. In association, a pattern is discovered based on a relationship between items in the same transaction. That is why the association technique is also known as the relation technique. It is used in market basket analysis to identify sets of products that customers frequently buy together.
Retailers use the association technique to investigate customers' buying habits. Based on historical sales data, a retailer may find that customers frequently buy chips when they buy beer and can therefore place beer and chips side by side to save customers time and increase sales.
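To make the market basket idea concrete, here is a minimal sketch in plain Python. It counts how often pairs of items appear in the same transaction and reports two common measures: support (the share of all transactions that contain both items) and confidence (of the transactions that contain the first item, the share that also contain the second). The transactions, the function name pair_rules and the 0.4 support threshold are all made up for illustration. A real project would more likely use a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"beer", "diapers"},
    {"chips", "salsa"},
    {"beer", "chips", "diapers"},
]


def pair_rules(transactions, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be interesting
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


for antecedent, consequent, support, confidence in pair_rules(transactions):
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence but very low support is usually not worth acting on, which is why the sketch filters on support first.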
[b]Classification[/b]
Classification is a classic data mining technique based on machine learning. Basically, classification is used to assign each item in a data set to one of a predefined set of classes or groups. The classification method makes use of mathematical techniques such as decision trees, linear programming, neural networks and statistics. In classification, we develop software that can learn how to classify data items into groups. For example, we can apply classification to a problem such as "given all records of employees who left the company, predict who will likely leave the company in a future period." In this case, we divide the employee records into two groups named "Leave" and "Stay", and then ask our data mining software to classify employees into those groups.
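As a rough illustration of the "Leave" / "Stay" example, the sketch below learns a one-level decision tree (a "decision stump") from a handful of labelled records and then classifies new employees. The two features (years at the company and a satisfaction score), the records themselves and the function names train_stump and predict are assumptions made up for this example. Real software would use more features and a proper library.

```python
# Hypothetical labelled records: (years_at_company, satisfaction_score), label.
training = [
    ((1, 2), "Leave"),
    ((2, 3), "Leave"),
    ((1, 4), "Leave"),
    ((6, 8), "Stay"),
    ((7, 6), "Stay"),
    ((5, 9), "Stay"),
]


def train_stump(records):
    """Learn a one-level decision tree: one feature and one threshold."""
    best = None
    n_features = len(records[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in records}):
            # Try both ways of labelling the two sides of the split.
            for low_label, high_label in (("Leave", "Stay"), ("Stay", "Leave")):
                errors = sum(
                    (low_label if x[f] <= threshold else high_label) != y
                    for x, y in records
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, low_label, high_label)
    return best[1:]  # drop the error count


def predict(stump, x):
    """Classify one record with a learned stump."""
    f, threshold, low_label, high_label = stump
    return low_label if x[f] <= threshold else high_label


stump = train_stump(training)
print(predict(stump, (1, 5)))  # a short-tenure employee
print(predict(stump, (8, 7)))  # a long-tenure employee
```

The key point is that the classes are fixed in advance and the software only learns the rule that separates them.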
[b]Clustering[/b]
Clustering is a data mining technique that automatically forms meaningful or useful groups (clusters) of objects that have similar characteristics. The clustering technique defines the classes itself and places objects in each class, whereas in classification, objects are assigned to predefined classes. To make the concept clearer, take the management of books in a library as an example. A library holds a wide range of books on various subjects. The challenge is how to arrange those books so that readers can pick up several books on a particular subject without hassle. By using clustering, we can keep books that are similar in some way on one shelf and label it with a meaningful name. Readers who want books on that subject only have to go to that shelf instead of searching the whole library.
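One common way to do this automatically is k-means, shown below as a minimal sketch in plain Python. Each book is reduced to a hypothetical pair of numbers (think of them as two made-up subject scores), and the algorithm finds the groups on its own. The data, the function name kmeans and the choice of seeding the centroids with the first k points are assumptions for illustration, not the only way to cluster.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign the point to its nearest centroid (squared distance).
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical feature vectors for six books.
books = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
centroids, clusters = kmeans(books, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Note that nobody told the algorithm what the two groups are. That is the difference from classification.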
[b]Prediction[/b]
Prediction, as the name implies, is a data mining technique that discovers the relationships among independent variables and between dependent and independent variables. For example, prediction can be used in sales to forecast future profit: if we treat sales as the independent variable, profit is the dependent variable. Then, based on historical sales and profit data, we can fit a regression curve and use it to predict profit.
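The simplest version of such a fitted curve is a straight line found by ordinary least squares. The sketch below is plain Python. The sales and profit figures are invented, and the function names fit_line and predict_profit are just names chosen for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_profit(model, sales_value):
    """Predict profit for a sales figure using a fitted (slope, intercept)."""
    slope, intercept = model
    return slope * sales_value + intercept


# Hypothetical historical data: sales (independent) and profit (dependent).
sales = [10, 20, 30, 40, 50]
profit = [3, 5, 7, 9, 11]

model = fit_line(sales, profit)
print(model)
print(predict_profit(model, 60))
```

Real sales data will not sit exactly on a line, so the fit will only be approximate and a curve or more variables may be needed.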
[b]Sequential patterns[/b]
Sequential pattern analysis is a data mining technique that seeks to discover or identify similar patterns, regular events, or trends in transaction data over a period of time.
In sales, with historical transaction data, companies can identify sets of items that customers buy together at different times of the year. Companies can then use this information to recommend those items to customers with better deals, based on how frequently they bought them in the past.
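A very small sketch of the idea, in plain Python: for each customer we look at the order of their purchases and count ordered pairs "a was bought before b", then keep the pairs that enough customers share. The purchase histories, the function name frequent_sequences and the 0.5 threshold are assumptions made up for illustration. Dedicated algorithms handle longer sequences and time windows.

```python
from collections import Counter

# Hypothetical histories: each list is one customer's purchases in time order.
histories = {
    "c1": ["phone", "case", "charger"],
    "c2": ["phone", "charger"],
    "c3": ["case", "phone", "charger"],
    "c4": ["phone", "case"],
}


def frequent_sequences(histories, min_support=0.5):
    """Find ordered pairs (a, b) where enough customers bought a before b."""
    counts = Counter()
    for purchases in histories.values():
        seen = set()
        for i, a in enumerate(purchases):
            for b in purchases[i + 1:]:
                if a != b:
                    seen.add((a, b))
        counts.update(seen)  # count each pattern at most once per customer
    n = len(histories)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


patterns = frequent_sequences(histories)
for (first, then), support in sorted(patterns.items()):
    print(f"{first} then {then}: {support:.2f}")
```

Unlike the association example, order matters here: "phone then case" and "case then phone" are counted as different patterns.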