02-10-2012, 01:01 PM
DATA MINING CONCEPTS AND TECHNIQUES
VIN datamining.doc (Size: 154 KB / Downloads: 35)
ABSTRACT
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Emerging technologies have enabled many companies to collect a tremendous amount of data on their customers, suppliers, and partners. Dealing with very large databases in supply chain management is therefore an important problem. OLAP technologies have tremendous capabilities for navigating massive data warehouses.
Data mining involves data warehousing, data cleaning, data description and visualization, and data analysis and interpretation. A data warehouse maintains a central repository of all organizational data and uses an OLAP server to support the end-user business model. The data mining process consists of exploration, model building and validation, and deployment stages. Classes, clusters, associations, and sequential patterns are the kinds of relationships it looks for.
INTRODUCTION
Data Mining, the extraction of hidden predictive information from large databases, is a new technology with great potential to help companies focus on the information in their data warehouses. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products.
Data Warehousing is a process of centralized data management and retrieval. Centralization of data is needed to maximize user access and analysis. Warehouses optimize database query and reporting tools because of their ability to analyze data, often from disparate databases. Data Warehouses are read-only, integrated databases designed to answer comparative questions.
ARCHITECTURE
Data mining tools currently operate outside of the warehouse, requiring extra steps for extracting, importing, and analyzing the data.
The data warehouse contains a combination of internal data tracking all customer contact, coupled with external market data about competitor activity.
An OLAP (On-Line Analytical Processing) server enables a more sophisticated end-user business model to be applied when navigating the data warehouse. OLAP refers to technology
ELEMENTS
• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.
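The five elements above can be sketched as a tiny pipeline. This is only an illustrative outline using the Python standard library: the record fields (`region`, `product`, `amount`), the sample data, and the function names are assumptions made for the example, and a real warehouse would use a dedicated database rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical raw transaction records (illustrative data, not from the paper).
RAW_TRANSACTIONS = [
    {"region": "north", "product": "widget", "amount": "120.50"},
    {"region": "north", "product": "gadget", "amount": "80.00"},
    {"region": "south", "product": "widget", "amount": "45.25"},
    {"region": "south", "product": "widget", "amount": "bad-value"},
]

def extract(records):
    """Extract: yield raw records from the source system."""
    yield from records

def transform(records):
    """Transform: convert amounts to floats and drop rows that fail to parse."""
    for row in records:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        yield (row["region"], row["product"], amount)

def load(rows):
    """Load: store totals keyed by (region, product), a tiny two-dimensional cube."""
    cube = defaultdict(float)
    for region, product, amount in rows:
        cube[(region, product)] += amount
    return dict(cube)

def present(cube):
    """Present: format the stored totals as a plain-text table."""
    lines = [f"{'region':<8}{'product':<10}{'total':>10}"]
    for (region, product), total in sorted(cube.items()):
        lines.append(f"{region:<8}{product:<10}{total:>10.2f}")
    return "\n".join(lines)

cube = load(transform(extract(RAW_TRANSACTIONS)))
print(present(cube))
```

Keying the store by a tuple of dimensions is one simple way to imitate a multidimensional layout; adding a dimension means adding a field to the key.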
Data Reduction
Data reduction is applied to projects to aggregate the information contained in large datasets into manageable chunks. Data reduction methods range from simple tabulation and aggregation to more sophisticated techniques such as clustering.
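As a minimal sketch of the two simplest reduction methods, the example below tabulates record counts and aggregates purchase amounts per group. The segment names and amounts are invented for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical purchase records: (customer segment, purchase amount).
purchases = [
    ("retail", 20.0), ("retail", 30.0), ("wholesale", 500.0),
    ("retail", 25.0), ("wholesale", 700.0),
]

def tabulate(records):
    """Simple tabulation: count the records in each segment."""
    return Counter(segment for segment, _ in records)

def aggregate(records):
    """Aggregation: reduce each segment to its mean purchase amount."""
    groups = {}
    for segment, amount in records:
        groups.setdefault(segment, []).append(amount)
    return {segment: mean(amounts) for segment, amounts in groups.items()}

counts = tabulate(purchases)
averages = aggregate(purchases)
print(counts)
print(averages)
```

Five records are reduced to two counts and two averages; the same idea scales to millions of rows, where the summary is far easier to inspect than the raw data.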
Deployment
Deployment refers to applying a trained model for prediction or classification to new data. For example, a credit card company may want to deploy a trained model to quickly identify transactions of interest, such as those likely to be fraudulent.
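The credit card example might look roughly like the sketch below, where a model that has already been trained is applied to transactions it has never seen. The model form (a logistic score), its parameter values, and the feature names `amount_thousands` and `foreign` are all assumptions chosen for illustration, not taken from any real system.

```python
import math

# Hypothetical parameters of an already-trained logistic model (assumed values).
MODEL = {"bias": -4.0, "weights": {"amount_thousands": 1.5, "foreign": 2.0}}

def score(model, transaction):
    """Return the model's estimated probability that a transaction is suspicious."""
    z = model["bias"] + sum(
        weight * transaction[name] for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))

def classify(model, transaction, threshold=0.5):
    """Deployment step: turn the score for a new transaction into a decision."""
    return "flag" if score(model, transaction) >= threshold else "pass"

# New, previously unseen transactions arriving after training.
new_transactions = [
    {"amount_thousands": 0.2, "foreign": 0},
    {"amount_thousands": 3.0, "foreign": 1},
]
labels = [classify(MODEL, t) for t in new_transactions]
print(labels)
```

Training and deployment are kept separate here on purpose: scoring a transaction only needs the stored parameters, which is what makes quick identification possible.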
Machine Learning
Machine learning denotes the application of generic model-fitting algorithms to predictive data mining. It emphasizes accuracy of prediction, regardless of whether the models are interpretable. Neural networks and boosting are examples of this type of technique.
Meta Learning
Meta learning combines the predictions from multiple models. It is particularly useful when the models included in the project are of very different types.
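One simple way to combine predictions is a majority vote, sketched below. The three base models are hypothetical stand-ins written as plain functions, and the feature names `income` and `age` are invented. A fuller meta-learning approach such as stacking would instead train a further model on the base models' outputs.

```python
from collections import Counter

# Three hypothetical base models of different kinds, each a plain function.
def rule_model(x):
    """A hand-written business rule."""
    return "high" if x["income"] > 50 else "low"

def threshold_model(x):
    """Stands in for a fitted decision tree with a single split."""
    return "high" if x["age"] > 40 else "low"

def score_model(x):
    """Stands in for a fitted linear model."""
    return "high" if 0.5 * x["income"] + 0.5 * x["age"] > 45 else "low"

def combine(models, x):
    """Majority vote: return the label predicted by most base models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

MODELS = [rule_model, threshold_model, score_model]
first = combine(MODELS, {"income": 70, "age": 30})
second = combine(MODELS, {"income": 30, "age": 45})
print(first, second)
```

With three models and two labels a tie cannot occur; with an even number of models a tie-breaking rule would be needed.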
NEURAL NETWORKS
Neural networks are analytic techniques modeled after the processes of learning in the cognitive system and the neurological functions of the brain. They are capable of predicting new observations from other observations after a process of learning from existing data.
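To make "learning from existing data" concrete, the sketch below trains a single artificial neuron (a perceptron), the simplest building block of a neural network, on the logical AND function. The training data, learning rate, and epoch count are assumptions chosen so the example stays small; practical networks have many neurons and use more elaborate training methods.

```python
# Existing observations: inputs and the known output of logical AND.
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, inputs):
    """Fire (return 1) when the weighted sum of the inputs is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

weights, bias = train_perceptron(TRAINING_DATA)
predictions = [predict(weights, bias, inputs) for inputs, _ in TRAINING_DATA]
print(predictions)
```

The neuron starts with no knowledge (all weights zero) and ends up reproducing the AND outputs purely by correcting its mistakes on the examples.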
CONCLUSION AND FUTURE WORK
In this paper we have discussed how data mining and data warehousing help to close the growing gap between increasingly powerful storage and retrieval systems and users' ability to effectively analyze and act on the information they hold. Data mining tools are used to structure and prioritize information for specific end-user problems.
Both relational and OLAP technologies have tremendous capabilities for navigating massive data warehouses. OLAP facilities can be integrated into corporate database systems to monitor the performance of the business. Hence data mining and data warehousing are applied to the automated prediction of trends and behaviors and the automated discovery of previously unknown patterns. The complexity of data mining must be hidden from end users, so business use cases should be designed with tight constraints around the data mining algorithms.