06-09-2016, 12:44 PM
WHY DATA MINING?
The Explosive Growth of Data
○Data collection and data availability
○Automated data collection tools, database systems, Web, computerized society
●Major sources of abundant data
○Business: Web, e-commerce, transactions, stocks, …
○Science: Remote sensing, bioinformatics, scientific simulation, …
○Society and everyone: news, digital cameras, …
○We are drowning in data, but starving for knowledge!
○“Necessity is the mother of invention”: data mining is the automated analysis of massive data sets
EVOLUTION OF DATABASE TECHNOLOGY
○1960s:
●Data collection, database creation, IMS and network DBMS
○1970s:
●Relational data model, relational DBMS implementation
○1980s:
●RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
●Application-oriented DBMS (spatial, scientific, engineering, etc.)
○1990s:
●Data mining, data warehousing, multimedia databases, and Web databases
○2000s:
●Stream data management and mining
●Data mining and its applications
●Web technology (XML, data integration) and global information systems
WHAT IS DATA MINING?
○Data mining (knowledge discovery from data)
●Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
●Data mining: a misnomer?
○Alternative names
●Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
○Watch out: Is everything “data mining”?
●Simple search and query processing
●(Deductive) expert systems
○Data analysis and decision support
●Market analysis and management
○Target marketing, customer relationship management (CRM), market basket analysis, cross selling, market segmentation
●Risk analysis and management
○Forecasting, customer retention, improved underwriting, quality control, competitive analysis
●Fraud detection and detection of unusual patterns (outliers)
○Other Applications
●Text mining (newsgroups, email, documents) and Web mining
●Stream data mining
●Bioinformatics and bio-data analysis
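
Market basket analysis, listed above under market analysis, can be illustrated with a small sketch. The version below only counts item pairs and reports rules whose support and confidence pass a threshold; it is a simplified illustration, not a full frequent-itemset algorithm. The transactions, the thresholds and the function name pair_rules are all made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; each set is one transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs that pass both thresholds.

    support(A, B)     = fraction of baskets containing both A and B
    confidence(A->B)  = baskets with A and B / baskets with A
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        item_counts.update(basket)
        # Sort so that each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # A pair yields two candidate rules, one in each direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules of this kind are what cross-selling and target marketing would act on: a rule with high confidence suggests which item to recommend once the left-hand item is in the basket.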
CORPORATE ANALYSIS & RISK MANAGEMENT
○Finance planning and asset evaluation
●cash flow analysis and prediction
●contingent claim analysis to evaluate assets
●cross-sectional and time series analysis (financial-ratio, trend analysis, etc.)
○Resource planning
●summarize and compare the resources and spending
○Competition
●monitor competitors and market directions
●group customers into classes and apply a class-based pricing procedure
●set pricing strategy in a highly competitive market
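
Grouping customers into classes and pricing by class, as mentioned above, can be sketched as follows. This is only one simple way to do it, assuming customers are split into three equal-sized classes by annual spend; the customer ids, spend figures, class names and discount rates are all hypothetical.

```python
from statistics import quantiles

# Hypothetical annual spend per customer id.
annual_spend = {
    "c1": 100.0, "c2": 200.0, "c3": 300.0,
    "c4": 400.0, "c5": 500.0, "c6": 600.0,
    "c7": 700.0, "c8": 800.0, "c9": 900.0,
}

# Hypothetical discount attached to each customer class.
DISCOUNT_BY_CLASS = {"low": 0.00, "mid": 0.05, "high": 0.10}


def classify_customers(spend_by_customer):
    """Assign each customer to 'low', 'mid' or 'high' using tertile cut points."""
    # quantiles(..., n=3) returns the two cut points between three groups.
    cut_low, cut_high = quantiles(list(spend_by_customer.values()), n=3)
    classes = {}
    for customer, spend in spend_by_customer.items():
        if spend <= cut_low:
            classes[customer] = "low"
        elif spend <= cut_high:
            classes[customer] = "mid"
        else:
            classes[customer] = "high"
    return classes


def quoted_price(customer, list_price, classes):
    """Apply the class-based discount to a list price."""
    discount = DISCOUNT_BY_CLASS[classes[customer]]
    return round(list_price * (1.0 - discount), 2)


classes = classify_customers(annual_spend)
for customer in sorted(classes):
    print(customer, classes[customer], quoted_price(customer, 200.0, classes))
```

In practice the classes would more likely come from a clustering or classification method over several attributes; a single-attribute split is used here only to keep the sketch short.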
DATA MINING: ON WHAT KINDS OF DATA?
○Database-oriented data sets and applications
●Relational database, data warehouse, transactional database
○Advanced data sets and advanced applications
●Data streams and sensor data
●Time-series data, temporal data, sequence data (incl. bio-sequences)
●Structured data, graphs, social networks and multi-linked data
●Object-relational databases
●Heterogeneous databases and legacy databases
●Spatial data and spatiotemporal data
●Multimedia databases
●Text databases
●The World-Wide Web