16-01-2013, 04:49 PM
Data warehouse and OLAP
Evolution of Database Technology
1970s - early 1980s:
Database management systems
Hierarchical and network database systems
Relational database Systems
Query languages: SQL
Transactions, concurrency control and recovery.
On-line transaction processing (OLTP)
Mid-1980s - present:
Advanced data models
Extended relational, object-relational
Advanced application-oriented DBMS
spatial, scientific, engineering, temporal, multimedia, active, stream and sensor, knowledge-based
What Is Data Mining?
Data mining refers to extracting or mining knowledge from large amounts of data.
Analogy: mining gold from rocks or sand
Related terms: knowledge mining from data, knowledge extraction, data/pattern analysis, data archeology, and data dredging.
Knowledge Discovery from Data, or KDD
Steps of a KDD Process
Learning the application domain:
relevant prior knowledge and goals of application
Creating a target data set: data selection
Data cleaning and preprocessing
Data reduction and transformation:
Find useful features, dimensionality/variable reduction, invariant representation.
Choosing functions of data mining
summarization, classification, regression, association, clustering.
Choosing the mining algorithms
Data mining: search for patterns of interest
Pattern evaluation and knowledge presentation
visualization, transformation, removing redundant patterns, etc.
Use of discovered knowledge
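The KDD steps listed above can be sketched as a small pipeline. This is only an illustrative sketch: the records, the function names (select, clean, mine_pairs, evaluate, run_kdd), and the 0.5 support threshold are all assumptions made for the example, and simple pair counting stands in for whatever mining algorithm would really be chosen.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: (customer_id, region, basket). Not from the slides.
RAW = [
    (1, "north", ["computer", "printer", "paper"]),
    (2, "north", ["computer", "printer"]),
    (3, "south", ["computer", "mouse"]),
    (4, "north", ["Printer ", "computer", None]),
    (5, "north", []),
    (6, "north", ["paper", "printer"]),
]

def select(records, region):
    """Create the target data set: keep only baskets from one region."""
    return [basket for _, reg, basket in records if reg == region]

def clean(baskets):
    """Cleaning and preprocessing: drop missing items, normalize text, drop empty baskets."""
    cleaned = []
    for basket in baskets:
        items = {item.strip().lower() for item in basket if item}
        if items:
            cleaned.append(items)
    return cleaned

def mine_pairs(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for items in baskets:
        counts.update(combinations(sorted(items), 2))
    return counts

def evaluate(counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in counts.items()
        if count / n_baskets >= min_support
    }

def run_kdd(records, region="north", min_support=0.5):
    """Chain the steps: selection, cleaning, mining, evaluation."""
    baskets = clean(select(records, region))
    counts = mine_pairs(baskets)
    return evaluate(counts, len(baskets), min_support)

patterns = run_kdd(RAW)
# Knowledge presentation: print the surviving patterns with their support.
for pair, support in sorted(patterns.items()):
    print(pair, round(support, 2))
```

Each function maps to one step of the process, so a step can be swapped out (for example, a different mining function) without touching the others.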
Data Mining: On What Kind of Data?
Advanced DB and information repositories
Object-oriented and object-relational databases
Spatial databases
Time-series data and temporal data
Text databases and multimedia databases
Heterogeneous and legacy databases
WWW
Data Mining Functionalities
Concept description: Characterization and discrimination
Data can be associated with classes or concepts
Ex.: In the AllElectronics store, classes of items for sale include computers and printers.
A description of a class or concept is called a class/concept description.
Data characterization
Data discrimination
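A rough sketch of the two ideas, assuming the usual reading that characterization summarizes the general features of a target class and discrimination compares those features against a contrasting class. The sales records, the price bands, and the function names (characterize, discriminate) are made up for illustration.

```python
from collections import Counter

# Hypothetical sales records (item class, price band); values are illustrative only.
SALES = [
    ("computer", "high"), ("computer", "high"), ("computer", "mid"),
    ("printer", "mid"), ("printer", "low"), ("printer", "low"),
]

def characterize(records, target):
    """Summarize the target class as the relative frequency of each price band."""
    bands = Counter(band for cls, band in records if cls == target)
    total = sum(bands.values())
    if total == 0:
        return {}
    return {band: count / total for band, count in bands.items()}

def discriminate(records, target, contrast):
    """Compare target and contrasting class, band by band, as (target, contrast) shares."""
    t = characterize(records, target)
    c = characterize(records, contrast)
    return {
        band: (t.get(band, 0.0), c.get(band, 0.0))
        for band in sorted(set(t) | set(c))
    }

profile = characterize(SALES, "computer")
comparison = discriminate(SALES, "computer", "printer")
print(profile)
print(comparison)
```

Here profile describes the computer class on its own, while comparison places the same summary side by side with the printer class.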
Data Mining: Classification Schemes
Databases to be mined
Relational, transactional, object-oriented, object-relational, active, spatial, time-series, text, multimedia, heterogeneous, legacy, WWW, etc.
Knowledge to be mined
Characterization, discrimination, association, classification, clustering, trend, deviation and outlier analysis, etc.
Multiple/integrated functions and mining at multiple levels
analysis, Web mining, Weblog analysis, etc.