17-08-2012, 04:18 PM
Decision Support, Data Warehousing, and OLAP
DataWarehouse.ppt (Size: 1,020 KB / Downloads: 26)
Decision Support and OLAP
Information technology to help the knowledge worker (executive, manager, analyst) make faster and better decisions.
What were the sales volumes by region and product category for the last year?
How did the share price of computer manufacturers correlate with quarterly profits over the past 10 years?
Which orders should we fill to maximize revenues?
Will a 10% discount increase sales volume sufficiently?
Which of two new medications will result in the best outcome: higher recovery rate & shorter hospital stay?
On-Line Analytical Processing (OLAP) is an element of decision support systmes (DSS).
Why Separate Data Warehouse?
Performance
Op dbs designed & tuned for known txs & workloads.
Complex OLAP queries would degrade perf. For op txs.
Special data organization, access & implementation methods needed for multidimensional views & queries.
Three-Tier Architecture
Warehouse database server
Almost always a relational DBMS; rarely flat files
OLAP servers
Relational OLAP (ROLAP): extended relational DBMS that maps operations on multidimensional data to standard relational operations.
Multidimensional OLAP (MOLAP): special purpose server that directly implements multidimensional data and operations.
Clients
Query and reporting tools.
Analysis tools
Data mining tools (e.g., trend analysis, prediction)
Data Warehouse vs. Data Marts
Enterprise warehouse: collects all information about subjects (customers, products, sales, assets, personnel) that span the entire organization.
Requires extensive business modeling
May take years to design and build
Data Marts: Departmental subsets that focus on selected subjects: Marketing data mart: customer, products, sales.
Faster roll out, but complex integration in the long run.
Virtual warehouse: views over operational dbs
Materialize some summary views for efficient query processing
Easier to build
Requisite excess capcaity on operational db servers
Design & Operational Process
Define architecture. Do capacity planning.
Integrate db and OLAP servers, storage and client tools.
Design warehouse schema, views.
Design physical warehouse organization: data placement, partitioning, access methods.
Connect sources: gateways, ODBC drivers, wrappers.
Design & implement scripts for data extract, load refresh.
Define metadata and populate repository.
Design & implement end-user applications.
Roll out warehouse and applications.
Monitor the warehouse.
Multidimensional Data Model
Database is a set of facts (points) in a multidimensional space
A fact has a measure dimension
quantity that is analyzed, e.g., sale, budget
A set of dimensions on which data is analyzed
e.g. , store, product, date associated with a sale amount
Dimensions form a sparsely populated coordinate system
Each dimension has a set of attributes
e.g., owner city and county of store
Attributes of a dimension may be related by partial order
Hierarchy: e.g., street > county >city
Lattice: e.g., date> month>year, date>week>year
DataWarehouse.ppt (Size: 1,020 KB / Downloads: 26)
Decision Support and OLAP
Information technology to help the knowledge worker (executive, manager, analyst) make faster and better decisions.
What were the sales volumes by region and product category for the last year?
How did the share price of computer manufacturers correlate with quarterly profits over the past 10 years?
Which orders should we fill to maximize revenues?
Will a 10% discount increase sales volume sufficiently?
Which of two new medications will result in the best outcome: higher recovery rate & shorter hospital stay?
On-Line Analytical Processing (OLAP) is an element of decision support systmes (DSS).
Why Separate Data Warehouse?
Performance
Op dbs designed & tuned for known txs & workloads.
Complex OLAP queries would degrade perf. For op txs.
Special data organization, access & implementation methods needed for multidimensional views & queries.
Three-Tier Architecture
Warehouse database server
Almost always a relational DBMS; rarely flat files
OLAP servers
Relational OLAP (ROLAP): extended relational DBMS that maps operations on multidimensional data to standard relational operations.
Multidimensional OLAP (MOLAP): special purpose server that directly implements multidimensional data and operations.
Clients
Query and reporting tools.
Analysis tools
Data mining tools (e.g., trend analysis, prediction)
Data Warehouse vs. Data Marts
Enterprise warehouse: collects all information about subjects (customers, products, sales, assets, personnel) that span the entire organization.
Requires extensive business modeling
May take years to design and build
Data Marts: Departmental subsets that focus on selected subjects: Marketing data mart: customer, products, sales.
Faster roll out, but complex integration in the long run.
Virtual warehouse: views over operational dbs
Materialize some summary views for efficient query processing
Easier to build
Requisite excess capcaity on operational db servers
Design & Operational Process
Define architecture. Do capacity planning.
Integrate db and OLAP servers, storage and client tools.
Design warehouse schema, views.
Design physical warehouse organization: data placement, partitioning, access methods.
Connect sources: gateways, ODBC drivers, wrappers.
Design & implement scripts for data extract, load refresh.
Define metadata and populate repository.
Design & implement end-user applications.
Roll out warehouse and applications.
Monitor the warehouse.
Multidimensional Data Model
Database is a set of facts (points) in a multidimensional space
A fact has a measure dimension
quantity that is analyzed, e.g., sale, budget
A set of dimensions on which data is analyzed
e.g. , store, product, date associated with a sale amount
Dimensions form a sparsely populated coordinate system
Each dimension has a set of attributes
e.g., owner city and county of store
Attributes of a dimension may be related by partial order
Hierarchy: e.g., street > county >city
Lattice: e.g., date> month>year, date>week>year