07-11-2012, 01:10 PM
The Data Warehouse
Stores, Warehouses and Marts
A data warehouse is a collection of integrated databases designed to support a decision support system (DSS).
An operational data store (ODS) stores data for a specific application. It feeds the data warehouse a stream of desired raw data.
A data mart is a lower-cost, scaled-down version of a data warehouse, usually designed to support a small group of users (rather than the entire firm).
Metadata are the information kept about the warehouse.
Data Warehousing -- It is a process
A technique for assembling and managing data from various sources for the purpose of answering business questions, which enables decisions that were not previously possible
A decision support database maintained separately from the organization’s operational database
Characteristics of a Data Warehouse
Subject oriented – organized based on use
Integrated – inconsistencies removed
Nonvolatile – stored in read-only format
Time variant – data are normally time series
Summarized – in decision-usable format
Large volume – data sets are quite large
Metadata – data about data are stored
Data sources – data come from nonintegrated sources
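A few of the characteristics above (nonvolatile, time variant, summarized) can be illustrated with a small sketch. This is only a toy model under stated assumptions: the class name WarehouseTable, its methods, and the sample figures are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from datetime import date


class WarehouseTable:
    """Toy append-only table (hypothetical, for illustration only)."""

    def __init__(self):
        self._rows = []

    def load(self, as_of: date, subject: str, value: float) -> None:
        # Time variant: every row carries the date it refers to.
        self._rows.append((as_of, subject, value))

    @property
    def rows(self) -> tuple:
        # Nonvolatile: readers get an immutable copy, never the live list.
        return tuple(self._rows)

    def summarize(self) -> dict:
        # Summarized: totals per (subject, year), a decision-usable view.
        totals = defaultdict(float)
        for as_of, subject, value in self._rows:
            totals[(subject, as_of.year)] += value
        return dict(totals)


table = WarehouseTable()
table.load(date(2011, 3, 1), "sales", 100.0)
table.load(date(2011, 9, 1), "sales", 150.0)
table.load(date(2012, 1, 15), "sales", 75.0)
print(table.summarize())
```

Existing rows are never updated or deleted; new data are only appended with their own dates, which is what lets the summary be broken down by period.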
Data Have Data -- The Metadata
The name suggests some high-level technological concept, but it really is fairly simple. Metadata are “data about data”.
With the emergence of the data warehouse as a decision support structure, the metadata are considered as much a resource as the business data they describe.
Metadata are abstractions -- they are high level data that provide concise descriptions of lower-level data.
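As a rough illustration of "data about data", a metadata catalog might hold one short record per warehouse column. The sketch below is an assumption for illustration only: the ColumnMetadata fields, the table and column names, and the describe helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Describes one warehouse column; every field name here is hypothetical."""
    table: str
    column: str
    data_type: str
    source_system: str
    description: str


# A tiny catalog: each entry describes the data, it is not the data itself.
catalog = [
    ColumnMetadata("sales_fact", "amount", "decimal(12,2)", "orders_ods",
                   "Sale value in USD"),
    ColumnMetadata("sales_fact", "sale_date", "date", "orders_ods",
                   "Calendar date of the sale"),
]


def describe(table: str, column: str) -> str:
    """Return a concise description of a column, looked up in the catalog."""
    for entry in catalog:
        if entry.table == table and entry.column == column:
            return (f"{entry.table}.{entry.column} ({entry.data_type}) "
                    f"from {entry.source_system}: {entry.description}")
    raise KeyError(f"no metadata for {table}.{column}")


print(describe("sales_fact", "amount"))
```

An analyst can read the catalog to learn what a column means and where it came from without looking at a single business record.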
The Need for Consistency in the Metadata
The data warehouse is set up for the benefit of business analysts and executives across all functional areas.
In their individual databases, the different areas may define and store data according to their own version of the “truth”.
When data are retrieved from these different areas and placed in the warehouse, the transformation and cleansing process ensures that there is a single, integrated “truth” at the organizational level.
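The transformation and cleansing step can be sketched as a mapping from each area's local codes to one agreed warehouse code. This is a minimal sketch, not a real ETL tool: the gender-code example, the GENDER_CODES table, and the cleanse function are hypothetical and chosen only to show the idea.

```python
# Hypothetical mapping from department-specific codes to one warehouse code.
GENDER_CODES = {
    "m": "M", "male": "M", "1": "M",
    "f": "F", "female": "F", "0": "F",
}


def cleanse(record: dict) -> dict:
    """Convert one source record to the warehouse's single representation."""
    raw = str(record["gender"]).strip().lower()
    if raw not in GENDER_CODES:
        # Reject unknown codes rather than guess at their meaning.
        raise ValueError(f"unrecognized gender code: {record['gender']!r}")
    return {
        "customer_id": int(record["customer_id"]),
        "gender": GENDER_CODES[raw],
    }


# Two areas storing the same kind of fact in their own way.
sales_rows = [{"customer_id": "17", "gender": "male"}]
billing_rows = [{"customer_id": 42, "gender": 0}]

warehouse_rows = [cleanse(r) for r in sales_rows + billing_rows]
print(warehouse_rows)
```

After cleansing, both rows use the same types and the same code set, so a query against the warehouse sees one consistent definition regardless of which area supplied the data.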