12-09-2017, 12:23 PM
In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data in one place and are used to create analytical reports for knowledge workers across the enterprise.
Data stored in the warehouse is uploaded from operational systems (such as marketing or sales). The data may pass through an operational data store and may require data cleansing to ensure data quality before it is used in the DW for reporting.
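To make the cleansing step concrete, here is a minimal sketch in Python. The field names (customer, amount, date) and the three quality rules (non-empty name, numeric non-negative amount, no exact duplicates) are assumptions chosen for illustration only; a real warehouse would apply whatever rules its source systems require.

```python
def clean_records(records):
    """Return (clean, rejected) after simple, illustrative quality checks."""
    clean, rejected, seen = [], [], set()
    for rec in records:
        # Normalize the customer name (hypothetical field).
        name = (rec.get("customer") or "").strip().title()
        try:
            amount = float(rec.get("amount"))
        except (TypeError, ValueError):
            rejected.append(rec)  # amount is missing or not numeric
            continue
        if not name or amount < 0:
            rejected.append(rec)  # fails a basic validity rule
            continue
        key = (name, amount, rec.get("date"))
        if key in seen:
            continue  # drop exact duplicates after normalization
        seen.add(key)
        clean.append({"customer": name, "amount": amount, "date": rec.get("date")})
    return clean, rejected
```

Rejected rows are returned rather than silently discarded, so they can be reviewed or sent back to the source system.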
A typical data warehouse based on extract, transform, load (ETL) uses staging, data integration, and access layers to house its key functions. The staging layer, or staging database, stores the raw data extracted from each of the disparate source systems. The integration layer integrates the disparate data sets by transforming the data from the staging layer, often storing the transformed data in an operational data store (ODS) database. The integrated data is then moved to another database, often called the data warehouse database, where it is arranged into hierarchical groups, often called dimensions, and into facts and aggregate facts. The combination of facts and dimensions is sometimes called a star schema. The access layer helps users retrieve data.
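The layers described above can be sketched with Python's built-in sqlite3 module. This is only a toy illustration: the table names (staging_sales, dim_product, dim_date, fact_sales), the columns, and the helper functions are all hypothetical, and a single in-memory database stands in for what would normally be separate staging, ODS, and warehouse databases.

```python
import sqlite3

def build_warehouse(raw_rows):
    """Load raw rows through staging into a tiny star schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE staging_sales (product TEXT, sale_date TEXT, amount TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, sale_date TEXT UNIQUE);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    # Staging layer: store the extracted rows unchanged.
    con.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", raw_rows)
    # Integration step: normalize product names and fill the dimensions.
    con.execute("""
        INSERT OR IGNORE INTO dim_product (name)
        SELECT DISTINCT lower(trim(product)) FROM staging_sales
    """)
    con.execute("""
        INSERT OR IGNORE INTO dim_date (sale_date)
        SELECT DISTINCT sale_date FROM staging_sales
    """)
    # Warehouse: facts reference the dimensions by key (star schema).
    con.execute("""
        INSERT INTO fact_sales (product_id, date_id, amount)
        SELECT p.product_id, d.date_id, CAST(s.amount AS REAL)
        FROM staging_sales s
        JOIN dim_product p ON p.name = lower(trim(s.product))
        JOIN dim_date d ON d.sale_date = s.sale_date
    """)
    con.commit()
    return con

def sales_by_product(con):
    """Access layer: aggregate the facts along the product dimension."""
    return con.execute("""
        SELECT p.name, SUM(f.amount)
        FROM fact_sales f JOIN dim_product p USING (product_id)
        GROUP BY p.name ORDER BY p.name
    """).fetchall()
```

Calling build_warehouse with a few (product, date, amount) tuples and then sales_by_product on the returned connection gives one total per product. The fact table holds only keys and measures, while the descriptive attributes live in the dimension tables, which is the arrangement the star schema refers to.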
The source data is cleansed, transformed, cataloged, and made available to managers and other business professionals for data mining, online analytical processing, market research, and decision support. However, the means to retrieve and analyze data, to extract, transform, and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. An expanded definition of data warehousing therefore includes business intelligence tools, tools for extracting, transforming, and loading data into the repository, and tools for managing and retrieving metadata.