01-07-2013, 04:13 PM
Complex ETL Testing: A Strategic Approach
Complex ETL.pdf (Size: 711.57 KB / Downloads: 90)
Data Warehouse Testing
A data warehouse is a repository of transactional data that has been extracted from original electronic sources and transformed so that query, analysis and reporting on trends within historic data are possible and efficient. The analyses provided by data warehouses may involve strategic planning, decision support, and monitoring the outcomes of a chosen strategy. Typically, data that is loaded into a data warehouse is derived from diverse sources of operational data, which may consist of data from databases, feeds, application files or flat files. The data must be extracted from these diverse sources, transformed to a common format, and loaded into the data warehouse. Typically, it is further aggregated into a data mart for efficient reporting. The ETL (Extract, transform and load) process is a critical step in any data warehouse implementation, and continues to be an area of major significance whenever the ETL code is updated. Once the data warehouse and data marts are populated, business intelligence applications facilitate querying, analysis and reporting. The business intelligence tools may provide simple presentations of data based on queries, or may support sophisticated statistical analysis options. Data warehouses may have multiple front-end applications, depending on the needs of the user community. Figure 1 shows a simplified data warehouse scheme.
Data Verification
The recommended pre-deployment strategy is to build test automation (both functional and performance) for every test entry point in the system (feeds, databases, internal messaging, front-end transactions). The goal of the strategy is to provide automated tools for rapid localization of issues between test entry points (see Figure 1).
In Figure 1, data from a variety of sources is transformed by the ETL into the Data Warehouse. A second ETL “leg” aggregates data from the Data Warehouse into Datamart tables for efficient reporting. Front-end applications and Business Intelligence applications access the Datamart in order to provide historical and statistical analyses of company data.