15-06-2012, 04:50 PM
DATABASE AND DATA WAREHOUSE
INTRODUCTION
Today organizations recognize the significant advantages and value that data warehousing can provide, both for pure analysis and as a complement to operational systems. Data warehouses exist in many forms, including enterprise-scale centralized monoliths, dependent and independent data marts, and operational data stores, but they all benefit from complete, consistent, and accurate data. An organization's overall data warehouse architecture can combine several of these forms; each organization must decide what is right for its own purposes and recognize that implementing a successful data warehousing environment is a continuous journey, not a one-time event. Whatever the choice, two things are certain: data integration and data quality will be key components of, if not the enabling technology for, the organization's data warehousing success. Data integration is an ongoing process that comes into play with each data load and with each subject area extension, and the quality of the data in the warehouse must be continually monitored to ensure its accuracy. Organizations that ignore these requirements risk building, instead of a data warehouse that benefits its users, a repository that delivers little business value. Many data warehousing vendors provide robust data integration and data quality solutions. In addition to developing and marketing products, these vendors have experience and expertise that they can share with their customers. As a result, an organization is often best served by deploying a commercial, fully supported and maintained set of tools rather than trying to develop and maintain such technology on its own.
OVERVIEW OF DATABASE
What’s a database? We can use pretty much anything as a database, as long as it allows us to store our data and retrieve it later. There are many different kinds of databases. Some allow us to store data and retrieve it years later; others are capable of preserving data only while there is an electricity supply. Some databases are designed for fast searches, others for fast insertions. Some databases are very easy to use, while some are very complicated (you may even have to learn a whole language to know how to operate them). There are also large price differences.
A database is a means of storing information in such a way that information can be retrieved from it. In simplest terms, a relational database is one that presents information in tables with rows and columns. A table is referred to as a relation in the sense that it is a collection of objects of the same type (rows). Data in a table can be related according to common keys or concepts, and the ability to retrieve related data from a table is the basis for the term relational database. A Database Management System (DBMS) handles the way data is stored, maintained, and retrieved. In the case of a relational database, a Relational Database Management System (RDBMS) performs these tasks.
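The idea of relating rows through a common key can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made up for illustration and do not come from any particular system.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# The common key customer_id relates rows in the two tables.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
conn.close()

print(rows)
```

Here SQLite plays the role of the RDBMS: it stores the two tables, and the join retrieves the related rows by matching the shared customer_id column.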
TYPES OF DATABASE
Databases can be of two types:
• Volatile database
• Non-volatile database
VOLATILE DATABASE
We use volatile databases all the time, even if we don't think of them as real databases. These databases are usually just part of the programs we run: their data lives in memory and is lost when the program stops or the power goes off.
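As a minimal sketch of such a volatile store, an ordinary in-memory dictionary already qualifies. The names sessions, store, and fetch below are hypothetical and chosen only for this example.

```python
# A plain dict acting as a volatile key-value store (hypothetical session cache).
# Everything in it disappears when the program exits.
sessions = {}

def store(key, value):
    """Save a value under the given key."""
    sessions[key] = value

def fetch(key, default=None):
    """Return the stored value, or default if the key is unknown."""
    return sessions.get(key, default)

store("user42", {"cart": ["book", "pen"]})
cart = fetch("user42")["cart"]
missing = fetch("user99")  # never stored, so the default (None) comes back

print(cart, missing)
```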
NON-VOLATILE DATABASE
Some information is so important that you cannot afford to lose it. Consider the name and password for authenticating users. If a person registers at a site that charges a subscription fee, it would be unfortunate if his subscription details were lost the next time the web server was restarted. In this case, the information must be stored in a non-volatile way, and that usually means on disk. Several options are available, ranging from flat files to DBM files to fully-fledged relational databases. Which one you choose will depend on a number of factors, including:
• The size of each record and the volume of the data to be stored
• The number of concurrent accesses (to the server or even to the same data)
• Data complexity (do all the records fit into one row, or are there relations between different kinds of record?)
• Budget (some database implementations are great but very expensive)
• Failover and backup strategies (how important it is to avoid downtime, how soon the data must be restored in the case of a system failure)
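The DBM-file option mentioned above can be sketched with Python's standard dbm module. This is only an illustration under stated assumptions: the file location, the key, and the record format are invented, and a real site would also have to address the concurrency and backup factors listed above.

```python
import dbm
import os
import tempfile

# Hypothetical location; a real application would use a fixed, backed-up path.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "subscriptions")

# First "run": create the file ('c' = create if missing) and store a record.
with dbm.open(path, "c") as db:
    db["alice"] = "plan=premium"

# Second "run": reopen the same file read-only; the record is still on disk.
with dbm.open(path, "r") as db:
    restored = db["alice"].decode("utf-8")  # dbm returns values as bytes

print(restored)
```

Because the data lives on disk rather than in process memory, the second open sees the record even though the first handle was closed, which is the property a web server restart would depend on.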
WHAT IS A DATA WAREHOUSE
Many organizations have successfully implemented data warehouses to analyze the data contained in their multiple operational systems and to compare current and historical values. By doing so, they can manage their business better and more profitably, analyze past efforts, and plan for the future. When properly deployed, data warehouses benefit the organization by significantly enhancing its decision-making capabilities, thus improving both its efficiency and effectiveness. However, the decisions facilitated by a data warehouse are only as good as the data it contains: this data must be accurate, consistent, and complete. For example, in order to determine its top ten customers, an organization must be able to aggregate sales across all of its sales channels and business units and recognize when the same customer is identified by multiple names, addresses, or customer numbers. In other words, the data used to determine the top ten customers must be integrated and of high quality. After all, if the data is incomplete or incorrect, so will be the results of any analysis performed upon it.
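The top-customers example can be made concrete with a short sketch. All of the data is invented for illustration, and the alias table is assumed to be the output of a separate matching or data quality step; top_customers is a hypothetical helper, not part of any product.

```python
from collections import defaultdict

# Hypothetical sales rows from different channels: (channel, customer label, amount).
sales = [
    ("web",    "ACME Corp.",       1200.0),
    ("retail", "Acme Corporation",  800.0),
    ("phone",  "Globex",           1500.0),
    ("web",    "Initech",           300.0),
    ("retail", "GLOBEX INC",        100.0),
]

# Hypothetical alias table mapping each label to one canonical customer.
canonical = {
    "ACME Corp.": "Acme",
    "Acme Corporation": "Acme",
    "Globex": "Globex",
    "GLOBEX INC": "Globex",
    "Initech": "Initech",
}

def top_customers(rows, aliases, n=10):
    """Sum sales per customer across channels and return the n largest."""
    totals = defaultdict(float)
    for _channel, label, amount in rows:
        # Labels without an alias entry are kept as they are.
        totals[aliases.get(label, label)] += amount
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

integrated = top_customers(sales, canonical)  # duplicates merged
naive = top_customers(sales, {})              # every label treated as distinct

print(integrated)
print(naive)
```

With the alias table applied, Acme ranks first with 2000.0 in combined sales; without it, the same sales are split across two labels and Globex appears to be the top customer, which is the kind of wrong answer that unintegrated data produces.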