31-07-2013, 12:48 PM
Implementation of Heterogeneous and Homogenous Distributed Databases
Abstract
A homogeneous distributed real-time replicated database system is a network of two or more DBMSs that reside on one or more machines. Homogeneous Distributed Database Systems (HDDBS), which connect two or more databases, create particular problems when distributed and replicated databases are accessed. In particular, access control and transaction management in HDDBS require different mechanisms to monitor data retrieval and updates to databases. Current trends in multi-tier client/server networks make DDBS an appropriate solution for providing access to, and control over, localized databases. A common problem in most large corporations today is the diversity of database systems employed by their many departments in the development of a product. Usually, the total corporate data resource is characterized by multi-vendor database servers which, unfortunately, have no ability to relate data from heterogeneous data sources. The approach described here is a database access interface that allows users to formulate SQL2 queries in a homogeneous way against a federation of heterogeneous databases. The database heterogeneity is completely hidden from the user, who instead perceives a global database schema that can be queried as though all data resided in a single local database when, in fact, most of the data are distributed over heterogeneous, autonomous, and remote data sources. Further, users can navigate through the database complex and compare, join, and relate information via a single graphical interface.
INTRODUCTION
A real-time system is one that must process information and produce a response within a specified time, or else risk severe consequences, including failure. That is, in a system with a real-time constraint, a correct action or answer that arrives after the deadline is of no use: the result is either delivered by the deadline or it is worthless. Database replication based on group communication systems has been proposed as an efficient and flexible solution for data replication. A distributed database system is a technique used to solve a single problem across a heterogeneous computer network. A major issue in building a distributed database system is transaction atomicity. When a transaction runs across two sites, one site may commit while the other fails, leaving the transaction in an inconsistent state. The two-phase commit protocol is widely used to solve this problem. The choice of commit protocol is an important design decision for a distributed database system. A commit protocol for distributed transactions should ensure that all participating sites agree on the final outcome, which may be either a commit or an abort. Many real-time database applications are distributed in nature [1,11,12]. These include aircraft control, stock trading, network management, and factory automation. The real-time performance of a real-time distributed database system (RTDBS) depends on several factors, such as the database system architecture, the underlying processor, and disk speed.
DISTRIBUTED DATABASE SYSTEMS (DDBS)
Distributed database systems (DDBS) are systems that have their data distributed and replicated over several locations, unlike a centralized database system (CDBS), where one copy of the data is stored. Data may be distributed over a network using horizontal and vertical fragmentation, which are similar to the selection and projection operations, respectively, in Structured Query Language (SQL). Both types of database share the same problems of access control and transaction management, such as concurrent user access control and deadlock detection and resolution. A DDBS, however, must also cope with additional problems. Access control and transaction management in DDBS require different rules to monitor data retrieval and updates to distributed and replicated databases [2,7]. Oracle, a leading Database Management System (DBMS), employs the two-phase commit technique to maintain a consistent state for the databases. The objective of this paper is to explain transaction management in DDBMS and how to implement this technique. To assist in understanding this process, an example is given in the last section. It is hoped that this understanding will encourage organizations to use, and academics to discuss, DDBS and to capitalize successfully on this database feature. The next section presents advantages, disadvantages, and failures in distributed database systems. Subsequent sections discuss the fundamentals of transaction management, two-phase commit, the homogeneous distributed database system implementation of two-phase commit, and, finally, an example of how the two-phase commit works.
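The fragmentation idea mentioned above can be illustrated with a minimal Python sketch. It assumes a hypothetical EMPLOYEE relation (the column names, rows, and helper functions are invented for illustration) and treats horizontal fragmentation as a selection of whole rows and vertical fragmentation as a projection of columns that always keeps the key, so that the original rows can be rebuilt.

```python
# Hypothetical EMPLOYEE relation; column names and rows are illustrative only.
employees = [
    {"emp_id": 1, "name": "Asha", "site": "HQ", "salary": 5200},
    {"emp_id": 2, "name": "Ravi", "site": "MFG", "salary": 4100},
    {"emp_id": 3, "name": "Lena", "site": "MFG", "salary": 4300},
]


def horizontal_fragment(rows, predicate):
    """Selection: keep whole rows that satisfy the predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, columns, key="emp_id"):
    """Projection: keep only some columns, always including the key."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: r[c] for c in keep} for r in rows]


def rejoin_vertical(left, right, key="emp_id"):
    """Rebuild full rows from two vertical fragments by joining on the key."""
    by_key = {r[key]: r for r in right}
    return [{**l, **by_key[l[key]]} for l in left]


# Horizontal fragments: each site stores only its own rows.
mfg_rows = horizontal_fragment(employees, lambda r: r["site"] == "MFG")
hq_rows = horizontal_fragment(employees, lambda r: r["site"] == "HQ")

# Vertical fragments: payroll data is kept apart from the public columns.
public_part = vertical_fragment(employees, ["name", "site"])
payroll_part = vertical_fragment(employees, ["salary"])
```

The union of the horizontal fragments, or the key join of the vertical fragments, gives back the original relation, which is the property a fragmentation scheme is normally expected to preserve.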
Advantages of Distributed Database Systems (DDBS)
Since organizations tend to be geographically dispersed, a DDBS fits the organizational structure better than a traditional centralized DBS. Two benefits stand out: improved availability, because a failure does not make the entire system inoperable, and improved reliability, because data may be replicated. Each location has its local data as well as the ability to get needed data from other locations via a communication network. Moreover, the failure of a server at one site won't render the distributed database system inaccessible; only the affected site is directly involved with the failed server. In addition, if data is required from a site exhibiting a failure, it may be retrieved from other locations holding the replicated data. System performance improves, since several machines share the CPU and I/O load. Expansion of the distributed system is also relatively easy, since adding a new location doesn't affect the existing ones.
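The availability argument above (data at a failed site can still be read from another location that holds a replica) can be sketched as follows. This is only an illustration under simple assumptions: the `Site` class, the `SiteDown` exception, and the `read_with_failover` helper are hypothetical names, and a site failure is simulated with a flag rather than a real network error.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached (stand-in for a server or network failure)."""


class Site:
    """A hypothetical database site holding a replica of some key/value data."""

    def __init__(self, name, data, up=True):
        self.name = name
        self.data = dict(data)
        self.up = up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


def read_with_failover(sites, key):
    """Try each replica in order; return (value, name of the site that answered)."""
    for site in sites:
        try:
            return site.read(key), site.name
        except SiteDown:
            continue  # this site has failed, try the next replica
    raise SiteDown("no replica available for " + repr(key))


# HQ has failed, but SALES holds a replica of the same item.
hq = Site("HQ", {"part-42": "gearbox"}, up=False)
sales = Site("SALES", {"part-42": "gearbox"})
value, served_by = read_with_failover([hq, sales], "part-42")
```

Here the read succeeds and is served by SALES even though HQ is down. Keeping the replicas consistent on update is a separate problem, which is where the commit protocol discussed later comes in.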
Homogeneous DDBMS
In a homogeneous distributed database, all sites have identical software, are aware of each other, and agree to cooperate in processing user requests. Each site surrenders part of its autonomy in terms of its right to change schema or software. A homogeneous DDBMS appears to the user as a single system, and it is much easier to design and manage. The following conditions must be satisfied for a homogeneous database: the operating system used at each location must be the same or compatible; the data structures used at each location must be the same or compatible; and the database application (or DBMS) used at each location must be the same or compatible.
A homogeneous distributed database system is a network of two or more Oracle databases that reside on one or more machines. In the accompanying figure, a distributed system connects three databases: HQ, MFG, and SALES. An application can simultaneously access or modify the data in several databases in a single distributed environment. For example, a single query from a Manufacturing client on the local database MFG can retrieve joined data from the PRODUCTS table on the local database and the DEPT table on the remote HQ database.
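The MFG/HQ query described above can be imitated on a single machine with Python's built-in `sqlite3` module. This is not Oracle syntax and not Oracle's mechanism: SQLite's `ATTACH DATABASE` is swapped in as a stand-in for a remote database, and the column names and rows are assumptions made up for the example. It only shows the shape of one query that joins a local table with a table owned by another database.

```python
import sqlite3

# Local MFG database; the attached in-memory database stands in for remote HQ.
mfg = sqlite3.connect(":memory:")
mfg.execute("ATTACH DATABASE ':memory:' AS hq")

# PRODUCTS lives in the local database, DEPT in the "remote" one.
# Column names are hypothetical.
mfg.execute(
    "CREATE TABLE products (product_id INTEGER, product_name TEXT, dept_id INTEGER)"
)
mfg.execute("CREATE TABLE hq.dept (dept_id INTEGER, dept_name TEXT)")
mfg.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "Chassis", 10), (2, "Door panel", 20)],
)
mfg.executemany(
    "INSERT INTO hq.dept VALUES (?, ?)",
    [(10, "Assembly"), (20, "Paint")],
)

# One query, two databases: the client does not fetch and merge the tables itself.
rows = mfg.execute(
    """
    SELECT p.product_name, d.dept_name
    FROM products AS p
    JOIN hq.dept AS d ON p.dept_id = d.dept_id
    ORDER BY p.product_id
    """
).fetchall()
mfg.close()
```

In a real distributed deployment the second database would sit on another machine and the DBMS would ship part of the query over the network; the sketch leaves that out entirely.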
HETEROGENEOUS DISTRIBUTED DATABASE CAPABILITIES
Different types of capabilities can be provided by heterogeneous distributed database systems. They include schema integration, distributed query processing, distributed transaction management, administrative functions, and coping with different types of heterogeneity. Schema integration has to do with the way in which users can logically view the distributed data. Distributed query processing deals with the analysis, optimization, and execution of queries that reference distributed data. Distributed transaction management deals with the atomicity, isolation, and durability of transactions in a distributed system. Administrative functions include such things as authentication and authorization, defining and enforcing semantic constraints on the data, and management of data dictionaries and directories. Heterogeneity can include differences in hardware, operating systems, communications links, database management system (DBMS) vendors, and/or data models. These are all important aspects of distributed data management. In considering them, it is important to recognize that there is no “ideal” set of capabilities for all environments or applications. A particular capability may be invaluable in certain situations while being totally unsuitable in others.
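To make the schema integration capability described above more concrete, here is a small Python sketch of a global schema laid over two sources with different local formats. Everything in it is hypothetical: the two sources, their field names, the `GLOBAL_COLUMNS` tuple, and the wrapper functions are assumptions chosen for illustration, and a real federated system would also have to handle query optimization, transactions, and access control.

```python
# Two hypothetical sources that store the same kind of fact in different local formats.
sales_rows = [("C-1", "Acme Ltd", "Pune")]  # relational-style tuples
support_docs = [  # document-style records with different field names
    {"customerRef": "C-2", "title": "Bolt Co", "town": "Oslo"},
]

# The global schema that users see, regardless of where a record is stored.
GLOBAL_COLUMNS = ("customer_id", "name", "city")


def from_sales(row):
    """Wrapper: map a sales tuple onto the global schema."""
    return dict(zip(GLOBAL_COLUMNS, row))


def from_support(doc):
    """Wrapper: map a support record onto the global schema."""
    return {
        "customer_id": doc["customerRef"],
        "name": doc["title"],
        "city": doc["town"],
    }


def global_customers():
    """Integrated view: yields every customer in global-schema form."""
    for row in sales_rows:
        yield from_sales(row)
    for doc in support_docs:
        yield from_support(doc)


def query(predicate):
    """Run one filter against the global view, hiding the source formats."""
    return [c for c in global_customers() if predicate(c)]


in_oslo = query(lambda c: c["city"] == "Oslo")
```

The caller filters on `city` without knowing that one source calls the field `town`; that mapping lives only in the wrappers, which is the sense in which the heterogeneity is hidden from the user.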
TWO-PHASE COMMIT PROTOCOL
The two-phase commit (2PC) protocol is a distributed algorithm that ensures the consistent termination of a transaction in a distributed environment. Via 2PC, a unanimous decision on whether to commit or abort a given transaction is reached and enforced among multiple participating servers, thereby guaranteeing atomicity. The protocol proceeds in two phases, the prepare phase and the commit phase, which explains its name. It is executed by a coordinator process (also called the master), while the participating servers are called participants (or cohorts). When the transaction's initiator issues a request to commit the transaction, the coordinator starts the first phase by querying all participants, via prepare messages, whether to abort or commit the transaction. Concretely, the master sends PREPARE (to commit) messages in parallel to all the cohorts. Each cohort that is ready to commit first force-writes a prepare log record to its local stable storage and then sends a YES vote to the master. At this stage, the cohort has entered a prepared state in which it cannot unilaterally commit or abort the transaction but must wait for the final decision from the master. On the other hand, each cohort that decides to abort force-writes an abort log record and sends a NO vote to the master. Since a NO vote acts like a veto, the cohort is permitted to unilaterally abort the transaction without waiting for a response from the master. After the master receives the votes from all the cohorts, it initiates the second phase of the protocol. If all the votes are YES, it moves to a committing state by force-writing a commit log record and sending COMMIT messages to all the cohorts. If any vote is NO, the master instead decides to abort and sends ABORT messages to the cohorts that voted YES.
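The message flow above can be condensed into a short in-memory Python sketch. It is a simplification under several assumptions: the `Cohort` and `Master` classes are hypothetical, plain lists stand in for force-written log records on stable storage, the PREPARE messages are sent one after another rather than in parallel, and timeouts, acknowledgements, and crash recovery are omitted. The abort branch follows the standard form of the protocol.

```python
class Cohort:
    """A participant site. `can_commit` simulates its local decision."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # stand-in for force-written records on stable storage
        self.state = "active"

    def prepare(self):
        """Phase 1: vote YES or NO in response to PREPARE."""
        if self.can_commit:
            self.log.append("prepare")
            self.state = "prepared"  # must now wait for the master's decision
            return "YES"
        self.log.append("abort")
        self.state = "aborted"  # a NO vote lets the cohort abort on its own
        return "NO"

    def finish(self, decision):
        """Phase 2: apply the master's decision if still waiting for it."""
        if self.state == "prepared":
            self.log.append(decision)
            self.state = "committed" if decision == "commit" else "aborted"


class Master:
    """The coordinator that collects votes and broadcasts the outcome."""

    def __init__(self, cohorts):
        self.cohorts = cohorts
        self.log = []

    def run(self):
        votes = [c.prepare() for c in self.cohorts]  # phase 1
        decision = "commit" if all(v == "YES" for v in votes) else "abort"
        self.log.append(decision)  # decision is logged before it is sent
        for c in self.cohorts:  # phase 2
            c.finish(decision)
        return decision


# All three sites are ready, so the transaction commits everywhere.
ok = Master([Cohort("HQ"), Cohort("MFG"), Cohort("SALES")]).run()

# One NO vote vetoes the transaction at every site.
mixed = [Cohort("HQ"), Cohort("MFG", can_commit=False), Cohort("SALES")]
failed = Master(mixed).run()
```

In the second run, MFG logs only an abort record, while HQ and SALES log a prepare record followed by the abort decision, so no site ends up committed while another is aborted.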
CONCLUSIONS
Transaction management is by now a mature research topic in distributed database management systems (DDBMS). Our replication proposal, based on Homogeneous Distributed Database Systems, inherits the best characteristics of the database systems and reduces communication traffic. Oracle was the first commercial DBMS to implement a method of transaction management: the two-phase commit. Though it was very difficult to obtain information on the homogeneous DBMS implementation of this method, we were able to gather sufficient information to write a homogeneous transaction for the database system. Many organizations do not implement distributed databases because of their difficulty; they simply resort to centralized databases. However, with global organizations and multi-tier network architectures, distributed implementation becomes a necessity. It is hoped that this paper will assist organizations in implementing distributed databases when installing a homogeneous DBMS, or encourage organizations to migrate from centralized to distributed DBMS. Organizations could also contribute to this process by having graduates with knowledge of homogeneous DBMS capabilities. With DBMS vendors putting so much effort into incorporating this and other advanced features in their database software, academics should also play a major role in exposing their beneficiaries to these advanced transaction management features.