14-11-2012, 01:16 PM
CloudTPS: Scalable Transactions for Web Applications in the Cloud
INTRODUCTION
CLOUD computing offers the vision of a virtually infinite
pool of computing, storage and networking resources
where applications can be scalably deployed [1].
In particular, NoSQL cloud database services such as
Amazon’s SimpleDB [2] and Google’s Bigtable [3] offer
a scalable data tier for applications in the cloud. These
systems typically partition the application data to provide
incremental scalability, and replicate the partitioned
data to tolerate server failures.
The scalability and high-availability properties of
Cloud platforms, however, come at a cost. First, these
scalable database services allow data query only by
primary key rather than supporting secondary-key or
join queries. Second, these services provide only weak
consistency such as eventual data consistency: any data
update becomes visible after a finite but nondeterministic
amount of time. As weak as this consistency property
may seem, it does allow building a wide range of useful
applications, as demonstrated by the commercial success
of Cloud computing platforms. However, many other
applications such as payment and online auction services
cannot afford any data inconsistency. While primary-key-only
data access is a relatively minor inconvenience
that can often be accommodated by good data structures,
it is essential to provide transactional data consistency to
support the applications that need it.
RELATED WORK
Data Storage in the Cloud
The simplest way to store structured data in the cloud is
to deploy a relational database such as MySQL or Oracle.
1. Our prototype is available at
IEEE TRANSACTIONS ON SERVICES COMPUTING, SPECIAL ISSUE ON CLOUD COMPUTING, 2011
The relational data model, typically implemented via
the SQL language, provides great flexibility in accessing
data. It supports sophisticated data access operations
such as aggregation, range queries, join queries, etc.
RDBMSs support transactions and guarantee strong data
consistency. One can easily deploy a classical RDBMS
in the cloud and thus get support for transactional
consistency. However, the flexible query language and
strong data consistency prevent one from partitioning
data automatically, which is the key to performance
scalability. These systems rely on replication techniques
and therefore do not bring extra scalability improvement
compared to a non-cloud deployment [15], [16].
SYSTEM MODEL
the organization of CloudTPS. Clients
issue HTTP requests to a Web application, which in turn
issues transactions to a Transaction Processing System
(TPS). The TPS is composed of any number of LTMs,
each of which is responsible for a subset of all data
items. The Web application can submit a transaction
to any LTM that is responsible for one of the accessed
data items. This LTM then acts as the coordinator of the
transaction across all LTMs in charge of the data items
accessed by the transaction. The LTMs operate on an in-memory
copy of the data items loaded from the cloud
storage service. Data updates resulting from transactions
are kept in the memory of the LTMs. To prevent data loss
due to LTM server failures, the data updates are replicated
to multiple LTM servers. LTMs also periodically
checkpoint the updates back to the cloud storage service,
which is assumed to be highly available and persistent.
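The division of labor described above can be illustrated with a minimal
Python sketch. It is not the CloudTPS implementation: the names CloudStore,
LTM and responsible_ltm are hypothetical, the cloud storage service is
replaced by a plain dictionary, and the hash-based assignment of data items
to LTMs is an assumption made only for illustration.

```python
import hashlib


class CloudStore:
    """Stand-in for the cloud storage service (a plain dict in this sketch)."""

    def __init__(self):
        self.items = {}


class LTM:
    """Hypothetical local transaction manager working on an in-memory copy."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.memory = {}    # in-memory copy of the data items loaded so far
        self.dirty = set()  # keys updated since the last checkpoint

    def read(self, key):
        # Load the item from the cloud storage service on first access.
        if key not in self.memory:
            self.memory[key] = self.store.items.get(key)
        return self.memory[key]

    def write(self, key, value):
        # Updates stay in memory until the next checkpoint.
        self.memory[key] = value
        self.dirty.add(key)

    def checkpoint(self):
        # Write the pending updates back to the cloud storage service.
        for key in self.dirty:
            self.store.items[key] = self.memory[key]
        self.dirty.clear()


def responsible_ltm(key, ltms):
    """Pick the LTM in charge of a key (assumed scheme: hash modulo count)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return ltms[int.from_bytes(digest[:8], "big") % len(ltms)]
```

In this sketch, a value written through an LTM is visible in the cloud
storage stand-in only after checkpoint() has been called. Replication of
updates to other LTM servers is left out.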
We implement transactions using the 2-Phase Commit
protocol (2PC). In the first phase, the coordinator contacts
all involved LTMs and asks them to check that the
operation can indeed be executed correctly. If all LTMs
vote favorably, then the second phase actually commits
the transaction. Otherwise, the transaction is aborted.
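The two phases can be sketched as follows. This is a single-process
illustration under stated assumptions, not the protocol code of CloudTPS:
the LTM class, its prepare/commit/abort methods and the run_transaction
helper are hypothetical names, and the validity check (no negative values)
is a toy rule standing in for whatever an LTM actually verifies.

```python
from dataclasses import dataclass, field


@dataclass
class LTM:
    """Hypothetical LTM holding a slice of the data in memory."""
    name: str
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, txn_id, updates):
        # Phase 1: vote. Toy rule: reject updates that contain a negative value.
        if any(value < 0 for value in updates.values()):
            return False
        self.staged[txn_id] = dict(updates)
        return True

    def commit(self, txn_id):
        # Phase 2: apply the updates staged during phase 1.
        self.data.update(self.staged.pop(txn_id))

    def abort(self, txn_id):
        # Discard staged updates, if this LTM staged any.
        self.staged.pop(txn_id, None)


def run_transaction(txn_id, updates_by_ltm):
    """Coordinator logic; updates_by_ltm is a list of (LTM, updates) pairs."""
    # Phase 1: collect a vote from every involved LTM.
    votes = [ltm.prepare(txn_id, updates) for ltm, updates in updates_by_ltm]
    if all(votes):
        # Phase 2: every LTM voted favorably, so commit everywhere.
        for ltm, _ in updates_by_ltm:
            ltm.commit(txn_id)
        return "COMMIT"
    # At least one unfavorable vote: abort everywhere.
    for ltm, _ in updates_by_ltm:
        ltm.abort(txn_id)
    return "ABORT"
```

A transaction whose updates pass the check on every LTM returns "COMMIT"
and changes the data on all of them; a single unfavorable vote returns
"ABORT" and leaves every LTM unchanged.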
SYSTEM DESIGN
We now detail the design of the TPS to guarantee the
Atomicity, Consistency, Isolation and Durability properties.
Each of the properties is discussed individually. We
then discuss the membership mechanisms to guarantee
the ACID properties even in case of LTM failures and
network partitions.
Atomicity
The Atomicity property requires that either all operations
of a transaction complete successfully, or none of
them does. To ensure Atomicity, for each transaction
issued, CloudTPS performs two-phase commit (2PC)
across all the LTMs responsible for the data items
accessed. As soon as an agreement to “COMMIT” is
reached, the transaction coordinator can simultaneously
return the result to the web application and complete
the second phase [40].
To ensure Atomicity in the presence of server failures,
all transaction states and data items should be replicated
to one or more LTMs. LTMs replicate the data items to
the backup LTMs during the second phase of the transaction.
Thus when the second phase completes successfully,
all replicas of the accessed data items are consistent.
The transaction state includes the transaction timestamp
(discussed in Section 4.3), the agreement to “COMMIT,”
and the list of data updates to be committed.
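A possible shape for the replicated transaction state is sketched below.
The three fields follow the description above; everything else is an
assumption for illustration: the names TransactionState, Replica,
replicate_state and complete_second_phase are hypothetical, the timestamp
is a plain integer, and the point at which the state is copied to the
backups is chosen arbitrarily.

```python
from dataclasses import dataclass, field


@dataclass
class TransactionState:
    timestamp: int    # transaction timestamp (a plain integer in this sketch)
    committed: bool   # whether the agreement to "COMMIT" was reached
    updates: dict     # data updates to be committed


@dataclass
class Replica:
    """Hypothetical LTM replica: pending transaction states and data items."""
    states: dict = field(default_factory=dict)
    data: dict = field(default_factory=dict)


def replicate_state(txn_id, state, backups):
    # Copy the transaction state to every backup so it survives a failure.
    for backup in backups:
        backup.states[txn_id] = state


def complete_second_phase(txn_id, primary, backups):
    # Apply the updates on the primary and on every backup, then drop the
    # pending state; afterwards all replicas hold the same data items.
    state = primary.states[txn_id]
    for replica in [primary, *backups]:
        replica.data.update(state.updates)
        replica.states.pop(txn_id, None)
```

Once complete_second_phase returns, the primary and all backups hold
identical values for the accessed data items, which mirrors the consistency
claim made for a successful second phase.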