24-09-2013, 04:00 PM
Going Back and Forth: Efficient Multideployment and Multisnapshotting on Clouds
ABSTRACT
Infrastructure as a Service (IaaS) cloud computing has
revolutionized the way we think of acquiring resources by
introducing a simple change: allowing users to lease
computational resources from the cloud provider’s datacenter for a
short time by deploying virtual machines (VMs) on these
resources. This new model raises new challenges in the design
and development of IaaS middleware. One of those challenges
is the need to deploy a large number (hundreds or even
thousands) of VM instances simultaneously. Once the VM
instances are deployed, another challenge is to simultaneously
take a snapshot of many images and transfer them to persistent
storage to support management tasks, such as suspend-resume
and migration. With datacenters growing rapidly and
configurations becoming heterogeneous, it is important to enable
efficient concurrent deployment and snapshotting that are at
the same time hypervisor independent and ensure maximum
compatibility with different configurations. This paper
addresses these challenges by proposing a virtual file system
specifically optimized for virtual machine image storage. It is
based on a lazy transfer scheme coupled with object versioning
that handles snapshotting transparently in a
hypervisor-independent fashion, ensuring high portability for
different configurations. Large-scale experiments on hundreds
of nodes demonstrate excellent performance results: speedup
for concurrent VM deployments ranges from a factor of 2 up to
25, with a reduction in bandwidth utilization of as much as 90%.
INTRODUCTION
In recent years, Infrastructure as a Service (IaaS) cloud
computing [30] has emerged as a viable alternative to the
acquisition and management of physical resources. With
IaaS, users can lease storage and computation time from
large datacenters. Leasing of computation time is
accomplished by allowing users to deploy virtual machines (VMs)
on the datacenter’s resources. Since the user has complete
control over the configuration of the VMs using on-demand
deployments [5, 16], IaaS leasing is equivalent to purchasing
dedicated hardware but without the long-term commitment
and cost. The on-demand nature of IaaS is critical to making
such leases attractive, since it enables users to expand
or shrink their resources according to their computational
needs, by using external resources to complement their local
resource base [20].
This emerging model leads to new challenges relating to
the design and development of IaaS systems. One of the
commonly occurring patterns in the operation of IaaS is the
need to deploy a large number of VMs on many nodes of a
datacenter at the same time, starting from a set of VM images
previously stored in a persistent fashion. For example,
this pattern occurs when the user wants to deploy a virtual
cluster that executes a distributed application or a set of
environments to support a workflow. We refer to this pattern
as multideployment.
Applicability in the cloud
The simplified architecture of a cloud that integrates our
approach is depicted in Figure 1. The typical elements found
in the cloud are illustrated with a light background, while
the elements that are part of our proposal are highlighted by
a darker background. A distributed versioning storage service
that supports cloning and shadowing is deployed on the
compute nodes and consolidates parts of their local disks
into a common storage pool. The cloud client has direct
access to the storage service and is allowed to upload and
download images from it. Every uploaded image is
automatically striped. Furthermore, the cloud client interacts with
the cloud middleware through a control API that enables a
variety of management tasks, including deploying an image
on a set of compute nodes, dynamically adding or removing
compute nodes from that set, and snapshotting individual
VM instances or the whole set. The cloud middleware in
turn coordinates the compute nodes to achieve the
aforementioned management tasks. Each compute node runs a
hypervisor that is responsible for running the VMs. The
reads and writes of the hypervisor are trapped by the
mirroring module, which is responsible for on-demand mirroring
and snapshotting (as explained in Section 3.1) and relies on
both the local disk and the distributed versioning storage
service to do so.
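The statement that every uploaded image is automatically striped
can be illustrated with a minimal sketch. The text does not say
how striping is done, so everything here is an assumption made
for illustration: the fixed chunk size, the round-robin placement,
and the names stripe, reassemble and CHUNK_SIZE are hypothetical
and do not describe the actual storage service.

```python
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # bytes; hypothetical and tiny so the example is easy to follow


def stripe(image: bytes, nodes: List[str],
           chunk_size: int = CHUNK_SIZE) -> Dict[str, List[Tuple[int, bytes]]]:
    """Split an image into fixed-size chunks and place them round-robin.

    Returns a mapping: node name -> list of (chunk_index, chunk_bytes).
    Round-robin placement is an assumption, not the paper's stated policy.
    """
    if not nodes:
        raise ValueError("at least one storage node is required")
    placement: Dict[str, List[Tuple[int, bytes]]] = {node: [] for node in nodes}
    for offset in range(0, len(image), chunk_size):
        index = offset // chunk_size
        target = nodes[index % len(nodes)]
        placement[target].append((index, image[offset:offset + chunk_size]))
    return placement


def reassemble(placement: Dict[str, List[Tuple[int, bytes]]]) -> bytes:
    """Rebuild the image by ordering all chunks by their index."""
    chunks = sorted(c for per_node in placement.values() for c in per_node)
    return b"".join(data for _, data in chunks)
```

Under these assumptions, consecutive chunks land on different
nodes, so concurrent readers of different parts of the image are
spread over the storage pool instead of hitting a single disk.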
Zoom on mirroring
One important aspect of on-demand mirroring is the decision
of how much to read from the repository when data is
unavailable locally, in such a way as to obtain good access
performance.
A straightforward approach is to translate every read issued
by the hypervisor into either a local or a remote read,
depending on whether the requested content is locally
available. While this approach works, its performance is
questionable. More specifically, many small remote read requests
to the same chunk generate significant network traffic
overhead (because of the extra networking information
encapsulated with each request), as well as low throughput
(because the latencies of the requests add up).
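The cost of the straightforward approach can be made concrete
with a small simulation. This is only a sketch under stated
assumptions: Repository, NaiveMirror and ChunkMirror are
hypothetical names, the repository is an in-memory byte string,
and fetching a whole chunk on the first miss is shown purely as a
contrast, not as the strategy the paper adopts.

```python
class Repository:
    """Stand-in for the remote image repository; counts requests served."""

    def __init__(self, image: bytes):
        self.image = image
        self.requests = 0

    def read(self, offset: int, length: int) -> bytes:
        self.requests += 1
        return self.image[offset:offset + length]


class NaiveMirror:
    """Translates each hypervisor read into a local or a remote read."""

    def __init__(self, repo: Repository):
        self.repo = repo
        self.local = {}  # byte offset -> byte value already mirrored

    def read(self, offset: int, length: int) -> bytes:
        wanted = range(offset, offset + length)
        if all(pos in self.local for pos in wanted):
            return bytes(self.local[pos] for pos in wanted)  # local read
        data = self.repo.read(offset, length)                # remote read
        for pos, value in zip(wanted, data):
            self.local[pos] = value
        return data


class ChunkMirror:
    """Contrast case (assumption): fetch the whole chunk on the first miss."""

    def __init__(self, repo: Repository, chunk_size: int):
        self.repo = repo
        self.chunk_size = chunk_size
        self.chunks = {}  # chunk index -> chunk bytes

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        pos, end = offset, offset + length
        while pos < end:
            index = pos // self.chunk_size
            if index not in self.chunks:
                self.chunks[index] = self.repo.read(index * self.chunk_size,
                                                    self.chunk_size)
            start = pos - index * self.chunk_size
            take = min(end - pos, self.chunk_size - start)
            out += self.chunks[index][start:start + take]
            pos += take
        return bytes(out)


image = bytes(range(64))
naive_repo, chunk_repo = Repository(image), Repository(image)
naive = NaiveMirror(naive_repo)
chunked = ChunkMirror(chunk_repo, chunk_size=64)
for offset in range(0, 64, 4):  # sixteen small reads inside one chunk
    assert naive.read(offset, 4) == chunked.read(offset, 4)
print(naive_repo.requests, chunk_repo.requests)  # 16 remote requests vs. 1
```

Each of the sixteen remote requests in the naive case pays its own
per-request overhead and latency, which is the effect the
paragraph above describes.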
Moreover, in the case of many scattered small writes, a
lot of small fragments need to be accounted for, in order to
remember what is available locally for reading and what is
not. Fragmentation is costly in this case and incurs a
significant management overhead, negatively impacting access
performance.
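The bookkeeping problem can be sketched with a simple extent
list. This is an illustrative assumption, not the paper's data
structure: LocalExtents is a hypothetical class that records
which byte ranges are available locally and merges ranges that
overlap or touch.

```python
from typing import List, Tuple


class LocalExtents:
    """Sorted, non-overlapping [start, end) ranges available locally."""

    def __init__(self) -> None:
        self.ranges: List[Tuple[int, int]] = []

    def add(self, start: int, length: int) -> None:
        """Record a write, merging with overlapping or adjacent ranges."""
        if length <= 0:
            return
        end = start + length
        merged = []
        for s, e in self.ranges:
            if e < start or s > end:   # disjoint and not touching: keep
                merged.append((s, e))
            else:                      # overlapping or touching: absorb
                start, end = min(start, s), max(end, e)
        merged.append((start, end))
        merged.sort()
        self.ranges = merged

    def covers(self, start: int, length: int) -> bool:
        """True if the whole range can be served from the local disk."""
        end = start + length
        return any(s <= start and end <= e for s, e in self.ranges)


extents = LocalExtents()
for offset in (0, 8, 16, 24):          # four scattered 2-byte writes
    extents.add(offset, 2)
print(len(extents.ranges))             # 4 separate fragments to track
extents.add(0, 32)                     # one large range absorbs them all
print(extents.ranges)                  # [(0, 32)]
```

With scattered small writes the list grows by one entry per write
and every read has to consult it, which is the management overhead
mentioned above; coarser-grained tracking keeps the list short.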
RELATED WORK
Multideployment that relies on full broadcast-based
pre-propagation is a widely used technique [28, 31, 14]. While
this technique avoids read contention to the repository, it
can incur a high overhead in both network traffic and
execution time, as presented in Section 5.2. Furthermore,
since the VM images are fully copied locally on the compute
nodes, multisnapshotting becomes infeasible: large amounts
of data are unnecessarily duplicated and cause unacceptable
transfer delays, not to mention huge storage space and
network traffic utilization.
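A back-of-the-envelope calculation shows why full copies are
expensive. The numbers below are hypothetical and chosen only for
illustration; the model ignores protocol overhead and the
broadcast topology, and the function names and the
accessed_fraction parameter are assumptions, not measurements
from the paper.

```python
def full_copy_bytes(nodes: int, image_size: int) -> int:
    """Bytes that land on compute nodes when every node gets a full copy."""
    return nodes * image_size


def lazy_bytes(nodes: int, image_size: int, accessed_fraction: float) -> int:
    """Bytes transferred if each node fetches only the part it reads."""
    if not 0.0 <= accessed_fraction <= 1.0:
        raise ValueError("accessed_fraction must be between 0 and 1")
    return round(nodes * image_size * accessed_fraction)


nodes = 100                      # hypothetical deployment size
image_size = 2 * 2**30           # hypothetical 2 GiB image
full = full_copy_bytes(nodes, image_size)
lazy = lazy_bytes(nodes, image_size, accessed_fraction=0.1)
print(full // 2**30, "GiB vs.", lazy // 2**30, "GiB")
```

In this toy model the transferred volume scales with the fraction
of the image that each VM actually touches, whereas full
pre-propagation always pays for the whole image on every node.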
CONCLUSIONS
As cloud computing becomes increasingly popular, efficient
management of VM images, such as image propagation to
compute nodes and image snapshotting for checkpointing
or migration, is critical. The performance of these
operations directly affects the usability of the benefits
offered by cloud computing systems.