09-08-2012, 04:49 PM
Going Back and Forth: Efficient Multideployment and Multisnapshotting on Clouds
INTRODUCTION
In recent years, Infrastructure as a Service (IaaS) cloud computing [30] has emerged as a viable alternative to the acquisition and management of physical resources. With IaaS, users can lease storage and computation time from large datacenters. Leasing of computation time is accomplished by allowing users to deploy virtual machines (VMs) on the datacenter’s resources. Since the user has complete control over the configuration of the VMs using on-demand deployments [5, 16], IaaS leasing is equivalent to purchasing dedicated hardware but without the long-term commitment and cost. The on-demand nature of IaaS is critical to making such leases attractive, since it enables users to expand or shrink their resources according to their computational needs, by using external resources to complement their local resource base [20].
Application state
The state of the VM deployment is defined at each moment in time by two main components: the state of each of the VM instances and the state of the communication channels between them (open sockets, in-transit network packets, virtual topology, etc.).
Thus, in the most general case (Model 1), saving the application state implies saving both the state of all VM instances and the state of all active communication channels among them. While several methods have been established in the virtualization community to capture the state of a running VM (CPU registers, RAM, state of devices, etc.), capturing the global state of the communication channels is difficult and remains an open problem [19].
OUR APPROACH
We propose a virtual file system aimed at optimizing the
multi-deployment and multi-snapshotting patterns based on
the observations presented in Section 2.
Design overview
We rely on four key principles: aggregate the storage space, optimize VM disk access, reduce contention, and optimize multisnapshotting.
Aggregate the storage space locally available on the compute nodes
In most cloud deployments [5, 3, 4], the disks locally attached to the compute nodes are not exploited to their full potential. Most of the time, such disks are used to hold local copies of the images corresponding to the running VMs, as well as to provide temporary storage for them during their execution, which utilizes only a small fraction of the total disk size.
Applicability in the cloud
The simplified architecture of a cloud that integrates our approach is depicted in Figure 1. The typical elements found in the cloud are illustrated with a light background, while the elements that are part of our proposal are highlighted by a darker background. A distributed versioning storage service that supports cloning and shadowing is deployed on the compute nodes and consolidates parts of their local disks into a common storage pool. The cloud client has direct access to the storage service and is allowed to upload and download images from it. Every uploaded image is automatically striped. Furthermore, the cloud client interacts with the cloud middleware through a control API that enables a variety of management tasks, including deploying an image on a set of compute nodes, dynamically adding or removing compute nodes from that set, and snapshotting individual VM instances or the whole set. The cloud middleware in turn coordinates the compute nodes to achieve the aforementioned management tasks. Each compute node runs a hypervisor that is responsible for running the VMs. The reads and writes of the hypervisor are trapped by the mirroring module, which is responsible for on-demand mirroring and snapshotting (as explained in Section 3.1) and relies on both the local disk and the distributed versioning storage service to do so.
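The mirroring module's role can be illustrated with a minimal sketch. All names here are hypothetical (the paper does not give this interface), and the repository is a dict-backed stub standing in for the distributed versioning storage service: reads are served from the local disk when possible, missing chunks are fetched on demand, and writes stay local until a snapshot pushes only the modified chunks.

```python
class StubRepository:
    """Stand-in for the distributed versioning storage service."""
    def __init__(self, base_chunks):
        self.chunks = dict(base_chunks)   # chunk_id -> bytes (base image)
        self.versions = []                # committed incremental snapshots

    def fetch(self, chunk_id):
        return self.chunks[chunk_id]

    def commit(self, modified):
        # Store only the modified chunks as a new incremental version.
        self.versions.append(dict(modified))
        return len(self.versions)


class MirroringModule:
    def __init__(self, repo):
        self.repo = repo      # remote versioning storage service
        self.local = {}       # chunks mirrored on the local disk
        self.dirty = set()    # chunks modified since the last snapshot

    def read(self, chunk_id):
        # On a local miss, fetch the chunk remotely and mirror it locally.
        if chunk_id not in self.local:
            self.local[chunk_id] = self.repo.fetch(chunk_id)
        return self.local[chunk_id]

    def write(self, chunk_id, data):
        # Writes never touch the base image: they go to local storage
        # and reach the repository only when a snapshot is taken.
        self.local[chunk_id] = data
        self.dirty.add(chunk_id)

    def snapshot(self):
        version = self.repo.commit({c: self.local[c] for c in self.dirty})
        self.dirty.clear()
        return version
```

Keeping writes local until snapshot time is what makes snapshotting cheap: only the delta against the base image is ever shipped to the storage service.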
Zoom on mirroring
One important aspect of on-demand mirroring is deciding how much to read from the repository when data is unavailable locally, in such a way as to obtain good access performance.
A straightforward approach is to translate every read issued by the hypervisor into either a local or a remote read, depending on whether the requested content is locally available. While this approach works, its performance is questionable. More specifically, many small remote read requests to the same chunk generate significant network traffic overhead (because of the extra networking information encapsulated with each request), as well as low throughput (because the latencies of the individual requests add up).
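The cost difference can be made concrete with a small illustration (the names and the chunk size are assumptions, not taken from the paper): issuing one remote request per small read is compared against fetching the whole enclosing chunk on the first miss, after which later reads to that chunk are served from the local mirror.

```python
CHUNK = 64 * 1024  # assumed chunk size in bytes

def naive_remote_requests(read_offsets):
    # One remote request per small read that misses locally.
    return len(read_offsets)

def chunked_remote_requests(read_offsets):
    # One remote request per distinct chunk touched; subsequent reads
    # to the same chunk hit the local mirror instead of the network.
    return len({off // CHUNK for off in read_offsets})

# Five small reads that fall into only two chunks.
reads = [0, 512, 1024, 70000, 70512]
assert naive_remote_requests(reads) == 5
assert chunked_remote_requests(reads) == 2
```

Fetching at chunk granularity amortizes the per-request networking overhead and latency over all subsequent reads to the same chunk.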
Moreover, in the case of many scattered small writes, a large number of small fragments must be tracked in order to remember what is available locally for reading and what is not. Fragmentation is costly in this case and incurs a significant management overhead, negatively impacting access performance.
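One way to bound this tracking overhead is to record locally available data as byte ranges and merge ranges that overlap or touch, so that many scattered small writes collapse into a few intervals. The sketch below is illustrative only; the names and representation are assumptions, not the paper's data structures.

```python
def add_range(ranges, start, end):
    """Insert the half-open range [start, end) into a sorted list of
    disjoint ranges, merging any existing ranges it overlaps or touches."""
    merged = []
    for s, e in ranges:
        if e < start or s > end:                  # fully disjoint: keep as-is
            merged.append((s, e))
        else:                                     # overlaps or touches: absorb
            start, end = min(start, s), max(end, e)
    merged.append((start, end))
    merged.sort()
    return merged

# Three scattered writes collapse into a single interval once the
# middle write bridges the gap between the first two.
ranges = []
ranges = add_range(ranges, 0, 100)
ranges = add_range(ranges, 200, 300)
ranges = add_range(ranges, 100, 200)
assert ranges == [(0, 300)]
```

With merging, the bookkeeping cost grows with the number of disjoint locally-available regions rather than with the raw number of writes.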