06-02-2013, 02:38 PM
Gossip-based Resource Management for Cloud Environments
Abstract—
We address the problem of resource management
for a large-scale cloud environment that hosts sites.
Our contribution centers around outlining a distributed
middleware architecture and presenting one of its key
elements, a gossip protocol that meets our design goals:
fairness of resource allocation with respect to hosted sites,
efficient adaptation to load changes and scalability in terms
of both the number of machines and sites. We formalize
the resource allocation problem as that of dynamically
maximizing the cloud utility under CPU and memory
constraints. While we can show that an optimal solution
without considering memory constraints is straightforward
(but not useful), we provide an efficient heuristic solution
for the complete problem instead. We evaluate the protocol
through simulation and find its performance to be well-aligned
with our design goals.
Index Terms—cloud computing, distributed management,
resource allocation, gossip protocols
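To illustrate why the problem becomes straightforward once memory constraints are dropped, the following minimal sketch shares the total CPU capacity among sites in proportion to their demand. It assumes, purely for illustration, that the utility of a site is the ratio of allocated to demanded CPU; the function name and the example figures are hypothetical and not taken from the paper.

```python
def proportional_cpu_shares(demand, total_cpu):
    """Share total_cpu among sites in proportion to their CPU demand.

    Assumption (not from the paper): utility = allocated CPU / demanded CPU,
    so proportional sharing gives every site the same utility.
    """
    total_demand = sum(demand.values())
    if total_demand <= total_cpu:
        # Enough capacity: every site receives its full demand.
        return dict(demand)
    scale = total_cpu / total_demand
    return {site: d * scale for site, d in demand.items()}


# Hypothetical example: two sites competing for 4 CPU units.
shares = proportional_cpu_shares({"s1": 6.0, "s2": 2.0}, total_cpu=4.0)
utilities = {site: shares[site] / d for site, d in {"s1": 6.0, "s2": 2.0}.items()}
```

Under overload both sites end up with the same utility, which is the fairness property in this simplified setting. The sketch ignores where module instances are placed, which is exactly what the memory constraints make difficult.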
I. INTRODUCTION
We consider the problem of resource management
for a large-scale cloud environment. Such an environment
includes the physical infrastructure and associated
control functionality that enables the provisioning and
management of cloud services. The perspective we take
is that of a cloud service provider, which hosts sites in
a cloud environment.
The stakeholders are depicted in Figure 1a. The cloud
service provider owns and administrates the physical
infrastructure, on which cloud services are provided. It
offers hosting services to site owners through a middleware
that executes on its infrastructure (see Figure 1b).
Site owners provide services to their respective users via
sites that are hosted by the cloud service provider.
SYSTEM ARCHITECTURE
A cloud environment spans several datacenters interconnected
by an internet. Each of these datacenters
contains a large number of machines that are connected
by a high-speed network. Users access sites hosted by
the cloud environment through the public Internet. A site
is typically accessed through a URL that is translated to a
network address through a global directory service, such
as DNS. A request to a site is routed through the Internet
to a machine inside a datacenter that either processes
the request or forwards it. In this paper, we restrict
ourselves to a cloud that spans a datacenter containing a
single cluster of machines and leave for further work
the extension of our contribution to an environment
including multiple datacenters.
Figure 2 (left) shows the architecture of the cloud
middleware. The components of the middleware layer
run on all machines. The resources of the cloud are
primarily consumed by module instances whereby the
functionality of a site is made up of one or more
modules. In the middleware, a module either contains
part of the service logic of a site (denoted by m_i in
Figure 2) or a site manager (denoted by SM_i).
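The relationship between sites, modules, and module instances can be sketched as a small data model. This is only an illustration under assumed names: the classes `Module` and `Machine`, the helper `cpu_per_site`, and all figures are hypothetical and do not come from the middleware itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Module:
    site: str                      # owning site (hypothetical identifier)
    name: str                      # e.g. "m1" (service logic) or "SM1" (site manager)
    is_site_manager: bool = False


@dataclass
class Machine:
    name: str
    cpu_capacity: float
    # Maps each module with an instance on this machine to its CPU share.
    instances: dict = field(default_factory=dict)

    def cpu_used(self) -> float:
        return sum(self.instances.values())


def cpu_per_site(machines):
    """Total CPU granted to each site, summed over all module instances."""
    totals = {}
    for machine in machines:
        for module, cpu in machine.instances.items():
            totals[module.site] = totals.get(module.site, 0.0) + cpu
    return totals


# Hypothetical example: two machines hosting modules of two sites.
m1 = Module("site1", "m1")
sm1 = Module("site1", "SM1", is_site_manager=True)
m2 = Module("site2", "m2")

machines = [
    Machine("A", 4.0, {m1: 1.5, sm1: 0.5}),
    Machine("B", 4.0, {m1: 1.0, m2: 2.0}),
]
```

A module may have instances on several machines (here `m1` runs on both), so the resources consumed by a site are the sum over all instances of its modules.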
EVALUATION THROUGH SIMULATION
We have evaluated P through extensive simulations
using a discrete-event simulator that we developed in-house.
We simulate a distributed system that runs the
machine manager components of all machines in the
cloud. Specifically, these machine managers execute the
protocol P, which computes the allocation matrix A, and
also the CYCLON protocol, which provides for P the
function of selecting a random neighbor. The external
events for this simulation are the changes in the demand
vector ω.
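The overall shape of such a simulation can be sketched in a few lines. The sketch below is not protocol P: it uses generic pairwise load averaging as a stand-in, replaces CYCLON with a uniform random choice of peer, and advances in synchronous rounds rather than discrete events. All function names and parameters are hypothetical.

```python
import random


def gossip_round(load, rng):
    """Every machine contacts one random peer; the pair splits its combined
    load evenly (generic averaging, a stand-in for protocol P)."""
    machines = list(load)
    for m in machines:
        # Uniform choice stands in for the peer sampling that CYCLON provides.
        peer = rng.choice([x for x in machines if x != m])
        mean = (load[m] + load[peer]) / 2.0
        load[m] = load[peer] = mean


def simulate(initial_load, rounds, demand_events=None, seed=0):
    """Run gossip rounds; demand_events maps round -> {machine: load delta}."""
    rng = random.Random(seed)
    load = dict(initial_load)
    demand_events = demand_events or {}
    for r in range(rounds):
        # External events: changes in demand arrive before the round starts.
        for machine, delta in demand_events.get(r, {}).items():
            load[machine] += delta
        gossip_round(load, rng)
    return load


# Hypothetical example: all load starts on one of four machines.
final = simulate({"a": 8.0, "b": 0.0, "c": 0.0, "d": 0.0}, rounds=50)
```

Each exchange preserves the total load while moving the two participants toward their mean, so repeated rounds spread the load across machines without any central coordinator. A faithful simulation would additionally track the allocation matrix A per module and respect the CPU and memory constraints.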
DISCUSSION AND CONCLUSION
With this paper, we make a significant contribution
towards engineering a resource management middleware
for a site-hosting cloud environment. We identify a
key component of such a middleware and present a
protocol that can be used to meet our design goals
for resource management: fairness of resource allocation
with respect to sites, efficient adaptation to load changes
and scalability of the middleware layer in terms of both
the number of machines in the cloud and the
number of hosted sites.