Cashing in on the Cache in the Cloud
Abstract
Over the past decades, caching has become the key technology for bridging the performance gap across memory
hierarchies by exploiting temporal and spatial locality; the effect is particularly prominent in disk storage systems. Applications that involve heavy
I/O activities, which are common in the cloud, arguably benefit the most from caching. The use of local volatile memory as a cache might
be a natural alternative, but well-known restrictions, such as limited capacity and the impact on the utilization of host machines, hinder its effective
use. Beyond these technical challenges, providing cache services in clouds faces a major practical issue of pricing, which bears on both
quality of service (QoS) and service level agreements (SLAs). Currently, public cloud users are limited to a small set of uniform, coarse-grained service
offerings, such as the High-Memory and High-CPU instance types in Amazon EC2. In this paper, we present the cache as a service (CaaS) model as
an optional service to typical infrastructure service offerings. Specifically, the cloud provider sets aside a large pool of memory that
can be dynamically partitioned and allocated to standard infrastructure services as disk cache. We first investigate the feasibility of
providing CaaS with a proof-of-concept elastic cache system (using dedicated remote memory servers) built and validated on a real
testbed; the practical benefits of CaaS for both users and providers (i.e., performance and profit, respectively) are then thoroughly
studied with a novel pricing scheme. Our CaaS model greatly benefits the cloud economy in that (1) the extra cost a user pays for the I/O
performance gain is minimal, if it exists at all, and (2) the provider's profit increases due to improvements in server consolidation resulting
from that performance gain. Through extensive experiments with eight resource allocation strategies, we demonstrate that our CaaS
model can be a promising, cost-efficient solution for both users and providers.
INTRODUCTION
The resource abundance (redundancy) in many large datacenters
is increasingly being engineered to offer spare capacity as
a service, much like electricity, water, and gas. For example, public
cloud service providers like Amazon Web Services virtualize
resources, such as processors, storage and network devices,
and offer them as services on demand, i.e., infrastructure as a
service (IaaS) which is the main focus of this paper. A virtual
machine (VM) is a typical instance of IaaS. Although a VM is
an isolated computing platform capable of running
multiple applications, it is assumed in this study to be solely
dedicated to a single application; thus, we use the terms
VM and application interchangeably hereafter. Cloud
services, as virtualized entities, are essentially elastic, creating the
illusion of “unlimited” resource capacity. This elasticity, combined with
utility computing (i.e., pay-as-you-go pricing), inherently brings the
cost effectiveness that is the primary driving force behind the
cloud.
BACKGROUND AND RELATED WORK
A number of studies have investigated
the issue of I/O performance in virtualized systems. The focus
of these investigations includes I/O virtualization, cache alternatives,
and caching mechanisms. In this section, we describe
and discuss notable work related to our study. What primarily
distinguishes our work from previous studies is its practicality:
virtualization support for remote memory access combined with
the incorporation of a service model; hence, cache as a service.
I/O Virtualization
Virtualization enables resources in physical machines to be
multiplexed and isolated for hosting multiple guest OSes
(VMs). In virtualized environments, I/O between a guest
OS and a hardware device should be coordinated in a safe
and efficient manner. However, I/O virtualization remains one of
the most severe software obstacles that VMs encounter because of its
performance overhead. Menon et al. [6] tackled virtualized I/O
by performing a full functional breakdown of this overhead with their profiling
tools.
Several studies [7], [8], [9] have contributed to efforts to narrow
the gap between virtualized and native I/O performance. Cherkasova
et al. [7] and Menon et al. [6] studied I/O performance in the
Xen hypervisor [10] and showed a significant I/O overhead
in Xen’s zero-copy page-flipping technique. They
proposed that page-flipping be simply replaced by the memcpy
function to avoid side-effects. Menon et al. [9] optimized I/O
performance by introducing virtual machine monitor (VMM)
superpage and global page mappings. Liu et al. [8] proposed a
new device virtualization approach called VMM-bypass that eliminates
data transfer between the guest OS and the hypervisor by
giving the guest device driver direct access to the device.
Cache Device
Cooperative cache [2] is a kind of remote memory (RM) cache that improves
the performance of networked file systems. In particular, it is
adopted in the Serverless Network File System [3]. It uses participating
clients’ memory regions as a cache. A remote cache
is placed between the memory-based cache of a requesting
client and a server disk. Each participating client exchanges
meta-information for the cache with the others periodically. Such
a caching scheme is effective where remote memory is faster than the local
disk of the requesting client. Jiang et al. [4] propose advanced
buffer management techniques for cooperative cache. These
techniques place data according to their degree of locality: data with
high locality scores are placed in a high-level cache, and data with
low scores in a low-level cache. Kim et al. [5] propose a cooperative caching system
that is implemented at the virtualization layer, and the system
reduces disk I/O operations for shared working sets of virtual
machines.
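To make the locality-based placement of [4] concrete, the following Python fragment is a minimal sketch, not the authors' implementation; the two-level layout, the windowed reference count used as the locality score, and all names are illustrative assumptions.

# Minimal sketch of locality-based placement in a two-level
# cooperative cache (after the idea in Jiang et al. [4]).
# The locality score is a simple reference count over a sliding
# window -- a stand-in for the real locality metric.
from collections import deque

class TwoLevelCache:
    def __init__(self, high_size, low_size, window=1000, threshold=3):
        self.high = {}                      # high-level (fast) cache
        self.low = {}                       # low-level (slow) cache
        self.high_size = high_size
        self.low_size = low_size
        self.recent = deque(maxlen=window)  # recent block references
        self.threshold = threshold

    def locality_score(self, block):
        # Blocks referenced often in the recent window score high.
        return sum(1 for b in self.recent if b == block)

    def insert(self, block, data):
        self.recent.append(block)
        if self.locality_score(block) >= self.threshold:
            self._put(self.high, self.high_size, block, data)
        else:
            self._put(self.low, self.low_size, block, data)

    def _put(self, level, size, block, data):
        if len(level) >= size:              # naive eviction of an arbitrary victim
            level.pop(next(iter(level)))
        level[block] = data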
CACHE AS A SERVICE: OVERVIEW
The CaaS model consists of two main components: an elastic
cache system as the architectural foundation and a service
model with a pricing scheme as the economic foundation.
The basic system architecture for the elastic cache uses
RM exported from dedicated memory servers (or
possibly SSDs). The elastic cache is not a new caching algorithm; the
system can use any of the existing cache replacement
algorithms. Near-uniform access time to the RM-based cache is
guaranteed by a modern high-speed network interface that
supports RDMA primitives. Each VM in the
cloud accesses the RM servers via an access interface that is
implemented and recognized as a normal block device driver.
Through this access layer, VMs provision the
necessary amount of cache memory from RM on demand.
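In the real system this access layer is a kernel-level block device driver that reaches remote memory over RDMA; the following Python sketch only models the read and write logic of such a device, with the RM pool and the disk stubbed out and all names hypothetical.

# Sketch of the elastic cache access layer as seen by a VM.
# The real access layer is a kernel block device driver backed by
# RDMA; here the RM pool and the disk are stubbed as dictionaries.

class ElasticCacheDevice:
    def __init__(self, rm_pool, disk, capacity_blocks):
        self.rm_pool = rm_pool      # remote-memory cache: block id -> data
        self.disk = disk            # backing store: block id -> data
        self.capacity = capacity_blocks

    def read_block(self, block_id):
        data = self.rm_pool.get(block_id)
        if data is not None:        # cache hit: served from remote memory
            return data
        data = self.disk[block_id]  # cache miss: fall back to the disk
        if len(self.rm_pool) < self.capacity:
            self.rm_pool[block_id] = data   # populate the cache on demand
        return data

    def write_block(self, block_id, data):
        # Write-through: update the disk and keep the cache coherent.
        self.disk[block_id] = data
        if block_id in self.rm_pool:
            self.rm_pool[block_id] = data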
As shown in Figure 1, a group of dedicated memory servers
exports its local memory to VMs, and the exported memory
space can be viewed as an available memory pool. This memory
pool is used as an elastic cache for VMs in the cloud. For
billing purposes, cloud service providers could employ a lease
mechanism to manage the RM pool.
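Such a lease mechanism might be sketched as follows; the fixed-term leases, the GB-hour pricing hook, and all names are assumptions for illustration rather than the paper's design.

# Minimal sketch of a lease mechanism for the RM pool. A VM
# leases a slice of the pool and is billed for GB-hours on
# release; the pricing hook and all names are illustrative.
import time

class RMLeaseManager:
    def __init__(self, pool_gb, price_per_gb_hour):
        self.free_gb = pool_gb
        self.price = price_per_gb_hour
        self.leases = {}            # vm_id -> (size_gb, start_time)

    def acquire(self, vm_id, size_gb):
        if size_gb > self.free_gb:
            raise RuntimeError("RM pool exhausted")
        self.free_gb -= size_gb
        self.leases[vm_id] = (size_gb, time.time())

    def release(self, vm_id):
        size_gb, start = self.leases.pop(vm_id)
        self.free_gb += size_gb
        hours = (time.time() - start) / 3600.0
        return size_gb * hours * self.price   # amount to bill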
EVALUATION
In this section, we evaluate CaaS from the viewpoints of
both users and providers. To this end, we first measure the
performance benefit of our elastic cache system in terms of
application performance (e.g., transactions per minute), cache hit ratio, and
reliability. The system-level modifications our design requires
are not possible with existing cloud providers such as Amazon
and Microsoft: we can neither dedicate the providers’ physical servers
to RM servers nor assign SSDs and RDMA
devices to those servers. Owing to these restrictions we could not
test our system on real cloud services; instead, we built an RDMA-
and SSD-enabled cloud infrastructure (Figure 4) to evaluate
our system. We then simulate a large-scale cloud environment
with more realistic settings for resources and user requests.
This simulation study enables us to examine the cost efficiency
of CaaS. While the experimental results in Section 6.1 demonstrate
the feasibility of our elastic cache system, those in Section 6.2
confirm the practicality of CaaS (or the applicability of CaaS to the
cloud).
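To make one of these metrics concrete, the cache hit ratio over a block reference trace can be computed as in the sketch below; this uses a generic LRU policy and is not the simulator used in the paper.

# Generic illustration of measuring the cache hit ratio over a
# block reference trace with an LRU policy; not the simulator
# used in the paper, just a way to make the metric concrete.
from collections import OrderedDict

def lru_hit_ratio(trace, cache_blocks):
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # refresh recency on a hit
        else:
            if len(cache) >= cache_blocks:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return hits / len(trace) if trace else 0.0

# Example: a short trace with strong temporal locality.
# print(lru_hit_ratio([1, 2, 1, 3, 1, 2, 4, 1], cache_blocks=3))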
CONCLUSION
With the increasing popularity of infrastructure services such
as Amazon EC2 and Amazon RDS, low disk I/O performance
is one of the most significant problems. In this paper, we have
presented the CaaS model as a cost-efficient cache solution to
mitigate the disk I/O problem in IaaS. To this end, we have
built a prototype elastic cache system using a remote-memory-based
cache, which is pluggable and file-system independent
so as to support various configurations. This elastic cache system,
together with the pricing model devised in this study, has
validated the feasibility and practicality of our CaaS model.
Through extensive experiments, we have confirmed that CaaS
helps IaaS improve disk I/O performance greatly.