Exploiting Dynamic Resource Allocation for Efficient Parallel Data Processing

**seminar ideas** · 05-05-2012, 12:11 PM

Exploiting Dynamic Resource Allocation for Efficient Parallel Data Processing in the Cloud

INTRODUCTION

Today, a growing number of companies have to process
huge amounts of data in a cost-efficient manner. Classic
representatives of these companies are operators of
Internet search engines, such as Google, Yahoo, or Microsoft.
The vast amount of data they have to deal with every
day has made traditional database solutions prohibitively
expensive [5]. Instead, these companies have
popularized an architectural paradigm based on a large
number of commodity servers. Problems like processing
crawled documents or regenerating a web index are split
into several independent subtasks, distributed among
the available nodes, and computed in parallel.
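The split, distribute, and compute-in-parallel pattern described above can be illustrated with a minimal sketch. It is not taken from the paper: the word-count job, the function names, and the use of local threads as stand-ins for commodity servers are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(documents, num_subtasks):
    """Partition the documents into independent, roughly equal chunks."""
    return [documents[i::num_subtasks] for i in range(num_subtasks)]


def count_words(chunk):
    """One subtask: count word occurrences in a single chunk."""
    counts = Counter()
    for doc in chunk:
        counts.update(doc.lower().split())
    return counts


def parallel_word_count(documents, num_workers=4):
    """Run the subtasks concurrently and merge the partial results."""
    chunks = split_into_subtasks(documents, num_workers)
    # Threads are a stand-in for the nodes of a cluster (assumption).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Because the subtasks share no state, the merge step only has to add up the partial counts, which is what makes this kind of job easy to spread over many machines.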

CHALLENGES AND OPPORTUNITIES

Current data processing frameworks like Google’s
MapReduce or Microsoft’s Dryad engine have been designed
for cluster environments. This is reflected in a
number of assumptions they make that are not necessarily
valid in cloud environments. In this section we
discuss how abandoning these assumptions raises new
opportunities but also challenges for efficient parallel
data processing in clouds.

Opportunities

Today’s processing frameworks typically assume the resources
they manage consist of a static set of homogeneous
compute nodes. Although designed to deal with individual
node failures, they consider the number of available
machines to be constant, especially when scheduling
the processing job’s execution. While IaaS clouds can
certainly be used to create such cluster-like setups, much
of their flexibility remains unused.
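To make the unused flexibility concrete, the following sketch compares the instance-hours consumed by a static, cluster-like setup with those of a setup that allocates machines per stage. The stage figures are hypothetical, and the model assumes billing per instance-hour with no startup or teardown overhead; none of this comes from the paper.

```python
def static_instance_hours(stages):
    """Keep the peak number of nodes allocated for the whole job."""
    peak_nodes = max(nodes for nodes, _ in stages)
    total_hours = sum(hours for _, hours in stages)
    return peak_nodes * total_hours


def dynamic_instance_hours(stages):
    """Allocate only the nodes each stage needs, for as long as it runs."""
    return sum(nodes * hours for nodes, hours in stages)


# Hypothetical job: (nodes needed, duration in hours) for each stage.
stages = [(12, 1), (4, 2), (1, 3)]

static_cost = static_instance_hours(stages)
dynamic_cost = dynamic_instance_hours(stages)
```

Under these assumptions the static setup pays for idle machines in every stage that needs fewer nodes than the peak, while the dynamic setup does not.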

Challenges

The cloud’s virtualized nature helps to enable promising
new use cases for efficient parallel data processing. However,
it also imposes new challenges compared to classic
cluster setups. The major challenge we see is the cloud’s
opaqueness with respect to exploiting data locality.

DESIGN

Based on the challenges and opportunities outlined
in the previous section, we have designed Nephele, a
new data processing framework for cloud environments.
Nephele takes up many ideas of previous processing
frameworks but refines them to better match the dynamic
and opaque nature of a cloud.
