On Data Staging Algorithms for Shared Data Accesses in Clouds

**project maker** · 19-09-2014, 02:48 PM

On Data Staging Algorithms for Shared
Data Accesses in Clouds

.pdf

On Data.pdf (Size: 1.24 MB / Downloads: 53)

Abstract—

In this paper, we study the strategies for efficiently achieving data staging and caching on a set of vantage sites in a cloud
system with a minimum cost. Unlike the traditional research, we do not intend to identify the access patterns to facilitate the future
requests. Instead, with such a kind of information presumably known in advance, our goal is to efficiently stage the shared data items
to predetermined sites at advocated time instants to align with the patterns while minimizing the monetary costs for caching and
transmitting the requested data items. To this end, we follow the cost and network models in [1] and extend the analysis to multiple
data items, each with single or multiple copies. Our results show that under homogeneous cost model, when the ratio of transmission
cost and caching cost is low, a single copy of each data item can efficiently serve all the user requests. While in multicopy situation, we
also consider the tradeoff between the transmission cost and caching cost by controlling the upper bounds of transmissions and
copies. The upper bound can be given either on per-item basis or on all-item basis. We present efficient optimal solutions based on
dynamic programming techniques to all these cases provided that the upper bound is polynomially bounded by the number of service
requests and the number of distinct data items. In addition to the homogeneous cost model, we also briefly discuss this problem under
a heterogeneous cost model with some simple yet practical restrictions and present a 2-approximation algorithm to the general case.
We validate our findings by implementing a data staging solver, whereby conducting extensive simulation studies on the behaviors of
the algorithms.

PROBLEM FORMULATION

Suppose there are k distinct shared data items initially
stored at one node, say p1 and later migrated, replicated,
and cached in a fully connected network with m nodes
(p1; p2; :::; pm) to serve a sequence of requests ¼
4 12:::n in
which each i ¼
4 ðti; pi; RiÞ; 1 i n is specified by the
predicted access pattern and represents a request made for
a data subset Ri by node pi at time ti. We therefore have the
complete knowledge of all such information as an input to
our algorithms. Further, for simplicity we also assume that
there exists only one request per stage. However, for
multiple-request case we also briefly discuss it as a
supplemental result presented in Appendix A, available
in the online supplemental material.
To s

OPTIMAL DATA STAGING ALGORITHMS

In this section, with the homogeneous cost model, we first
present the multicopy algorithm to handle multiple distinct
data items, and then propose our extension to address the
constraints on the maximum number of transmissions and
copies. Note that we do not discuss the single-copy
algorithm separately as it can be treated as a special case
of multicopy algorithm

7 CONCLUSIONS

In this paper, we studied the problem of staging a set of
distinct data items in a fully connected network to facilitate
cloud-based services with minimum cost. To this end, we
investigated the optimal staging strategies based on the cost
models in the paper [1] to minimize the total staging cost.
We also considered some practical constraints on this
problem in terms of the maximum number of transmissions
and copies.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	data mining full report	project report tiger	37	374,184,749	16-03-2019, 05:22 PM Last Post: TitkinWY
	A Novel Data Embedding Method Using Adaptive Pixel Pair Matching Report	project girl	3	4,489	15-01-2018, 01:56 PM Last Post: dhanabhagya
	Detecting False Data in Wireless Sensor Network using Efficient Becan Scheme	seminar tips	1	3,235	20-09-2017, 01:03 PM Last Post: jaseela123
	Different Initialization Data and the Performance by the BFM	seminar flower	1	680	20-09-2017, 12:44 PM Last Post: jaseela123
	Wide Area Mobile Data Services	seminar ideas	1	2,373	19-09-2017, 02:35 PM Last Post: jaseela123
	Integrating and Designing the Data Mining Technique System Based on Customer	seminar projects maker	1	782	15-09-2017, 02:45 PM Last Post: jaseela123
	Survey of Privacy Protection for Medical Data	project maker	1	649	13-09-2017, 01:14 PM Last Post: jaseela123
	Using Rapid Prototyping Data to Enhance a Knowledge-Based Framework for Product Redes	smart paper boy	1	115,120	13-09-2017, 09:54 AM Last Post: jaseela123
	SECRET DATA HIDING IN IMAGE USING ENCRYPTION AND DECRYPTION KEY PPT	study tips	1	983	09-09-2017, 10:07 AM Last Post: jaseela123
	Parallelization of Genetic Algorithms and its Applications	project topics	1	9,185,345	02-09-2017, 12:28 PM Last Post: jaseela123

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.