02-05-2013, 04:01 PM
Scalable Scheduling of Updates in Streaming Data Warehouses
Abstract:
We present an alternative and more flexible approach that maximizes user utility by satisfying all users while minimizing the usage of system resources. We discuss the benefits of this approach and develop an adaptive monitoring solution, Satisfy User Profiles (SUP). Through formal analysis, we identify sufficient optimality conditions for SUP. The traditional method for ensuring resource allocation is to partition the job set and to schedule each partition separately. We empirically analyze the behavior of SUP under varying conditions. Our experiments show that SUP can achieve a high degree of satisfaction of user utility when its estimations closely match the real event stream, and that it has the potential to save a significant amount of system resources. We also investigate two methods for ensuring resources for short jobs while still providing a degree of global scheduling.
Existing system
A variety of emerging online data delivery applications challenge existing techniques for data delivery to human users, applications, or middleware that are accessing data from multiple autonomous servers. The first approach, most commonly used nowadays, maximizes user utility under the strict setting of meeting a priori constraints on the usage of system resources.
Much of the existing research in pull-based data delivery casts the problem of data delivery as follows: given some set of limited system resources, maximize the utility of a set of user profiles. We refer to this problem as OptMon1. To address some of the limitations of OptMon1, we propose a framework where we consider the dual of the previous optimization problem: given some set of user profiles, minimize the consumption of system resources while satisfying all user profiles. We refer to this dual problem as OptMon2.
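To make the contrast between the two formulations concrete, here is a small brute-force sketch in Python. It is only an illustration under assumptions that are not in the original text: each user profile is modelled as a time window (start, end) that is satisfied when at least one probe falls inside it, utility is simply the number of satisfied windows, and the function names are hypothetical.

```python
from itertools import combinations

# Assumption: a profile is a closed time window (start, end);
# it is satisfied when some probe time lies inside the window.

def satisfied(windows, probes):
    """Count the windows that contain at least one probe time."""
    return sum(any(s <= p <= e for p in probes) for s, e in windows)

def optmon1(windows, budget):
    """Resources fixed: best utility reachable with at most `budget` probes."""
    # Window end points suffice as candidates: a probe can be shifted right
    # to the next end point without leaving any window it was inside.
    candidates = sorted({e for _, e in windows})
    best = ()
    for k in range(budget + 1):
        for combo in combinations(candidates, k):
            if satisfied(windows, combo) > satisfied(windows, best):
                best = combo
    return list(best), satisfied(windows, best)

def optmon2(windows):
    """Profiles fixed: fewest probes that satisfy every window."""
    for budget in range(len(windows) + 1):
        probes, utility = optmon1(windows, budget)
        if utility == len(windows):
            return probes

windows = [(1, 4), (2, 6), (5, 8), (7, 9)]
print(optmon1(windows, 1))  # budget is the constraint, utility is maximized
print(optmon2(windows))     # full satisfaction is the constraint, probes are minimized
```

With a budget of one probe, only two of the four windows can be satisfied; the dual formulation instead reports that two probes are enough to satisfy all four. The exhaustive search is exponential and is meant only to show the difference between the two problem statements.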
Proposed system
We proposed update chopping to avoid this kind of blocking by adding a degree of preemptibility to the jobs. Its variants have been proposed when tasks are detectable.
The diversity of data sources and Web services currently available on the Internet and the computational Grid, as well as the diversity of clients and application requirements, poses significant infrastructure challenges. In this paper, we address the task of targeted data delivery. Users may have specific requirements for data delivery, e.g., how frequently or under what conditions they wish to be alerted about update events or update values, or their tolerance to delays or stale information. The challenge is to deliver relevant data to a client at the desired time, while conserving system resources. We consider a number of scenarios including RSS news feeds, stock prices and auctions on the commercial Internet, and scientific data sets and Grid computational resources.
We consider the architecture of a proxy server that manages a set of user profiles specified with respect to a set of remote autonomous servers. With this class of problems, user needs are set as the constraining factor of the problem, while resource consumption is dynamic and changes with needs. We present an optimal algorithm in the OptMon2 class, namely Satisfy User Profiles (SUP). SUP is simple yet powerful in its ability to generate optimal scheduling of pull requests. SUP is an online algorithm; at each time point, it can get additional requests for resource monitoring. Through formal analysis, we identify sufficient conditions for SUP to be optimal given a set of updates to resources.
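The post does not give the steps of SUP itself, so the following Python sketch should not be read as the SUP algorithm. It shows the classic greedy rule for a simplified offline version of the same goal: if every profile needs one pull request inside a known time window, probing as late as possible in the earliest-ending unsatisfied window yields a minimum number of probes. The window representation and the name min_probes are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: each profile needs one probe inside a closed window (start, end).
Window = Tuple[int, int]

def min_probes(windows: List[Window]) -> List[int]:
    """Return a smallest list of probe times that hits every window."""
    probes: List[int] = []
    for start, end in sorted(windows, key=lambda w: w[1]):
        # Windows are visited by increasing end time. If the latest probe
        # already lies inside this window, no new probe is needed.
        if probes and start <= probes[-1]:
            continue
        # Otherwise probe at the last possible moment of this window, which
        # gives the probe the best chance of also serving later windows.
        probes.append(end)
    return probes

print(min_probes([(1, 4), (2, 6), (5, 8), (7, 9)]))  # [4, 8]
```

Here four windows are served by two probes, at times 4 and 8. An online algorithm such as SUP cannot see all windows in advance, which is why the post speaks of sufficient conditions for optimality rather than an unconditional guarantee.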
Advantage
An advantage of using a simulation rather than a prototype of a streaming data warehouse is the ability to perform a very large number of tests in reasonable time and under precisely controlled conditions. Excessive probing of the remote servers may increase their load and hurt their performance. Clearly, minimizing the number of probes to such a source is important to keep probing costs low.