26-12-2012, 01:47 PM
QUERY PLANNING FOR CONTINUOUS AGGREGATION QUERIES OVER A NETWORK OF DATA AGGREGATORS
ABSTRACT:
Continuous queries are used to monitor changes to time-varying data and to provide results useful for online decision making. Typically, a user wants the value of some aggregation function over distributed data items, for example, the value of a client's portfolio or the average (AVG) of the temperatures sensed by a set of sensors. In these queries, the client specifies a coherency requirement as part of the query. We present a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network, each data aggregator serves a set of data items at specific coherencies. Just as the various fragments of a dynamic web page are served by one or more nodes of a content distribution network, our technique decomposes a client query into sub-queries and executes each sub-query on a judiciously chosen data aggregator with its own sub-query incoherency bound.
We provide a technique for finding the optimal set of sub-queries, with their incoherency bounds, that satisfies the client query's coherency requirement with the least number of refresh messages sent from aggregators to the client. To estimate the number of refresh messages, we build a query cost model that estimates the number of messages required to satisfy the client-specified incoherency bound. Performance results using real-world traces show that our cost-based query planning executes queries using less than one third of the messages required by existing schemes.
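The idea of splitting a client query into sub-queries with individual incoherency bounds can be illustrated with a small sketch. It is not the planning technique from the paper: it assumes that the sub-query results are simply added together, so that their worst-case errors add up, and it splits the client bound evenly rather than optimally. The names SubQuery, split_bound_evenly and plan_is_valid, and the aggregator and data item names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubQuery:
    aggregator: str   # hypothetical name of the data aggregator that runs it
    weights: dict     # data item -> weight of that item in the client query
    bound: float      # incoherency bound assigned to this sub-query


def split_bound_evenly(assignments, client_bound):
    """Give every sub-query an equal share of the client's incoherency bound.

    An even split is only a placeholder assumption; an optimizer would
    choose the shares to minimize the number of refresh messages.
    """
    share = client_bound / len(assignments)
    return [SubQuery(agg, weights, share) for agg, weights in assignments]


def plan_is_valid(sub_queries, client_bound):
    """Assumed validity check: the sub-query bounds must sum to at most
    the client's bound, because the errors of added results add up."""
    return sum(sq.bound for sq in sub_queries) <= client_bound


# A weighted-sum query over items x, y, z, split across two aggregators.
plan = split_bound_evenly(
    [("a1", {"x": 2.0, "y": 1.0}), ("a2", {"z": 3.0})],
    client_bound=80.0,
)
```

Here each of the two sub-queries receives a bound of 40.0, and plan_is_valid(plan, 80.0) holds. A cost-based planner would instead compare many such splits and aggregator choices using the cost model.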
EXISTING SYSTEM:
Many data-intensive applications delivered over the Web suffer from performance and scalability issues. Content distribution networks (CDNs) solved the problem for static content by using caches at the edge nodes of the network. CDNs continue to evolve to serve more and more dynamic applications. Static fragments are served from the local caches, whereas dynamic fragments are created either from the cached data or by fetching the data items from the origin data sources. One important question when satisfying client requests through a network of nodes is how to select the best node(s) to satisfy the request. For requests for static page content, proximity to the client and load on the nodes are the parameters generally used to select the appropriate node.
Disadvantage:
1. Node selection by proximity and load does not account for data items that need to be refreshed at a given incoherency bound.
2. The exact data value at the corresponding data source need not be reported as long as the query result satisfies user specified accuracy requirements.
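The role of an incoherency bound can be sketched as a simple filter: a new value is sent only when it differs from the last value sent by more than the bound. The function name refreshes_needed and the sample trace are hypothetical and for illustration only; this is not the cost model from the paper.

```python
def refreshes_needed(values, bound):
    """Count the refresh messages for a stream of source values.

    The first value is assumed to be known to the receiver already.
    A refresh is sent whenever the current value deviates from the
    last value sent by more than `bound`.
    """
    last_sent = values[0]
    messages = 0
    for value in values[1:]:
        if abs(value - last_sent) > bound:
            last_sent = value
            messages += 1
    return messages


# Made-up trace of a data item, e.g. a stock price.
trace = [100.0, 100.4, 101.2, 100.9, 102.5, 102.6]

tight = refreshes_needed(trace, 0.2)   # tight bound, more messages
loose = refreshes_needed(trace, 1.0)   # looser bound, fewer messages
```

On this trace the tight bound triggers four refreshes and the looser bound two, which shows why a scheme that tolerates bounded inaccuracy can send fewer messages.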
PROPOSED SYSTEM:
Continuous queries are used to monitor changes to time-varying data and to provide results useful for online decision making. In this paper, we present a low-cost, scalable technique to answer continuous aggregation queries using a content distribution network of dynamic data items.
Advantage:
1. It saves time and lowers the cost to the user.
2. A continuous query cost model is provided, which can be used to estimate the number of messages required to satisfy the client-specified incoherency bound.
3. We present implementations of continuous aggregation with optimized query planning.