14-02-2013, 12:54 PM
Query Planning for Continuous Aggregation Queries over a Network of Data Aggregators
Abstract—
Continuous queries are used to monitor changes to time-varying data and to provide results useful for online
decision making. Typically, a user wants the value of some aggregation function over distributed data items, for
example, the value of a client's portfolio, or the AVG of the temperatures sensed by a set of sensors. In these queries, a client
specifies a coherency requirement as part of the query. We present a low-cost, scalable technique to answer continuous
aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data
aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic web-page are served by
one or more nodes of a content distribution network, our technique involves decomposing a client query into sub-queries and
executing sub-queries on judiciously chosen data aggregators with their individual sub-query incoherency bounds. We provide a
technique for finding the optimal set of sub-queries, together with their incoherency bounds, that satisfies the client query’s coherency
requirement with the least number of refresh messages sent from aggregators to the client. To estimate the number of refresh
messages, we build a query cost model that estimates the number of messages required to satisfy the client-specified
incoherency bound. Performance results using real-world traces show that our cost-based query planning leads to
queries being executed using less than one third the number of messages required by existing schemes.
Index Terms—Algorithms, Continuous queries, Distributed query processing, Data dissemination, Coherency, Performance.
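
To make the idea of sub-queries with individual incoherency bounds concrete, here is a minimal Python sketch. It is not the paper's algorithm or cost model: the function names, the tiny integer traces, and the simple push rule (an aggregator sends a refresh only when the value drifts from the last value sent by more than its bound) are assumptions made for illustration. It assumes an additive query (a sum of two data items), for which sub-query bounds that add up to no more than the client's bound keep the overall result within that bound.

```python
from typing import List, Sequence, Tuple


def count_refreshes(trace: Sequence[int], bound: int) -> int:
    """Count pushes needed to keep a client's copy within `bound` of the source.

    Assumed push rule: a refresh is sent only when the current value differs
    from the last value sent by more than `bound`.
    """
    if not trace:
        return 0
    last_sent = trace[0]  # the initial value is assumed to be known already
    refreshes = 0
    for value in trace[1:]:
        if abs(value - last_sent) > bound:
            last_sent = value
            refreshes += 1
    return refreshes


def sum_traces(*traces: Sequence[int]) -> List[int]:
    """Point-wise sum of several traces (the value of an additive sub-query)."""
    return [sum(values) for values in zip(*traces)]


def plan_cost(plan: Sequence[Tuple[Sequence[int], int]], client_bound: int) -> int:
    """Total refresh messages for a plan given as (sub-query trace, bound) pairs.

    For an additive query, the sub-query bounds must not add up to more than
    the client's incoherency bound.
    """
    if sum(bound for _, bound in plan) > client_bound:
        raise ValueError("sub-query bounds exceed the client's incoherency bound")
    return sum(count_refreshes(trace, bound) for trace, bound in plan)


# Hypothetical traces for two data items, for example two stock prices.
item_a = [100, 104, 112, 113, 125]
item_b = [200, 198, 191, 190, 182]
CLIENT_BOUND = 20

# Plan 1: two aggregators each serve one item, splitting the bound 10 + 10.
split_plan = [(item_a, 10), (item_b, 10)]

# Plan 2: one aggregator serves the whole sum with the full bound of 20.
single_plan = [(sum_traces(item_a, item_b), CLIENT_BOUND)]

print(plan_cost(split_plan, CLIENT_BOUND))   # 3
print(plan_cost(single_plan, CLIENT_BOUND))  # 0
```

In this toy trace the two items move in opposite directions, so a single aggregator serving the whole sum sends no refreshes while the split plan sends three. Both plans respect the same client bound, which is why the choice of sub-queries and their bounds affects the message count.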