21-12-2012, 06:15 PM
PAM: An Efficient and Privacy-Aware Monitoring Framework for Continuously Moving Objects
PAM An Efficient.pdf (Size: 1.95 MB / Downloads: 17)
Abstract
Efficiency and privacy are two fundamental issues in moving object monitoring. This paper proposes a privacy-aware
monitoring (PAM) framework that addresses both issues. The framework distinguishes itself from the existing work by being the first to
holistically address the issues of location updating in terms of monitoring accuracy, efficiency, and privacy, particularly, when and how
mobile clients should send location updates to the server. Based on the notions of safe region and most probable result, PAM performs
location updates only when they would likely alter the query results. Furthermore, by designing various client update strategies, the
framework is flexible and able to optimize accuracy, privacy, or efficiency. We develop efficient query evaluation/reevaluation and safe
region computation algorithms in the framework. The experimental results show that PAM substantially outperforms traditional
schemes in terms of monitoring accuracy, CPU cost, and scalability while achieving close-to-optimal communication cost.
INTRODUCTION
IN mobile and spatiotemporal databases, monitoring continuous
spatial queries over moving objects is needed in
numerous applications such as public transportation, logistics,
and location-based services. Fig. 1 shows a typical
monitoring system, which consists of a base station, a
database server, application servers, and a large number of
moving objects (i.e., mobile clients). The database server
manages the location information of the objects. The application
servers gather monitoring requests and register spatial
queries at the database server, which then continuously
updates the query results until the queries are deregistered.
The fundamental problem in a monitoring system is
when and how a mobile client should send location updates
to the server because it determines three principal performance
measures of monitoring—accuracy, efficiency, and
privacy. Accuracy means how often the monitored results
are correct, and it heavily depends on the frequency and
accuracy of location updates. As for efficiency, two
dominant costs are: the wireless communication cost for
location updates and the query evaluation cost at the
database server, both of which depend on the frequency of
location updates. As for privacy, the accuracy of location
updates determines how much the client’s privacy is
exposed to the server.
RELATED WORK
There is a large body of research work on spatial temporal
query processing. Early work assumed a static data set and
focused on efficient access methods (e.g., R-tree [19]) and
query evaluation algorithms (e.g., [20], [37]). Recently, a lot
of attention has been paid to moving-object databases,
where data objects or queries or both of them move.
Assuming that object movement trajectories are known a
priori, Saltenis et al. [38] proposed the Time-Parameterized
R-tree (TPR-tree) for indexing moving objects, where the
location of a moving object is represented by a linear
function of time. Benetis et al. [3] developed query
evaluation algorithms for NN and reverse NN search based
on the TPR-tree. Tao et al. [41] optimized the performance
of the TPR-tree and extended it to the TPR-tree. Chon et al.
[10] studied range and kNN queries based on a grid model.
Patel et al. [34] proposed a novel index structure called
STRIPES using a dual transformation technique.
The work on monitoring continuous spatial queries can
be classified into two categories. The first category assumes
that the movement trajectories are known. Continuous kNN
monitoring has been investigated for moving queries over
stationary objects [40] and linearly moving objects [22], [36].
Iwerks et al. [22] extended to monitor distance semijoins for
two linearly moving data sets [23]. However, as pointed out
in [39], the known-trajectory assumption does not hold for
many application scenarios (e.g., the velocity of a car
changes frequently on road).
FUNDAMENTALS OF PAM FRAMEWORK
Privacy-Aware Location Model
In this paper, we assume that the clients are privacy
conscious. That is, the clients do not want to expose their
genuine point locations to the database server to avoid
spatiotemporal correlation inference attack [14], by which an
adversary may infer users’ private information such as
political affiliations, alternative lifestyles, or medical problems.
For example, knowing that a user is inside a heart
specialty clinic during business hours, the adversary can
infer that the user might have a heart problem. This has been
cited as a major privacy threat in location-based services and
mobile computing. To protect against it, most existing work
suggests replacing accurate point locations by bounding
boxes to reduce location resolutions [18], [14], [31], [26], [7],
[17]. With a large enough location box covering the sensitive
place (e.g., the clinic) as well as a good number of other
insensitive places, the success rate or confidence of such
spatiotemporal correlation inference can be reduced significantly.
In our monitoring framework, we take the same
privacy-aware approach. Specifically, each time a client
detects his/her genuine point location, it is encapsulated
into a bounding box. Then, the client-side location updater
decides whether or not to update that box to the server.1
Without any other knowledge about the client locations or
moving patterns, upon receiving such a box, the server can
only presume that the genuine point location is distributed
uniformly in this box.
The Object Index
The object index is the server-side view on all objects. More
specifically, to evaluate queries, the server must store the
spatial range, in the form of a bounding box, within which
each object can possibly locate. Note that this bounding box
is different from a -square because its shape also depends
on the client-side location updater. That is, it must be a
function (denoted by ) of the last updated -square and
the safe region. As such, this box is called a bbox as a mark
of distinction. In particular, for the standard update
strategy, the bbox is the safe region enlarged by =2 on
each side, or formally, the “Minkowski sum”2 of the safe
region and a =2-square.
With the same rationale for which we assume the
genuine point location of an updating object to distribute
uniformly in the -square, we assume that the genuine point
locations are distributed uniformly in their respective bboxes
when queries are evaluated or reevaluated. The object index
is built on the bboxes to speed up the evaluation. While
many spatial index structures can serve this purpose, this
paper employs the R-tree index [2], [19], which is most
widely adopted in the literature. Since the bbox changes each
time the object updates, the index is optimized to handle
frequent updates [29].
Query Processor and Location Manager
In the PAM framework, based on the object index, the query
processor evaluates the most probable result when a new
query is registered, or reevaluates the most probable result
when a query is affected by location updates. Obviously,
the reevaluation is more efficient as it can be based on
previous results. The detailed algorithms of query evaluation
and reevaluation will be presented in Section 4.
The location manager computes the safe region of an
object p (denoted as pr). Recall that a safe region is a
rectangle within which the change of the centroid of
p’s -square does not change the most probable result of
any registered query. As queries are independent of each
other, we can further define the safe region for a single
query Q (denoted as prQ) as a rectangle in which the
change of the centroid of p’s -square does not change the
most probable result of Q. By this definition, prQ is a
rectangular approximation, or more accurately an inscribed
rectangle, of Q’s inner (if p is a result object) or outer (if p is
a nonresult object) regions, which are separated by the
quarantine line. The reason why the safe region is based on
the quarantine line rather than the quarantine area is that
the latter is much coarser. Furthermore, the quarantine area
is used only to filter out the queries that are not affected by
a location update, so we trade accuracy for efficiency. The
safe region, on the other hand, directly dictates the
frequency, and hence, the cost of location updates, so we
compute it based on the more accurate quarantine line.
Safe Region for Range Query
We first consider the case when object p is a result object.
Fig. 6b shows an example of range query where q is the
centroid of the query. The gray box shows the -square of p.
Without loss of generality, let us consider the first quadrant,
and let the same p (ðx; yÞ) denote the centroid of the
-square. Fig. 6a is the close-up image of Fig. 6b. According
to the definition of the most probable result, more than half
of the -square must reside in the query window. To obtain
the quarantine line, we only need to consider the special
case when exactly half of the square resides in the query
window, which can be further divided into two subcases.