27-06-2012, 03:46 PM
Exploring Application-Level Semantics for Data Compression
Exploring Application Level Semantics for Data Compression.pdf (Size: 2.35 MB / Downloads: 35)
INTRODUCTION
RECENT advances in location-acquisition technologies,
such as global positioning systems (GPSs) and wireless
sensor networks (WSNs), have fostered many novel
applications like object tracking, environmental monitoring,
and location-dependent service. These applications generate
a large amount of location data, and thus, lead to
transmission and storage challenges, especially in resourceconstrained
environments like WSNs. To reduce the data
volume, various algorithms have been proposed for data
compression and data aggregation [1], [2], [3], [4], [5], [6].
However, the above works do not address application-level
semantics, such as the group relationships and movement
patterns, in the location data.
Movement Pattern Mining
Agrawal and Srikant [18] first defined the sequential
pattern mining problem and proposed an Apriori-like
algorithm to find the frequent sequential patterns. Han et
al. consider the pattern projection method in mining
sequential patterns and proposed FreeSpan [19], which is
an FP-growth-based algorithm. Yang and Hu [9] developed
a new match measure for imprecise trajectory data and
proposed TrajPattern to mine sequential patterns. Many
variations derived from sequential patterns are used in
various applications, e.g., Chen et al. [20] discover path
traversal patterns in a Web environment, while Peng and
Chen [21] mine user moving patterns incrementally in a
mobile computing system.
Clustering
Recently, clustering based on objects’ movement behavior
has attracted more attention. Wang et al. [14] transform the
location sequences into a transaction-like data on users and
based on which to obtain a valid group, but the proposed
AGP and VG growth are still Apriori-like or FP-growthbased
algorithms that suffer from high computing cost and
memory demand. Nanni and Pedreschi [15] proposed a
density-based clustering algorithm, which makes use of an
optimal time interval and the average euclidean distance
between each point of two trajectories, to approach the
trajectory clustering problem.
PRELIMINARIES
Network and Location Models
Many researchers believe that a hierarchical architecture
provides better coverage and scalability, and also extends
the network lifetime of WSNs [25], [26]. In a hierarchical
WSN, such as that proposed in [27], the energy, computing,
and storage capacity of sensor nodes are heterogeneous. A
high-end sophisticated sensor node, such as Intel Stargate
[28], is assigned as a cluster head (CH) to perform high
complexity tasks; while a resource-constrained sensor node,
such as Mica2 mote [29], performs the sensing and low
complexity tasks.
EXPERIMENT AND ANALYSIS
We implement an event-driven simulator in C++ with SIM
[56] to evaluate the performance of our design. To the best of
our knowledge, no research work has been dedicated to
discovering application-level semantic for location data
compression. We compare our batch-based approach with
an online approach for the overall system performance
evaluation and study the impact of the group size (n), as
well as the group dispersion radius (GDR ), the batch period
(D), and the error bound of accuracy (eb). We also compare
our Replace algorithm with Huffman encoding technique to
show its effectiveness.