01-12-2012, 04:31 PM
A regression-based approach for mining user movement patterns from
random sample data
Abstract
Mobile computing systems usually express a user movement trajectory as a sequence of areas
that capture the user movement trace. Given a set of user movement trajectories, user
movement patterns refer to the sequences of areas through which a user frequently travels. In
an attempt to obtain user movement patterns for mobile applications, prior studies explore the
problem of mining user movement patterns from the movement logs of mobile users. These
movement logs generate a data record whenever a mobile user crosses base station coverage
areas. However, existing systems do not maintain this type of movement log, so collecting it
incurs extra overhead. By exploiting an existing log, namely, call detail records, this article proposes a
Regression-based approach for mining User Movement Patterns (abbreviated as RUMP). This
approach views call detail records as random sample trajectory data, and thus, user movement
patterns are represented as movement functions in this article. We propose algorithm LS
(standing for Large Sequence) to extract the call detail records that capture frequent user
movement behaviors. By exploring the spatio-temporal locality of continuous movements (i.e.,
a mobile user is likely to be in nearby areas if the time interval between consecutive calls is
small), we develop algorithm TC (standing for Time Clustering) to cluster call detail records.
Then, by utilizing regression analysis, we develop algorithm MF (standing for Movement
Function) to derive movement functions. Experimental studies involving both synthetic and
real datasets show that RUMP is able to derive user movement functions close to the frequent
movement behaviors of mobile users.
Introduction
Mobile services, such as navigation services, mobile search and location-aware services, are becoming very popular. These
wireless communication systems enable users to access various kinds of information from anywhere at any time. A mobile
computing system usually expresses a user movement trajectory as a sequence of areas in which the mobile user moves. In this
article, we aim at mining user movement patterns for a mobile user. Thus, given a user's set of movement trajectories, user
movement patterns refer to the sequences of areas through which this user frequently travels. Analysis of user trajectory data can
help in understanding and managing moving objects [1,2]. User movement patterns can be used to improve system
performance, for example in designing personal paging areas [3] and in developing data allocation strategies [4–6], querying strategies [7],
and navigation services [8,9].
Extracting frequent movement behaviors from CDRs
As mentioned before, user movement patterns refer to the frequent movement behaviors of mobile users. However, the CDR
logs not only contain frequent user movement behaviors, but also include infrequent movement behaviors. For example, a user
usually goes to his office and returns home every weekday (as Fig. 1(a), (b) and (c) show), and occasionally takes a trip (as
Fig. 1(d) shows). The frequent movement behavior is the trajectory from his home to his office, whereas a trip is an infrequent
movement behavior. Since regression analysis is sensitive to these infrequent CDRs, they should be eliminated. In other words, the
call detail records that capture the frequent movement behaviors of users should be extracted. To extract the frequent movement
behaviors of mobile users, we develop algorithm LS (standing for Large Sequence) to extract base stations whose coverage areas
are frequently visited by users.
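
The filtering idea can be illustrated with a simple support count. This is a minimal sketch, not the paper's algorithm LS: it assumes each day's CDRs are reduced to hypothetical (time slot, base station) pairs, and it keeps only the pairs that occur on at least a chosen fraction of days. The function name `frequent_cdrs` and the `min_support` parameter are illustrative assumptions.

```python
from collections import Counter

def frequent_cdrs(days, min_support):
    """Keep (slot, station) pairs seen on at least min_support of the days.

    days: list of per-day lists of (time_slot, station_id) call records.
    min_support: fraction of days (0..1) a pair must appear on to be kept.
    """
    counts = Counter()
    for day in days:
        # Count each (slot, station) pair at most once per day.
        counts.update(set(day))
    threshold = min_support * len(days)
    return sorted(pair for pair, n in counts.items() if n >= threshold)

# Three days of hypothetical records; the 14:00 call at "Z" is a one-off trip.
days = [
    [(8, "A"), (9, "B"), (18, "A")],
    [(8, "A"), (9, "B"), (18, "A")],
    [(8, "A"), (9, "B"), (14, "Z")],
]
print(frequent_cdrs(days, min_support=0.5))
```

With a support of 0.5, the commute records survive and the single record at station "Z" is dropped, which is the effect the regression step relies on.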
Determining the number of regression functions
Once CDRs that capture the frequent movement behaviors have been extracted, it is necessary to determine how many
regression functions are needed. If only one regression function is derived, it may not be very close to the frequent user movement
behavior. Thus, given a set of call detail records of the frequent movement behavior, clustering techniques can be used to divide
call detail records into several groups. The number of groups is viewed as the number of regression functions. The movement
trajectories of mobile users generally follow spatio-temporal locality (i.e., if the time interval between two consecutive calls of a
mobile user is small, the mobile user is likely to have moved nearby). Therefore, algorithm TC (standing for Time Clustering)
exploits spatio-temporal locality to group the call detail records whose occurrence times are close.
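
Grouping records by occurrence time can be sketched as a gap-based split. This is an illustrative simplification, not the paper's algorithm TC: it assumes records are hypothetical (time, base station) pairs and starts a new cluster whenever the gap to the previous record exceeds an assumed threshold `max_gap`.

```python
def cluster_by_time(records, max_gap):
    """Split (time, station_id) records into clusters of nearby times.

    A new cluster starts whenever the gap between consecutive records,
    taken in time order, is larger than max_gap.
    """
    clusters = []
    for rec in sorted(records):
        if clusters and rec[0] - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append(rec)   # close in time: same cluster
        else:
            clusters.append([rec])     # large gap: open a new cluster
    return clusters

# Hypothetical morning and evening calls, with times in hours.
records = [(8.0, "A"), (8.5, "B"), (9.0, "C"), (17.5, "C"), (18.0, "B")]
clusters = cluster_by_time(records, max_gap=1.0)
print(len(clusters))  # number of regression functions to fit
```

Here the morning and evening calls fall into two clusters, so two regression functions would be fitted, one per cluster.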
Deriving movement functions
Location identification techniques typically use one of two location models: the geometric model and the symbolic model [11].
The geometric model specifies the location in n-dimensional coordinates (typically n=2 or 3). The symbolic model, however, uses
logical entities to describe the location. This article represents the location of mobile users in CDRs using the symbolic model (i.e.,
the base station identification). To derive movement functions of a mobile user, the location of the call detail records in the
symbolic model must be transformed into the geometric model. Then, with the cluster results obtained, we develop algorithm MF
(standing for Movement Function) for each cluster. This algorithm utilizes weighted regression analysis to derive the
corresponding movement functions of a user.
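
The two steps above (symbolic-to-geometric conversion, then weighted regression per cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm MF: the station coordinates in `STATION_XY` are made up, the movement function is assumed to be linear in time, and the per-record weights are hypothetical (for example, how often a record was observed).

```python
def weighted_line_fit(ts, vs, ws):
    """Weighted least squares fit of v = a + b * t; returns (a, b).

    Requires at least two distinct time values, otherwise the slope is
    undefined (division by zero).
    """
    sw = sum(ws)
    t_mean = sum(w * t for w, t in zip(ws, ts)) / sw
    v_mean = sum(w * v for w, v in zip(ws, vs)) / sw
    sxx = sum(w * (t - t_mean) ** 2 for w, t in zip(ws, ts))
    sxy = sum(w * (t - t_mean) * (v - v_mean) for w, t, v in zip(ws, ts, vs))
    b = sxy / sxx
    return v_mean - b * t_mean, b

# Hypothetical mapping from base station IDs (symbolic) to coordinates (geometric).
STATION_XY = {"A": (0.0, 0.0), "B": (2.0, 1.0), "C": (4.0, 2.0)}

def movement_function(cluster):
    """cluster: list of (time, station_id, weight); returns f(t) -> (x, y)."""
    ts = [t for t, _, _ in cluster]
    ws = [w for _, _, w in cluster]
    xs = [STATION_XY[s][0] for _, s, _ in cluster]
    ys = [STATION_XY[s][1] for _, s, _ in cluster]
    ax, bx = weighted_line_fit(ts, xs, ws)   # x as a function of time
    ay, by = weighted_line_fit(ts, ys, ws)   # y as a function of time
    return lambda t: (ax + bx * t, ay + by * t)

cluster = [(8.0, "A", 3), (8.5, "B", 2), (9.0, "C", 3)]
f = movement_function(cluster)
print(f(8.25))  # estimated position between two calls
```

Fitting x and y separately against time keeps the sketch short; a higher-order polynomial could be substituted if a cluster's trajectory is clearly not linear.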
The RUMP approach consists of a series of algorithms that tackle the various issues described above. This study evaluates RUMP
performance using both synthetic and real datasets. Sensitivity analysis is conducted on several design parameters. Experimental
results show that RUMP is able to efficiently and effectively derive user movement patterns that capture the frequent movement
behaviors of mobile users.
Related works
The problem of mining user movement patterns has attracted a considerable amount of research effort. Prior studies are
generally classified into two categories based on their definitions of user movement patterns: spatial movement patterns and
spatio-temporal movement patterns. In the first category, a user movement pattern refers to a sequence consisting of base station
identifications or pre-defined regions. In the second category, user movement patterns represent the spatio-temporal associated
relationships among base station identifications or pre-defined regions.
In the first category, the authors in [12] proposed an information-theoretic method to mine user movement patterns and
represented them in a trie data structure. Moreover, the authors in [3] proposed a statistical approach to mine user movement
patterns. The authors of [13] and [4] proposed a data mining approach for mining user movement patterns based on the
movement logs of mobile users.
In the second category, user movement patterns are usually extracted from user trajectories, where trajectories are detailed
user movements. A considerable amount of research effort focuses on mining spatio-temporal association rules [14–19]. The
authors in [20] explored the fuzziness of locations in patterns and developed algorithms to discover spatio-temporal sequential
patterns. Furthermore, the authors in [21] proposed a clustering-based approach to discover movement regions within time
intervals. In [22], the authors developed a hybrid prediction model, consisting of vector-based and pattern-based models, to
predict user movements. In [23] and [24], the authors exploited temporal annotated sequences in which sequences are associated
with time information (i.e., transition times between two movements).
Conclusions
User movement patterns can benefit many mobile design schemes and applications, including designing a
paging area, developing data allocation schemes, devising querying strategies, and offering navigation services. This article
proposes a regression-based approach called RUMP for mining user movement patterns from call detail records. To fully exploit the
fragmented spatio-temporal information hidden in such trajectories, the proposed regression-based solution discovers user
movement patterns. The RUMP approach uses three algorithms. First, algorithm LS extracts CDRs that reflect the frequent
movement behaviors of mobile users. By capturing similar movement sequences from call detail records, an aggregation
movement sequence is computed to represent the frequent movement behaviors of mobile users in each time slot. The feature of
spatio-temporal locality states that if the time interval between consecutive calls is small, the mobile user is likely to have moved
nearby. By exploring this feature, algorithm TC is able to determine the number of regression functions properly by clustering
those movement records in an aggregation movement sequence whose times of occurrence are very close. For each cluster of the
aggregation movement sequence, algorithm MF generates the movement functions representing user movement patterns of
mobile users.