05-10-2016, 03:10 PM
Abstract—Existing social networking services recommend friends to users based on their social graphs, which may not be the most
appropriate to reflect a user’s preferences on friend selection in real life. In this paper, we present Friendbook, a novel semantic-based
friend recommendation system for social networks, which recommends friends to users based on their life styles instead of social
graphs. By taking advantage of sensor-rich smartphones, Friendbook discovers life styles of users from user-centric sensor data,
measures the similarity of life styles between users, and recommends friends to users if their life styles have high similarity. Inspired by
text mining, we model a user’s daily life as life documents, from which his/her life styles are extracted by using the Latent Dirichlet
Allocation algorithm. We further propose a similarity metric to measure the similarity of life styles between users, and calculate users’
impact in terms of life styles with a friend-matching graph. Upon receiving a request, Friendbook returns a list of people with highest
recommendation scores to the query user. Finally, Friendbook integrates a feedback mechanism to further improve the
recommendation accuracy. We have implemented Friendbook on the Android-based smartphones, and evaluated its performance on
both small-scale experiments and large-scale simulations. The results show that the recommendations accurately reflect the
preferences of users in choosing friends.
1 INTRODUCTION
TWENTY years ago, people typically made friends with
others who live or work close to themselves, such as
neighbors or colleagues. We call friends made in this
traditional fashion G-friends, which stands for geographical
location-based friends because they are influenced by
the geographical distances between each other. With the
rapid advances in social networks, services such as Facebook,
Twitter and Google+ have provided us revolutionary
ways of making friends. According to Facebook statistics, a
user has an average of 130 friends, perhaps more than at any
other time in history [2].
One challenge with existing social networking services is
how to recommend a good friend to a user. Most of them
rely on pre-existing user relationships to pick friend candidates.
For example, Facebook relies on a social link analysis
among those who already share common friends and recommends
symmetrical users as potential friends. Unfortunately,
this approach may not be the most appropriate based on recent sociology findings [16], [27], [29], [30]. According
to these studies, the rules to group people together include:
1) habits or life style; 2) attitudes; 3) tastes; 4) moral standards;
5) economic level; and 6) people they already know.
Apparently, rule #3 and rule #6 are the mainstream factors
considered by existing recommendation systems. Rule #1,
although probably the most intuitive, is not widely used
because users’ life styles are difficult, if not impossible, to
capture through web actions. Rather, life styles are usually
closely correlated with daily routines and activities. Therefore,
if we can gather information on users’ daily routines
and activities, we can exploit rule #1 and recommend
friends to people based on their similar life styles. This recommendation
mechanism can be deployed as a standalone
app on smartphones or as an add-on to existing social network
frameworks. In both cases, Friendbook can help mobile
phone users find friends either among strangers or within a
certain group as long as they share similar life styles.
In our everyday lives, we may have hundreds of activities,
which form meaningful sequences that shape our lives.
In this paper, we use the word activity to specifically refer to
the actions taken in the order of seconds, such as “sitting”,
“walking”, or “typing”, while we use the phrase life style to
refer to higher-level abstractions of daily lives, such as
“office work” or “shopping”. For instance, the “shopping”
life style mostly consists of the “walking” activity, but may
also contain the “standing” or the “sitting” activities.
To model daily lives properly, we draw an analogy
between people’s daily lives and documents, as shown in
Fig. 1. Previous research on probabilistic topic models in
text mining has treated documents as mixtures of topics,
and topics as mixtures of words [10]. Inspired by this,
we can treat our daily lives (or life documents) as a mixture
of life styles (or topics), and each life style as a mixture
of activities (or words). In essence, we
represent daily lives with “life documents”, whose semantic
meanings are reflected through their topics, which are life
styles in our study. Just as words serve as the basis of
documents, people’s activities naturally serve as the primitive
vocabulary of these life documents.
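To make the analogy concrete, the following is a minimal sketch, assuming a hypothetical stream of recognized activity labels. The function names and the sample labels are illustrative and do not come from the paper. It represents a life document by its activity counts, in the same way a text document can be represented by its word counts.

```python
from collections import Counter


def build_life_document(activities):
    """Count how often each activity label occurs in a stream of labels."""
    return Counter(activities)


def activity_frequencies(life_document):
    """Normalize activity counts into relative frequencies that sum to 1."""
    total = sum(life_document.values())
    if total == 0:
        return {}
    return {activity: count / total for activity, count in life_document.items()}


# Hypothetical labels for a short stretch of one user's day (assumption).
observed = ["sitting", "typing", "typing", "walking", "sitting", "typing"]
doc = build_life_document(observed)
freqs = activity_frequencies(doc)
print(doc.most_common())
print(freqs)
```

Counts discard the order of activities, which is the same simplification that word-count representations make for text.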
Our proposed solution is also motivated by the recent
advances in smartphones, which have become more and
more popular in people’s lives. These smartphones (e.g.,
iPhone or Android-based smartphones) are equipped with
a rich set of embedded sensors, such as GPS, accelerometer,
microphone, gyroscope, and camera. Thus, a smartphone is
no longer simply a communication device, but also a powerful
environmental sensing platform from which
we can extract rich context- and content-aware information.
From this perspective, smartphones serve as the ideal platform
for sensing daily routines from which people’s life
styles could be discovered.
In spite of the powerful sensing capabilities of smartphones,
there are still multiple challenges for extracting users’
life styles and recommending potential friends based on their
similarities. First, how to automatically and accurately discover
life styles from noisy and heterogeneous sensor data?
Second, how to measure the similarity of users in terms of life
styles? Third, who should be recommended to the user
among all the friend candidates? To address these challenges,
in this paper, we present Friendbook, a semantic-based friend
recommendation system based on sensor-rich smartphones.
The contributions of this work are summarized as follows:
1) To the best of our knowledge, Friendbook is the first
friend recommendation system exploiting a user’s life
style information discovered from smartphone sensors.
2) Inspired by achievements in the field of text mining,
we model the daily lives of users as life documents
and use the probabilistic topic model to extract life
style information of users.
3) We propose a unique similarity metric to characterize
the similarity of users in terms of life styles and
then construct a friend-matching graph to recommend
friends to users based on their life styles.
4) We integrate a linear feedback mechanism that
exploits the user’s feedback to improve recommendation
accuracy.
5) We conduct both small-scale experiments and large-scale
simulations to evaluate the performance of our
system. Experimental results demonstrate the effectiveness
of our system.
The rest of the paper is organized as follows. Section 2
discusses related work. Section 3 provides the high-level
overview of Friendbook. Section 4 presents activity recognition
and life style modeling and extraction. In Section 5,
we describe the social graph construction and user impact
estimation. We elaborate on the user query and friend recommendation
in Section 6. We describe the feedback
mechanism in Section 7. In Section 8, we evaluate the performance
of Friendbook extensively with both simulations
and real experiments. Finally, we conclude the paper and
discuss future work in Section 9.
2 RELATED WORK
Recommendation systems that try to suggest items (e.g.,
music, movies, and books) to users have become more and
more popular in recent years. For instance, Amazon [1] recommends
items to a user based on items the user previously
visited, and items that other users are looking at.
Netflix [3] and Rotten Tomatoes [4] recommend movies to a
user based on the user’s previous ratings and watching
habits. Recently, with the advance of social networking systems,
friend recommendation has received a lot of attention.
Generally speaking, existing friend recommendation mechanisms
in social networking systems, e.g., Facebook, LinkedIn and
Twitter, recommend friends to users if, according to their
social relations, they share common friends.
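The common-friends idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual algorithm of Facebook, LinkedIn or Twitter: the friendship graph is a hypothetical dictionary of adjacency sets, and candidates are ranked purely by how many friends they share with the query user.

```python
from collections import Counter


def recommend_by_common_friends(graph, user, top_k=3):
    """Rank non-friends of `user` by the number of friends they share."""
    friends = graph.get(user, set())
    shared = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone who is already a friend.
            if candidate != user and candidate not in friends:
                shared[candidate] += 1
    # Sort by shared-friend count (descending), then by name for stable output.
    ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical undirected friendship graph (assumption, for illustration only).
graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}
print(recommend_by_common_friends(graph, "alice"))
```

Such a ranking depends only on the existing social graph, so a user with no friends yet receives no recommendations at all.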
Meanwhile, other recommendation mechanisms have
also been proposed by researchers. For example, Bian and
Holtzman [8] presented MatchMaker, a collaborative filtering
friend recommendation system based on personality
matching. Kwon and Kim [20] proposed a friend recommendation
method using physical and social context. However,
the authors did not explain what the physical and
social contexts are or how to obtain this information. Yu et al.
[32] recommended geographically related friends in social
networks by combining GPS information and social network
structure. Hsu et al. [18] studied the problem of link recommendation
in weblogs and similar social networks, and proposed
an approach based on collaborative recommendation
using the link structure of a social network and content-based
recommendation using mutual declared interests.
Gou et al. [17] proposed a visual system, SFViz, to help
users explore and find friends interactively under the
context of interest, and reported a case study using the
system to explore the recommendation of friends based on
people’s tagging behaviors in a music community. These
existing friend recommendation systems, however, are significantly
different from our work, as we exploit recent sociology
findings to recommend friends based on their similar
life styles instead of social relations.
Activity recognition serves as the basis for extracting
high-level daily routines (in close correlation with life
styles) from low-level sensor data, which has been widely
studied using various types of wearable sensors. Zheng
et al. [33] used GPS data to understand the transportation
mode of users. Lester et al. [21] used data from wearable
sensors to recognize activities based on the Hidden Markov
Model (HMM). Li et al. [22] recognized static postures and
dynamic transitions by using accelerometers and gyroscopes.
The advance of smartphones enables activity recognition using the rich set of sensors on the smartphones.
Reddy et al. [26] used the built-in GPS and the accelerometer
on the smartphones to detect the transportation mode of
an individual. CenceMe [24] used multiple sensors on the
smartphone to capture a user’s activities, state, habits and
surroundings. SoundSense [23] used the microphone on the
smartphone to recognize general sound types (e.g., music,
voice) and discover user-specific sound events. EasyTracker
[7] used GPS traces collected from smartphones that are
installed on transit vehicles to determine routes served,
locate stops, and infer schedules.
Although a lot of work has been done on activity recognition
using smartphones, there is relatively little work on
discovering daily routines using smartphones. The MIT
Reality Mining project [12] and Farrahi and Gatica-Perez
[14] tried to discover daily location-driven routines from
large-scale location data. They could infer daily routines
such as leaving from home to office and eating at a restaurant.
However, they could not discover the daily routines of
people who are staying at the same location. For instance,
when one stays at home, his/her daily routines like “eating
lunch” and “watching movie” could not be discovered
using location information alone. In [13], Farrahi and
Gatica-Perez took a step further and overcame the shortcoming
of discovering daily routines of people staying in
the same location by considering combined location and
physical proximity sensed by the mobile phone. Another
closely related work was presented in [19], which used a
topic model to extract activity patterns from sensor data.
However, they used two wearable sensors, rather than smartphones,
to discover the daily routines. In our work, we
attempt to use the probabilistic topic model to discover life
styles using the smartphone. We further utilize patterns discovered
from activities as a basis for friend recommendation
that helps users find friends who have similar life
styles. Note that the work in this paper is significantly different
from our preliminary demo work of Friendbook [31]
that recommended friends to users based on the similarity
of pictures taken by users.
3 SYSTEM OVERVIEW
In this section, we give a high-level overview of the Friendbook
system. Fig. 2 shows the system architecture of Friendbook,
which adopts a client-server model where each
client is a smartphone carried by a user and the servers are
data centers or clouds.
On the client side, each smartphone can record data of its
user, perform real-time activity recognition and report the
generated life documents to the servers. It is worth noting
that an offline data collection and training phase is needed
to build an appropriate activity classifier for real-time activity
recognition on smartphones. We spent three months
collecting raw data from eight volunteers to build a large
training data set. As each user typically generates around
50 MB of raw data each day, we chose MySQL as our
low-level data storage platform and Hadoop MapReduce as our
computation infrastructure. After the activity classifier is
built, it will be distributed to each user’s smartphone and
then activity recognition can be performed in real time.
As a user continues to use Friendbook, he/she will
accumulate more and more activities in his/her life documents,
from which we can discover his/her life styles
using the probabilistic topic model.
On the server side, seven modules are designed to fulfill
the task of friend recommendation. The data collection module
collects life documents from users’ smartphones. The
life styles of users are extracted by the life style analysis module
with the probabilistic topic model. Then the life style
indexing module puts the life styles of users into the database
in the format of (life-style, user) instead of (user, life-style).
A friend-matching graph can be constructed accordingly
by the friend-matching graph construction module to
represent the similarity relationship between users’ life
styles. The impacts of users are then calculated based on the
friend-matching graph by the user impact ranking module.
The user query module takes a user’s query and sends a
ranked list of potential friends to the user as response. The
system also allows users to give feedback on the recommendation
results, which can be processed by the feedback control
module. With this module, the accuracy of friend recommendation
can be improved.
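The (life-style, user) format used by the life style indexing module is essentially an inverted index. The following is a minimal sketch of that idea under stated assumptions: the function names, the sample probabilities and the `threshold` cutoff are all hypothetical, and the paper's actual storage layout is not shown in this section.

```python
from collections import defaultdict


def build_life_style_index(user_life_styles, threshold=0.1):
    """Map each life style to the users whose probability for it is >= threshold.

    The threshold is an assumption made here to keep posting lists short.
    """
    index = defaultdict(list)
    for user, distribution in user_life_styles.items():
        for life_style, probability in distribution.items():
            if probability >= threshold:
                index[life_style].append((user, probability))
    # Keep each posting list sorted by probability, highest first.
    for postings in index.values():
        postings.sort(key=lambda entry: (-entry[1], entry[0]))
    return dict(index)


def candidates_for(index, life_style):
    """Return the users indexed under one life style."""
    return [user for user, _ in index.get(life_style, [])]


# Hypothetical per-user life style probabilities (assumption).
user_life_styles = {
    "u1": {"office work": 0.7, "shopping": 0.3},
    "u2": {"office work": 0.05, "shopping": 0.95},
    "u3": {"office work": 0.6, "shopping": 0.4},
}
index = build_life_style_index(user_life_styles)
print(candidates_for(index, "office work"))
```

Indexing by life style rather than by user means that finding everyone who shares a given life style is a single lookup instead of a scan over all users.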
In the following sections, we will elaborate on all the
components of the system.
4 LIFE STYLE EXTRACTION USING TOPIC MODEL
4.1 Life Style Modeling
As stated in Section 1, life styles and activities are reflections
of daily lives at two different levels where daily lives can be
treated as a mixture of life styles and life styles as a mixture
of activities. This is analogous to the treatment of documents
as ensembles of topics and topics as ensembles of
words. By taking advantage of recent developments in the
field of text mining, we model the daily lives of users as life
documents, the life styles as topics, and the activities as words.
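Under this analogy, the standard mixture decomposition used by probabilistic topic models can be written as below. The symbols are introduced here for illustration: $w_i$ denotes an activity, $z_j$ a life style, $d_k$ a life document, and $Z$ the number of life styles.

```latex
p(w_i \mid d_k) = \sum_{j=1}^{Z} p(w_i \mid z_j)\, p(z_j \mid d_k)
```

In words, the probability of observing an activity in a life document is the sum, over all life styles, of how likely each life style is to produce that activity, weighted by how strongly that life style is present in the document.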
Given “documents”, the probabilistic topic model can
discover the probabilities of underlying “topics”. Therefore,
we adopt the probabilistic topic model to discover the probabilities
of hidden “life styles” from the “life documents”.
In probabilistic topic models, word frequency is
particularly important, as words with different frequencies
carry different amounts of information. Following this
observation, we propose the “bag-of-activity” model (Fig. 3)
to replace the original sequences of activities recognized