01-02-2013, 10:20 AM
Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques
Abstract
Recommender systems are becoming increasingly important to individual users and businesses for providing personalized
recommendations. However, while the majority of algorithms proposed in recommender systems literature have focused on improving
recommendation accuracy (as exemplified by the recent Netflix Prize competition), other important aspects of recommendation quality,
such as the diversity of recommendations, have often been overlooked. In this paper, we introduce and explore a number of item
ranking techniques that can generate substantially more diverse recommendations across all users while maintaining comparable
levels of recommendation accuracy. Comprehensive empirical evaluation consistently shows the diversity gains of the proposed
techniques using several real-world rating data sets and different rating prediction algorithms.
INTRODUCTION
In the current age of information overload, it is becoming
increasingly hard to find relevant content. This problem
is not only widespread but also alarming [28]. Over the last
10-15 years, recommender systems technologies have been
introduced to help people deal with these vast amounts of
information [1], [7], [9], [30], [36], [39], and they have been
widely used in research as well as e-commerce applications,
such as the ones used by Amazon and Netflix.
The most common formulation of the recommendation
problem relies on the notion of ratings, i.e., recommender
systems estimate ratings of items (or products) that are yet
to be consumed by users, based on the ratings of items
already consumed. Recommender systems typically try to
predict the ratings of unknown items for each user, often
using other users’ ratings, and recommend top N items
with the highest predicted ratings. Accordingly, there have
been many studies on developing new algorithms that can
improve the predictive accuracy of recommendations.
However, the quality of recommendations can be evaluated
along a number of dimensions, and relying on the accuracy
of recommendations alone may not be enough to find the
most relevant items for each user [24], [32]. In particular, the
importance of diverse recommendations has been previously
emphasized in several studies [8], [10], [14], [33], [46], [54],
[57].
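The standard top-N formulation described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation; the function name, the item identifiers, and the predicted rating values are all hypothetical.

```python
from typing import Dict, List


def top_n_by_predicted_rating(predicted: Dict[str, float], n: int) -> List[str]:
    """Return the n items with the highest predicted ratings.

    Ties are broken by item id so the result is deterministic.
    """
    ranked = sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Hypothetical predicted ratings for items one user has not yet consumed.
predictions = {"A": 4.8, "B": 3.1, "C": 4.5, "D": 4.9}
top2 = top_n_by_predicted_rating(predictions, 2)
```

Because every user's list is built from the highest predicted ratings only, nothing in this procedure encourages different users to receive different items.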
RELATED WORK
Recommendation Techniques for Rating Prediction
Recommender systems are usually classified into three
categories based on their approach to recommendation:
content-based, collaborative, and hybrid approaches [1], [3].
Content-based recommender systems recommend items
similar to the ones the user preferred in the past.
Collaborative filtering recommender systems recommend
items that users with similar preferences (i.e., “neighbors”)
have liked in the past. Finally, hybrid approaches can
combine content-based and collaborative methods in
several different ways. Recommender systems can also be
classified based on the nature of their algorithmic technique
into heuristic (or memory-based) and model-based approaches
[1], [9]. Heuristic techniques typically calculate
recommendations based directly on the previous user
activities (e.g., transactional data or rating values). One of
the commonly used heuristic techniques is a neighborhood-based
approach that finds nearest neighbors that have
tastes similar to those of the target user [9], [13], [34], [36],
[40]. In contrast, model-based techniques use previous user
activities to first learn a predictive model, typically using
some statistical or machine-learning methods, which is then
used to make recommendations. Examples of such techniques
include Bayesian clustering, aspect model, flexible
mixture model, matrix factorization, and other methods [4],
[5], [9], [25], [44], [48].
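As a rough illustration of the neighborhood-based (heuristic) approach mentioned above, the following sketch predicts a rating as a similarity-weighted average of the ratings given by the target user's nearest neighbors. The choice of cosine similarity over co-rated items, the neighborhood size k, and the sample data are assumptions made for this example; the cited works use a variety of similarity and aggregation functions.

```python
import math
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings: Ratings, user: str, item: str, k: int = 2) -> Optional[float]:
    """Similarity-weighted average over the k most similar users who rated the item."""
    candidates = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    total_similarity = sum(sim for sim, _ in neighbors)
    if total_similarity <= 0:
        return None  # no usable neighbors, so no prediction
    return sum(sim * rating for sim, rating in neighbors) / total_similarity


# Hypothetical rating data: u1 has not rated item C.
ratings = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 5, "B": 3, "C": 4},
    "u3": {"A": 1, "B": 5, "C": 2},
}
prediction = predict_rating(ratings, "u1", "C", k=2)
```

Here u2 agrees with u1 on every co-rated item, so the prediction for item C is pulled toward u2's rating rather than u3's.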
Diversity of Recommendations
As mentioned in Section 1, the diversity of recommendations
can be measured in two ways: individual and aggregate.
Most recent studies have focused on increasing
individual diversity, which can be calculated from each user’s
recommendation list (e.g., an average dissimilarity between
all pairs of items recommended to a given user) [8], [33],
[46], [54], [57]. These techniques aim to avoid giving the same
user recommendations that are too similar. For example,
some studies [8], [46], [57] used an intralist similarity metric
to determine the individual diversity. Alternatively, Zhang
and Hurley [54] used a new evaluation metric, item novelty,
to measure the amount of additional diversity that one item
brings to a list of recommendations. Moreover, the loss of
accuracy, resulting from the increase in diversity, is
controlled by changing the granularity of the underlying
similarity metrics in the diversity-conscious algorithms [33].
On the other hand, except for some work that examined
sales diversity across all users of the system by measuring a
statistical dispersion of sales [10], [14], there have been
few studies that explore aggregate diversity in recommender
systems, despite the potential importance of diverse
recommendations from both user and business perspectives,
as discussed in Section 1.
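The distinction between the two notions of diversity can be made concrete with a small sketch. Individual diversity is computed here as the average pairwise dissimilarity within one user's list, as described above. For aggregate diversity, this example assumes a simple count of distinct items recommended across all users; that particular measure, the Jaccard distance over genre sets, and the sample data are illustrative assumptions and are not defined in the text above.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Dissimilarity between two feature sets (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def individual_diversity(rec_list: List[str], features: Dict[str, Set[str]]) -> float:
    """Average dissimilarity over all pairs of items in one user's list."""
    pairs = list(combinations(rec_list, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(features[i], features[j]) for i, j in pairs) / len(pairs)


def aggregate_diversity(rec_lists: Dict[str, List[str]]) -> int:
    """Number of distinct items recommended across all users (assumed measure)."""
    return len(set().union(*rec_lists.values()))


# Hypothetical item features (genres) and recommendation lists.
features = {"A": {"action"}, "B": {"action", "comedy"}, "C": {"drama"}}
recs = {"u1": ["A", "B"], "u2": ["A", "C"], "u3": ["A", "B"]}

per_user = {user: individual_diversity(items, features) for user, items in recs.items()}
across_users = aggregate_diversity(recs)
```

The two quantities are independent: every user could receive an internally varied list while the system as a whole recommends the same few items to everyone.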
EMPIRICAL RESULTS
Data
The proposed recommendation ranking approaches were
tested with several movie rating data sets, including
MovieLens (data file available at grouplens.org), Netflix
(data file available at netflixprize.com), and Yahoo! Movies
(individual ratings collected from movie pages at
movies.yahoo.com). We preprocessed each data set to include users
and movies with significant rating history, which makes it
possible to have a sufficient number of highly predicted items
for recommendations to each user (in the test data). The
basic statistical information of the resulting data sets is
summarized in Table 2. For each data set, we randomly
chose 60 percent of the ratings as training data and used
them to predict the remaining 40 percent (i.e., test data).
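The random 60/40 split described above might look like the following sketch. The tuple layout (user, movie, rating), the fixed seed, and the synthetic ratings are assumptions made so that the example is reproducible; the actual preprocessing of the data sets is not shown in the text.

```python
import random
from typing import List, Sequence, Tuple

Rating = Tuple[str, str, int]  # (user, movie, rating)


def split_ratings(
    ratings: Sequence[Rating], train_fraction: float = 0.6, seed: int = 0
) -> Tuple[List[Rating], List[Rating]]:
    """Randomly partition ratings into training and test sets."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Synthetic ratings: 10 users x 10 movies, values in 1..5.
ratings = [
    (f"user{u}", f"movie{m}", (u + m) % 5 + 1)
    for u in range(10)
    for m in range(10)
]
train, test = split_ratings(ratings)
```

The rating prediction algorithm is then fitted on `train` only, and its predictions are compared against the held-out ratings in `test`.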
CONCLUSIONS AND FUTURE WORK
Recommender systems have made significant progress in
recent years and many techniques have been proposed to
improve the recommendation quality. However, in most
cases, new techniques are designed to improve the accuracy
of recommendations, whereas the recommendation diversity
has often been overlooked. In particular, we showed
that, while ranking recommendations according to the
predicted rating values (which is a de facto ranking standard
in recommender systems) provides good predictive accuracy,
it tends to perform poorly with respect to recommendation
diversity. Therefore, in this paper, we proposed a
number of recommendation ranking techniques that can
provide significant improvements in recommendation
diversity with only a small amount of accuracy loss. In
addition, these ranking techniques offer flexibility to system
designers, since they are parameterizable and can be used
in conjunction with different rating prediction algorithms
(i.e., they do not require the designer to use only some
specific algorithm).
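The text above does not spell out the ranking techniques themselves, so the following is only one possible sketch of a parameterizable re-ranking step that works on top of any rating predictor. It assumes a hypothetical parameter, ranking_threshold: items predicted at or above it are ordered by an alternative criterion (here, least popular first), and the remaining items follow in the standard order of predicted rating. The popularity counts and all names are invented for illustration.

```python
from typing import Dict, List


def rerank(
    predicted: Dict[str, float],
    popularity: Dict[str, int],
    n: int,
    ranking_threshold: float,
) -> List[str]:
    """Top-n list with items above the threshold re-ranked by ascending popularity."""
    above = [i for i, r in predicted.items() if r >= ranking_threshold]
    below = [i for i, r in predicted.items() if r < ranking_threshold]
    # Alternative criterion: least popular first; ties fall back to predicted rating.
    above.sort(key=lambda i: (popularity.get(i, 0), -predicted[i], i))
    # Standard criterion: highest predicted rating first.
    below.sort(key=lambda i: (-predicted[i], i))
    return (above + below)[:n]


# Hypothetical predictions for one user and hypothetical rating counts per item.
predicted = {"A": 4.9, "B": 4.6, "C": 4.2, "D": 3.0}
popularity = {"A": 900, "B": 40, "C": 5, "D": 1}

standard = rerank(predicted, popularity, n=2, ranking_threshold=5.1)
moderate = rerank(predicted, popularity, n=2, ranking_threshold=4.5)
aggressive = rerank(predicted, popularity, n=2, ranking_threshold=4.0)
```

Raising the threshold above the maximum rating reproduces the standard ranking, while lowering it admits less popular items with slightly lower predicted ratings, which is one way to trade a small amount of accuracy for diversity.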