03-09-2014, 09:42 AM
Adapting a Ranking Model for Domain-Specific Search
Abstract-
An adaptation process is described that adapts a ranking
model constructed for a broad-based search engine for use as a
domain-specific ranking model. It is difficult to apply the
broad-based ranking model directly to different domains because
of domain differences, while building a unique ranking model for
each domain is time-consuming. In this paper, we address these
difficulties by proposing an algorithm called ranking
adaptation SVM (RA-SVM). Our algorithm requires only the
predictions of the existing ranking models, rather than their
internal representations or the data from auxiliary domains. The
ranking model is adapted for use in a search environment
focusing on a specific segment of online content, for example a
specific topic, media type, or genre of content. A domain-specific
ranking model restricts search results to data from a specific
domain that are relevant to the search terms entered by
the user. The ranking order may be determined with reference to
a numerical score, an ordinal score, or a binary judgment
such as “relevant” or “irrelevant”.
INTRODUCTION
Learning to rank is a family of learning-based information
retrieval techniques that specialize in learning a ranking model
from documents labeled with their relevance to some
queries, in the hope that the model can automatically rank the
documents returned for an arbitrary new query.
Based on various machine learning methods, learning
to rank algorithms have already shown promising
performance in information retrieval, especially Web search.
However, with the emergence of domain-specific search engines,
more attention has moved from broad-based search to
specific verticals, for hunting information constrained to a
certain domain. Different vertical search engines deal with
different topicalities, document types or domain-specific
features. For example, a medical search engine should clearly be
specialized in terms of its topical focus, whereas a music, image
or video search engine would concern only documents in
particular formats.
Ranking Adaptation SVM
It can be assumed that, if the auxiliary domain and the target
domain are related, their respective ranking functions f^a and f
should have similar shapes in the function space R^s → R. Under
such an assumption, f^a provides prior knowledge about
the distribution of f in its parameter space. Conventional
regularization frameworks, such as Lp-norm regularization,
manifold regularization designed for SVM, regularized neural
networks and so on, show that the solution of an ill-posed
problem can be approximated from a variational principle that
contains both the data and the prior assumption. Consequently,
we can adapt the regularization framework to use f^a
as the prior information, so that the ill-posed problem in the
target domain, where only a few query-document pairs are labeled,
can be solved elegantly. By modeling our assumption in the
regularization term, the learning problem of Ranking
Adaptation SVM (RA-SVM) can be formulated as:
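The formula itself is not reproduced in this post. A form consistent with the description above, with the regularizer pulling f toward f^a and hinge-style slack on the labeled pairs, might look like the following. The notation is an assumption: x_ij is the feature vector of the j-th document for query i, y_ij is its relevance label, δ in [0, 1] trades off the plain norm regularizer against closeness to f^a, C is the penalty on the slack variables ξ_ijk.

```latex
\begin{aligned}
\min_{f,\;\xi_{ijk}} \quad
  & \frac{1-\delta}{2}\,\lVert f \rVert^{2}
    + \frac{\delta}{2}\,\lVert f - f^{a} \rVert^{2}
    + C \sum_{i,j,k} \xi_{ijk} \\
\text{s.t.} \quad
  & f(x_{ij}) - f(x_{ik}) \ge 1 - \xi_{ijk}, \qquad \xi_{ijk} \ge 0, \\
  & \text{for all } i, j, k \text{ with } y_{ij} > y_{ik}.
\end{aligned}
```

With δ = 0 this reduces to an ordinary Ranking SVM trained on the target data alone; larger δ keeps the adapted function closer to the auxiliary one.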
RELATED WORK
We present some works that closely relate to the concept of
ranking model adaptation here. To create a ranking model that
can rank the documents according to their relevance, many machine
learning techniques have been proposed. Some of them transform the
ranking problem into a pairwise classification problem, which
takes a pair of documents as a sample, with the binary label taken
as the sign of the relevance difference between the two
documents, e.g. Ranking SVM, RankBoost and RankNet.
Other methods, including
ListNet, AdaRank, PermuRank and LambdaRank, focus on
the structure of the ranking list and the direct optimization of
objective evaluation measures such as Mean Average Precision
(MAP) and Normalized Discounted Cumulative Gain (NDCG).
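The pairwise transformation mentioned above can be sketched as follows. This is a minimal illustration rather than code from the paper: the function name `pairwise_samples` and the toy data are hypothetical, and documents are assumed to be plain lists of floats belonging to a single query.

```python
from itertools import combinations


def pairwise_samples(features, relevance):
    """Turn one query's documents into pairwise classification samples.

    features  : list of feature vectors (lists of floats), one per document
    relevance : list of relevance labels, aligned with `features`
    Returns a list of (difference_vector, label) with label in {+1, -1}.
    """
    samples = []
    for i, j in combinations(range(len(features)), 2):
        diff = relevance[i] - relevance[j]
        if diff == 0:
            continue  # equally relevant documents express no preference
        x = [a - b for a, b in zip(features[i], features[j])]
        # The label is the sign of the relevance difference.
        samples.append((x, 1 if diff > 0 else -1))
    return samples


# Toy query with three documents (hypothetical data).
docs = [[1.0, 0.2], [0.4, 0.9], [0.1, 0.1]]
labels = [2, 1, 1]
pairs = pairwise_samples(docs, labels)
for x, label in pairs:
    print(x, label)
```

Any binary classifier trained on these difference vectors then induces a ranking; tied pairs are skipped here, which is one common choice among several.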
In this paper, instead of designing a new learning algorithm, we
focus on the adaptation of ranking models across different
domains based on existing learning to rank algorithms.
Many domain adaptation methods have also been proposed to adapt
auxiliary data or classifiers to a new domain. Daume and
Marcu proposed a statistical formulation in terms of a mixture
model to address the domain distribution differences between
training and testing sets. A boosting framework was also
presented for a similar problem. For natural language
processing, Blitzer et al. introduced a structural
correspondence learning method which can mine the
correspondences of features from different domains. For
multimedia applications, Yang et al. proposed the Adaptive SVM
algorithm for the cross-domain video concept detection
problem. However, these works are mainly designed for
CONCLUSION
As various vertical search engines emerge and the number
of verticals increases dramatically, a global ranking model
trained over a dataset sourced from multiple domains
cannot perform well for each specific domain with its
special topicalities, document formats and domain-specific
features. Building one model for each vertical domain is both
laborious in labeling the data and time-consuming in learning
the model.
In this paper, we propose ranking model adaptation, to
adapt well-learned models from broad-based search or
any other auxiliary domain to a new target domain. With model
adaptation, only a small number of samples need to be labeled,
and the computational cost of the training process is greatly
reduced. Based on the regularization framework, the Ranking
Adaptation SVM algorithm is proposed, which performs
adaptation in a black-box way: only the relevance predictions of
the auxiliary ranking models are needed for the adaptation.
Based on RA-SVM, two variations called margin rescaling and slack
rescaling are proposed to utilize the domain-specific features to
further facilitate the adaptation, by assuming that similar
documents should have consistent rankings, and constraining the
margin and loss of RA-SVM adaptively according to their
similarities in the domain-specific feature space. Furthermore,
we propose ranking adaptability, to quantitatively measure
whether an auxiliary model can be adapted to a specific target
domain and how much assistance it can provide.
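The black-box adaptation summarized above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a linear correction term, so the adapted score is f(x) = delta * f_aux(x) + v·x, and it uses plain subgradient descent on the hinge loss instead of a quadratic programming solver. The names `ra_svm_fit` and `ra_svm_score`, the hyperparameters and the toy data are all hypothetical. Only the auxiliary model's predictions (`aux`) are used, never its parameters.

```python
def ra_svm_fit(X, y, aux, delta=0.5, C=1.0, lr=0.05, epochs=500):
    """Learn a linear correction v for one query's labeled documents.

    X     : feature vectors of the labeled target-domain documents
    y     : their relevance labels
    aux   : auxiliary-model predictions on the same documents (black box)
    delta : weight of the auxiliary prediction in the adapted score
    """
    # Preference pairs (j, k): document j should rank above document k.
    prefs = [(j, k) for j in range(len(X)) for k in range(len(X))
             if y[j] > y[k]]
    v = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = list(v)  # gradient of the 0.5 * ||v||^2 regularizer
        for j, k in prefs:
            d = [a - b for a, b in zip(X[j], X[k])]
            margin = delta * (aux[j] - aux[k]) + sum(
                vi * di for vi, di in zip(v, d))
            if margin < 1.0:  # hinge loss is active for this pair
                grad = [g - C * di for g, di in zip(grad, d)]
        v = [vi - lr * g for vi, g in zip(v, grad)]
    return v


def ra_svm_score(x, aux_x, v, delta=0.5):
    """Adapted score: scaled auxiliary prediction plus learned correction."""
    return delta * aux_x + sum(vi * xi for vi, xi in zip(v, x))


# Toy target-domain query (hypothetical data). The auxiliary model
# scores document 1 above document 0, but the target labels disagree.
X = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
y = [2, 1, 0]
aux = [0.2, 0.9, 0.1]

v = ra_svm_fit(X, y, aux, delta=0.5)
scores = [ra_svm_score(x, a, v, delta=0.5) for x, a in zip(X, aux)]
ranking = sorted(range(len(X)), key=lambda i: -scores[i])
print(ranking)
```

In this toy case the few labeled pairs are enough to reverse the auxiliary model's ordering of the first two documents, while the auxiliary scores still contribute to the final ranking through the delta term.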