11-08-2012, 10:40 AM
A Personalized Ontology Model for Web Information Gathering
Personalized Ontology.pdf (Size: 1.51 MB / Downloads: 54)
Abstract
As a model for knowledge description and formalization, ontologies are widely used to represent user profiles in
personalized web information gathering. However, when representing user profiles, many models have utilized only knowledge from
either a global knowledge base or user local information. In this paper, a personalized ontology model is proposed for knowledge
representation and reasoning over user profiles. This model learns ontological user profiles from both a world knowledge base and
user local instance repositories. The ontology model is evaluated by comparing it against benchmark models in web information
gathering. The results show that this ontology model is successful.
INTRODUCTION
Over the last decades, the amount of web-based information
available has increased dramatically. How to
gather useful information from the web has become a
challenging issue for users. Current web information
gathering systems attempt to satisfy user requirements by
capturing their information needs. For this purpose, user
profiles are created for user background knowledge
description [12], [22], [23].
User profiles represent the concept models possessed by
users when gathering web information. A concept model is
implicitly possessed by users and is generated from their
background knowledge. While this concept model cannot
be proven in laboratories, many web ontologists have
observed it in user behavior [23]. When users read through
a document, they can easily determine whether or not it is
of interest or relevance to them, a judgment that arises
from their implicit concept models. If a user’s concept
model can be simulated, then a superior representation of
user profiles can be built.
RELATED WORK
Ontology Learning
Global knowledge bases were used by many existing
models to learn ontologies for web information gathering.
For example, Gauch et al. [12] and Sieg et al. [35] learned
personalized ontologies from the Open Directory Project to
specify users’ preferences and interests in web search. On
the basis of the Dewey Decimal Classification, King et al.
[18] developed IntelliOnto to improve performance in
distributed web information retrieval. Wikipedia was used
by Downey et al. [10] to help understand underlying user
interests in queries. These works effectively discovered user
background knowledge; however, their performance was
limited by the quality of the global knowledge bases.
Aiming at learning personalized ontologies, many works
mined user background knowledge from user local information.
Li and Zhong [23] used pattern recognition and
association rule mining techniques to discover knowledge
from user local documents for ontology construction. Tran
et al. [42] translated keyword queries to Description Logics’
conjunctive queries and used ontologies to represent user
background knowledge. Zhong [47] proposed a domain
ontology learning approach that employed various data
mining and natural-language understanding techniques.
Navigli et al. [28] developed OntoLearn to discover semantic
concepts and relations from web documents.
User Profiles
User profiles were used in web information gathering to
interpret the semantic meanings of queries and capture user
information needs [12], [14], [23], [41], [48]. User profiles
were defined by Li and Zhong [23] as the interesting topics
of a user’s information need. They also categorized user
profiles into two diagrams: the data diagram user profiles
acquired by analyzing a database or a set of transactions
[12], [23], [25], [35], [37]; the information diagram user
profiles acquired by using manual techniques, such as
questionnaires and interviews [25], [41] or automatic
techniques, such as information retrieval and machine
learning [30]. Van der Sluijs and Houben [43] proposed a
method called the Generic User Model Component to
improve the quality and utilization of user modeling.
Wikipedia was also used by [10], [27] to help discover user
interests. In order to acquire a user profile, Chirita et al. [6]
and Teevan et al. [40] used a collection of user desktop text
documents and emails, and cached web pages to explore
user interests. Makris et al. [24] acquired user profiles by a
ranked local set of categories, and then utilized web pages
to personalize search results for a user. These works
attempted to acquire user profiles in order to discover user
background knowledge.
PERSONALIZED ONTOLOGY CONSTRUCTION
Personalized ontologies are a conceptualization model that
formally describes and specifies user background knowledge.
From observations in daily life, we found that web
users might have different expectations for the same search
query. For example, for the topic “New York,” business
travelers may demand different information from leisure
travelers. Sometimes even the same user may have different
expectations for the same search query if applied in a
different situation. A user may become a business traveler
when planning for a business trip, or a leisure traveler when
planning for a family holiday. Based on this observation, an
assumption is formed that web users have a personal
concept model for their information needs. A user’s concept
model may change according to different information needs.
In this section, a model that constructs personalized ontologies
for web users’ concept models is introduced.
World Knowledge Representation
World knowledge is important for information gathering.
According to the definition provided by [46], world
knowledge is commonsense knowledge possessed by
people and acquired through their experience and education.
Also, as pointed out by Nirenburg and Raskin [29],
“world knowledge is necessary for lexical and referential
disambiguation, including establishing coreference relations
and resolving ellipsis as well as for establishing and
maintaining connectivity of the discourse and adherence of
the text to the text producer’s goal and plans.” In this
proposed model, user background knowledge is extracted
from a world knowledge base (WKB) encoded from the Library of
Congress Subject Headings (LCSH).
Ontology Construction
The subjects of user interest are extracted from the WKB via
user interaction. A tool called Ontology Learning Environment
(OLE) is developed to assist users with such interaction.
Regarding a topic, the interesting subjects consist of
two sets: positive subjects are the concepts relevant to the
information need, and negative subjects are the concepts
resolving paradoxical or ambiguous interpretations of the
information need. Thus, for a given topic, the OLE provides
users with a set of candidates to identify positive and
negative subjects. These candidate subjects are extracted
from the WKB.
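
The selection step described above can be illustrated with a minimal Python sketch. It is not the OLE tool itself: the toy knowledge base, the subject labels, the substring matching used to find candidates, and the names TOY_WKB, TopicSubjects, candidate_subjects, and label_subjects are all assumptions made for illustration. The sketch only shows the idea of drawing candidate subjects from a WKB and splitting them into positive and negative sets according to a user's judgments.

```python
from dataclasses import dataclass, field

# Toy stand-in for a world knowledge base (WKB). The labels are illustrative
# only and are not quoted from the Library of Congress Subject Headings.
TOY_WKB = [
    "New York (N.Y.) -- Description and travel",
    "New York (N.Y.) -- Economic conditions",
    "New York (State) -- History",
    "Business travel",
    "Family vacations",
]


@dataclass
class TopicSubjects:
    """Positive and negative subjects a user identified for one topic."""
    topic: str
    positive: set = field(default_factory=set)
    negative: set = field(default_factory=set)


def candidate_subjects(topic, wkb):
    """Return WKB subjects whose label contains the topic (case-insensitive).

    Substring matching is an assumption of this sketch, not the paper's method.
    """
    needle = topic.lower()
    return [subject for subject in wkb if needle in subject.lower()]


def label_subjects(topic, candidates, judgments):
    """Split candidates into positive and negative sets.

    `judgments` maps a subject label to "positive" or "negative";
    candidates the user did not judge are left out of both sets.
    """
    result = TopicSubjects(topic)
    for subject in candidates:
        verdict = judgments.get(subject)
        if verdict == "positive":
            result.positive.add(subject)
        elif verdict == "negative":
            result.negative.add(subject)
    return result


# A hypothetical business traveler judging candidates for "New York".
candidates = candidate_subjects("New York", TOY_WKB)
business = label_subjects(
    "New York",
    candidates,
    {
        "New York (N.Y.) -- Economic conditions": "positive",
        "New York (State) -- History": "negative",
    },
)
```

A leisure traveler could pass different judgments for the same topic and obtain different sets, which mirrors the point that one query can carry different expectations for different users.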