29-06-2012, 02:19 PM
Web Usage Mining: A Research Area in Web Mining
Abstract
Web usage mining is a major research area in Web mining
that focuses on learning about Web users and their
interactions with Web sites. Its goal is to discover
users' access patterns automatically and quickly from
vast Web log data, for example frequent access paths,
frequently accessed page groups and user clusters. Through
Web usage mining, the server logs, registration
information and other related information left by user
visits can be mined for access patterns, which
provide a foundation for organizational decision making.
This article surveys and analyzes current
Web usage mining systems and technologies. It
also discusses an application of Web usage mining (WUM):
SUGGEST, an online recommender system that dynamically
generates links to pages that a user has not yet visited
and that might be of interest to that user. Unlike the
recommender systems proposed so far, SUGGEST does
not use any off-line component and is able to
manage Web sites made up of dynamically generated
pages.
Introduction
Web mining is the extraction of interesting and
potentially useful patterns and implicit information from
artifacts or activity related to the World Wide Web. Web
usage mining supports Web site design, personalization
services and other business decision making. To serve
users better, Web mining applies data mining, artificial
intelligence, charting and related techniques to Web
data, traces users' visiting characteristics and then
extracts their usage patterns [1]. Web mining has quickly
become one of the most important areas in Computer and
Information Sciences because of its direct applications in
e-commerce, CRM, Web analytics, information retrieval
and filtering, and Web information systems.
Approach of Web usage mining
Web usage mining generally consists of the following
steps: data collection, data preprocessing,
knowledge discovery and pattern analysis.
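The middle steps listed above, pretreatment (preprocessing) and knowledge discovery, can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed systems. The record layout `(user_id, timestamp, url)`, the 30-minute session timeout, the `min_support` threshold and the function names `build_sessions` and `frequent_paths` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records: (user_id, unix_timestamp, url), already collected.
SESSION_TIMEOUT = 30 * 60  # assumed inactivity gap, in seconds


def build_sessions(records, timeout=SESSION_TIMEOUT):
    """Preprocessing: group each user's requests into sessions."""
    by_user = defaultdict(list)
    for user, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, url))
    sessions = []
    for hits in by_user.values():
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # A long pause closes the current session.
                sessions.append(current)
                current = []
            current.append(url)
        sessions.append(current)
    return sessions


def frequent_paths(sessions, min_support=2):
    """Knowledge discovery: count consecutive page pairs across sessions."""
    counts = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))
    return {path: n for path, n in counts.items() if n >= min_support}


records = [
    ("u1", 0, "/home"), ("u1", 60, "/products"), ("u1", 120, "/cart"),
    ("u2", 10, "/home"), ("u2", 50, "/products"),
    ("u1", 10_000, "/home"),  # new session for u1 after a long gap
]
sessions = build_sessions(records)
paths = frequent_paths(sessions)
```

Pattern analysis would then filter and interpret the resulting paths. Here only pairs seen in at least two sessions survive.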
A] Data collection:
Data collection is the first step of Web usage mining. The
authenticity and integrity of the data directly affect
how smoothly the later steps proceed and the quality of
the final personalized recommendations.
Sound and reliable techniques must therefore be used
to gather the various data. At present, Web usage mining
draws on three main data sources: server data, client
data and intermediary data (proxy server data and packet
sniffing).
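As an illustration of collecting server data, the following Python sketch parses one line of a Web server access log. It assumes the log uses the Common Log Format. The source does not say which format the surveyed systems use, and the names `CLF_PATTERN` and `parse_log_line` are hypothetical.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" size means no bytes were returned.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    parts = entry["request"].split()
    if len(parts) >= 2:
        entry["method"], entry["url"] = parts[0], parts[1]
    else:
        entry["method"], entry["url"] = None, None
    return entry


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
entry = parse_log_line(sample)
```

Lines that do not match the pattern return `None`, so malformed entries can be dropped during the later cleaning step rather than raising an error here.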
Online Web Personalization system
The main limitation of traditional personalization systems
is the loosely coupled integration of the Web
personalization system with the Web server's ordinary
activity. SUGGEST is completely online and incremental,
and it aims to provide users with information
about pages they may find interesting. It bases
personalization on a user classification that evolves
according to the user's requests.
Usage information is represented by an
undirected graph whose nodes are associated with the
identifiers of the accessed pages, and each edge is
associated with a measure of the correlation
between its nodes (pages). This graph is modified
incrementally to keep the user model up to date. In the
model, the "interest" in a page does not depend on its
contents but on the order in which pages are visited
during a session. Therefore, to weight each edge of the
graph we introduced a novel formula:
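The formula itself is not included in this excerpt, so the Python sketch below does not reproduce SUGGEST's weighting. It only illustrates the general idea of an undirected page graph that is updated incrementally and then used to suggest unvisited pages. The edge weight is a plain co-occurrence count used as a stand-in, which ignores visit order, and the class and method names (`UsageGraph`, `add_session`, `suggest`) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


class UsageGraph:
    """Undirected page graph updated incrementally, one session at a time."""

    def __init__(self):
        # weights[frozenset({a, b})] -> stand-in edge weight (co-occurrence
        # count); this is an assumption, not the formula from the paper.
        self.weights = defaultdict(int)

    def add_session(self, pages):
        """Strengthen the edge between every pair of distinct pages in a session."""
        for a, b in combinations(sorted(set(pages)), 2):
            self.weights[frozenset((a, b))] += 1

    def suggest(self, visited, top_n=3):
        """Rank unvisited pages by total edge weight to the pages already visited."""
        visited = set(visited)
        scores = defaultdict(int)
        for edge, weight in self.weights.items():
            inside = edge & visited
            outside = edge - visited
            # Keep only edges that link a visited page to an unvisited one.
            if len(inside) == 1 and len(outside) == 1:
                (candidate,) = outside
                scores[candidate] += weight
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [page for page, _ in ranked[:top_n]]


graph = UsageGraph()
graph.add_session(["/home", "/products", "/cart"])
graph.add_session(["/home", "/products"])
graph.add_session(["/home", "/about"])
suggestions = graph.suggest(["/home"])
```

Because `add_session` only increments counters, the model can be updated as each session arrives, with no separate off-line training pass.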
Conclusion
Web usage mining is a form of data mining applied to
server logs. It plays an important role in
improving the usability of Web site design,
customer relations and system performance. Web
usage mining supports Web site design, personalization
services and other business decision making.
This paper discussed SUGGEST, an
online recommender system based on an
incremental procedure that automatically updates
the knowledge base
obtained from historical usage data and generates a list
of links to pages (suggestions) of potential interest to
the user.