10-11-2012, 02:36 PM
The Design and Assessment of Web Data Analysis System
INTRODUCTION
The Web has become the major platform for e-commerce, which
requires internet companies to provide personalized services
by tracking and analyzing users’ visit patterns. A web data
analysis system (WDAS) is a necessary tool for supporting the
web data mining and knowledge discovery (DM & KD) process [1].
When conducting web data analysis (WDA), top priority should
be given to establishing a sound WDAS. In the past, research
on DM & KD focused mainly on the algorithm for a specific
mining task, without analyzing how the system as a whole
should be built. A WDAS is an organic system whose parts are
closely connected to one another. A given algorithm usually
serves a single data mining module. Without careful analysis
of the system structure, redundant work among the various
algorithms is inevitable. Only when the algorithms are
closely connected with the other modules can they play their
full role [2].
THE FUNDAMENTAL GOALS OF WDAS DESIGN
A. Capable of analyzing a large amount of data
In the web environment, there is always a large amount of
data with a high degree of complexity. The WDAS must be
capable of handling huge volumes of data and of finding the
information that matters most to its users, so as to
transform raw data into valuable knowledge. A major feature
of a WDAS is therefore the ability to extract, filter,
transform and integrate large amounts of data in order to
discover knowledge from them.
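
The four steps named above (extract, filter, transform, integrate) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the "user|url|status" record layout, the sample records and all function names are hypothetical and do not come from the paper.

```python
from collections import Counter

# Hypothetical raw records; the "user|url|status" layout is an assumption.
RAW = [
    "u1|/home|200",
    "u2|/cart|200",
    "bad line",
    "u1|/cart|404",
    "u1|/cart|200",
]

def extract(lines):
    """Extract: split each raw line into fields, skipping malformed lines."""
    for line in lines:
        parts = line.split("|")
        if len(parts) == 3:
            yield parts

def keep_successful(records):
    """Filter: keep only requests that succeeded (status 200)."""
    return (r for r in records if r[2] == "200")

def transform(records):
    """Transform: turn field lists into (user, url) pairs."""
    return ((user, url) for user, url, _ in records)

def integrate(pairs):
    """Integrate: aggregate the pairs into per-URL visit counts."""
    return Counter(url for _, url in pairs)

def run_pipeline(lines):
    """Chain the four stages; each stage consumes the previous one lazily."""
    return integrate(transform(keep_successful(extract(lines))))

visits = run_pipeline(RAW)
```

Because each stage is a generator, records flow through one at a time, which is one simple way to keep memory use flat as the input grows.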
B. Capable of analyzing many types of data
Web data come in many types, including structured data,
semi-structured data and unstructured data. Therefore, a
WDAS must be capable of handling many types of data and must
be compatible with various systems.
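
One possible way to handle several data types behind a single entry point is to dispatch on the kind of input and normalize everything to a list of records. The sketch below assumes CSV as the structured case, JSON as the semi-structured case and free text as the unstructured case; these choices and the names `PARSERS` and `load` are illustrative assumptions, not part of the paper.

```python
import csv
import io
import json

def parse_structured(text):
    """Structured data: CSV rows with a header line."""
    return list(csv.DictReader(io.StringIO(text)))

def parse_semistructured(text):
    """Semi-structured data: a JSON object whose fields may vary."""
    return [json.loads(text)]

def parse_unstructured(text):
    """Unstructured data: free text, kept whole under a single key."""
    return [{"text": text.strip()}]

# Hypothetical registry mapping an input kind to its parser.
PARSERS = {
    "csv": parse_structured,
    "json": parse_semistructured,
    "txt": parse_unstructured,
}

def load(kind, text):
    """Dispatch to the parser for this kind and return a list of dicts."""
    return PARSERS[kind](text)

rows = load("csv", "user,url\nu1,/home\n")
doc = load("json", '{"user": "u2", "tags": ["a", "b"]}')
note = load("txt", "  great site  ")
```

Supporting a further type then means registering one more parser, while the downstream mining modules keep seeing the same record shape.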
C. Users are able to participate in the mining process
During the mining process, the users’ participation supplies
the relevant domain knowledge. Therefore, the system should
provide a friendly and interactive interface.
THE BASIC FUNCTIONS OF WDAS
A WDAS can make full use of valid data to provide a range of
statistical analyses and auxiliary support that increase
work efficiency.
A. It helps to find specific patterns in web information
content so as to form valuable knowledge
The KD of web information content should focus on content
mining and on analyzing the distribution and change of
subjects and the interconnections among the content of
websites with different themes. Multimedia data mining,
which targets different kinds of internet data, such as text
data, audio data, video data, graphic data and so on, can be
divided into two types: mining based on text and mining based
on multimedia. Text mining can summarize, classify, combine
and analyze a large number of documents. Multimedia mining
mainly extracts and mines multimedia documents in order to
discover potential knowledge.
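
As a minimal illustration of the classification side of text mining, the sketch below counts terms and assigns a document to the topic whose keywords occur most often. The stop-word list, the topic keywords and the function names are hypothetical; a keyword count stands in here for whatever classifier a real WDAS would use.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}

def term_counts(text):
    """Count lowercase alphabetic terms, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def classify(text, topics):
    """Return the topic whose keywords occur most often in the text."""
    counts = term_counts(text)
    scores = {
        name: sum(counts[k] for k in keywords)
        for name, keywords in topics.items()
    }
    return max(scores, key=scores.get)

# Hypothetical topics and keywords.
TOPICS = {
    "sports": {"match", "team", "goal"},
    "finance": {"stock", "market", "price"},
}

label = classify("The team scored a late goal in the match", TOPICS)
```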
B. It helps to find the formation and characteristics of web
information, as well as the patterns of its change, so as to
improve resource allocation
C. It helps to discover the features and rules of web users’
behavior so as to improve the level of web service
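
One common starting point for discovering rules of user behavior is to group clicks into sessions and count page-to-page transitions. The sketch below assumes a click log of (user, timestamp, page) tuples and a 30-minute idle gap between sessions; the log, the gap and the function names are illustrative assumptions rather than the paper's method.

```python
from collections import Counter

# Hypothetical click log: (user, timestamp in seconds, page).
CLICKS = [
    ("u1", 0, "/home"), ("u1", 30, "/search"), ("u1", 60, "/item"),
    ("u2", 10, "/home"), ("u2", 50, "/search"),
    ("u1", 4000, "/home"), ("u1", 4020, "/cart"),
]

def sessions(clicks, gap=1800):
    """Group each user's clicks into sessions split by idle gaps > gap seconds."""
    result = []
    last = {}  # user -> (last timestamp, that user's current session)
    for user, ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        prev = last.get(user)
        if prev is None or ts - prev[0] > gap:
            current = []          # start a new session
            result.append(current)
        else:
            current = prev[1]     # continue the open session
        current.append(page)
        last[user] = (ts, current)
    return result

def transition_counts(all_sessions):
    """Count how often one page is followed directly by another."""
    counts = Counter()
    for pages in all_sessions:
        counts.update(zip(pages, pages[1:]))
    return counts

found = sessions(CLICKS)
transitions = transition_counts(found)
```

Frequent transitions such as home to search hint at which paths the service should make shorter or faster.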
Ability to solve complicated problems
Growth in data volume and high demands on the precision of
the model both increase the complexity of the problem. A DM
system can provide the following ways to solve complicated
problems. Multiple algorithms can generate multiple
patterns; in particular, patterns related to classification
can be produced by different algorithms according to
different demands and environments. When a verification
method is used to assess the patterns, there are usually many
possible ways to check them. Data selection and
transformation: patterns are often concealed by a large
number of data items, some of which are redundant and some
totally unrelated, yet these data influence the discovery of
valuable patterns. A major function of a DM system is to
handle the complexity of the data, choose the correct data
items and transform the data.
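
Data selection and transformation can be sketched very simply: drop columns that are constant or that duplicate another column, then rescale the columns that remain. Treating a constant column as uninformative is a crude stand-in for "unrelated" data, and the sample rows and function names are hypothetical.

```python
def select_columns(rows):
    """Drop columns that are constant or that duplicate an earlier column."""
    kept, seen = [], set()
    for name in rows[0]:
        values = tuple(row[name] for row in rows)
        if len(set(values)) <= 1:   # constant: carries no information
            continue
        if values in seen:          # redundant: identical to a kept column
            continue
        seen.add(values)
        kept.append(name)
    return kept

def min_max_scale(values):
    """Rescale numbers to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical data items: one duplicated column and one constant column.
ROWS = [
    {"visits": 2, "visits_copy": 2, "site": "x", "minutes": 10},
    {"visits": 4, "visits_copy": 4, "site": "x", "minutes": 30},
    {"visits": 6, "visits_copy": 6, "site": "x", "minutes": 20},
]

kept = select_columns(ROWS)
scaled = min_max_scale([row["visits"] for row in ROWS])
```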
The degree of scalability
To handle large amounts of data efficiently, a DM system
should be scalable. We need to know whether the DM system
can make full use of the hardware resources; whether it
supports parallel execution and parallel computers; whether
the computing scale grows with the number of processors;
whether it supports the parallel restore of the data; whether
an algorithm written for a single processor runs faster on a
parallel computer; and whether the algorithm needs to be
rewritten for the parallel computer in order to exploit its
advantages fully.
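
Whether computing scale grows with the number of processors can be checked with the usual speedup and efficiency measures: speedup is the single-processor time divided by the parallel time, and efficiency is speedup divided by the processor count. The timings below are made-up numbers for illustration only, not measurements from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup: single-processor time divided by parallel time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, processors):
    """Efficiency: speedup per processor (1.0 means ideal scaling)."""
    return speedup(t_serial, t_parallel) / processors

# Hypothetical wall-clock times in seconds, keyed by processor count.
TIMINGS = {1: 120.0, 2: 64.0, 4: 40.0, 8: 30.0}

# Map each processor count to its (speedup, efficiency) pair.
report = {
    p: (speedup(TIMINGS[1], t), efficiency(TIMINGS[1], t, p))
    for p, t in TIMINGS.items()
}
```

In this made-up series, efficiency falls as processors are added, which is the kind of signal that suggests an algorithm written for a single processor may need to be redesigned for the parallel machine.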