Web Log Cleaning for Mining of Web Usage Patterns

**seminar flower** · 16-06-2012, 11:24 AM

Abstract

Web usage mining (WUM) is a type of web mining that applies
data mining techniques to extract valuable information from
the navigation behavior of World Wide Web users. The data
should be preprocessed to improve the efficiency and ease of
the mining process, so the preprocessing steps must be defined
before data mining techniques are applied to discover user
access patterns from the web log. The main task of data
preprocessing is to prune noisy and irrelevant data and to
reduce the data volume for the pattern discovery phase.

INTRODUCTION

Web mining refers to the use of data mining techniques
to automatically retrieve, extract and analyze information
for knowledge discovery from Web documents and services.
The expansion of the World Wide Web (WWW) has
produced a large amount of data, most of which is now
freely available for user access. The different types of data
have to be managed and organized in such a way that
different users can access them efficiently. Several data
mining methods are used to discover the hidden information
in the Web, and the application of data mining
techniques to the Web is now the focus of an increasing
number of researchers.

RELATED WORK

R. Cooley et al. (1999) clarified the preprocessing tasks
necessary for Web usage mining. The approach here basically
follows their steps to prepare Web log data for mining [1].
Mohammad Ala’a Al-Hamami et al. described an efficient
web usage mining framework. The key ideas were to
preprocess the web log file and then classify it into a
number of files, each representing one class, using a
decision tree classifier. After mining each of the
classified files and extracting the hidden patterns, they did
not need to analyze the discovered patterns further, because
the patterns were already clear and understandable at the
visualization level [2].

WEB USAGE MINING

Web Usage Mining (WUM) is the application of data
mining techniques to discover usage patterns from Web data.
The general WUM process has three main steps:
data preprocessing, pattern discovery and pattern analysis.
During the preprocessing phase, raw Web logs need to be
cleaned, analyzed and converted before further pattern
mining. The data recorded in server logs, such as the user IP
address, browser, viewing time, etc., can be used to identify
users and sessions. However, because some page views may
be served from the user's browser cache or from a proxy
server without reaching the Web server, the data collected
in server logs are not entirely reliable.
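
As a rough illustration of how logged fields can be used to identify users and sessions, here is a minimal Python sketch. It is not taken from the paper: treating each (IP address, browser) pair as one user and closing a session after 30 minutes of inactivity are assumptions, and the function name `sessionize` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a session ends after 30 minutes of inactivity.
# This is a common heuristic, not a value given in the text.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Group (ip, user_agent, timestamp, url) records into sessions.

    A "user" is approximated by the (ip, user_agent) pair.
    Returns {user_key: [session, ...]}, each session being a list of URLs.
    """
    by_user = defaultdict(list)
    for ip, agent, ts, url in records:
        by_user[(ip, agent)].append((ts, url))

    sessions = {}
    for user, hits in by_user.items():
        hits.sort(key=lambda hit: hit[0])  # order each user's hits by time
        user_sessions = []
        last_ts = None
        for ts, url in hits:
            # Start a new session on the first hit or after a long gap.
            if last_ts is None or ts - last_ts > timeout:
                user_sessions.append([])
            user_sessions[-1].append(url)
            last_ts = ts
        sessions[user] = user_sessions
    return sessions


# Hypothetical sample data: same IP, two different browsers.
records = [
    ("10.0.0.1", "Firefox", datetime(2012, 6, 16, 11, 0), "/index.html"),
    ("10.0.0.1", "Firefox", datetime(2012, 6, 16, 11, 5), "/papers.html"),
    ("10.0.0.1", "Firefox", datetime(2012, 6, 16, 12, 0), "/index.html"),
    ("10.0.0.1", "Chrome", datetime(2012, 6, 16, 11, 1), "/about.html"),
]
result = sessionize(records)
```

Because of caching and proxies, as noted above, sessions rebuilt this way are only an approximation of what the user actually viewed.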

DATA PREPROCESSING

Preprocessing converts the raw data into the data
abstractions necessary for pattern discovery. The purpose of
data preprocessing is to improve data quality and increase
mining accuracy. Preprocessing consists of field extraction
and data cleansing. This phase is probably the most complex and
thankless step of the overall process.
It is described only briefly here: its
main task is to "clean" the raw web log files and insert the
processed data into a relational database, so that
data mining techniques can be applied in the
second phase of the process.
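
The field extraction and database loading step could look roughly like the sketch below. It is not the paper's algorithm: it assumes the log uses the Apache Common/Combined Log Format, uses an in-memory SQLite database as a stand-in for the relational database, and the names `extract_fields`, `load_into_db` and `log_table` are made up for illustration.

```python
import re
import sqlite3

# Assumed input format (Apache Common/Combined Log Format):
# host ident user [time] "method url protocol" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<remote_user>\S+) \[(?P<log_time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

FIELDS = ("ip", "remote_user", "log_time", "method", "url",
          "protocol", "status", "size", "referrer", "agent")


def extract_fields(line):
    """Split one raw log line into named fields, or return None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no body was sent; store it as 0.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def load_into_db(lines, conn):
    """Insert every parsable line into log_table; return the number inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_table ("
        "ip TEXT, remote_user TEXT, log_time TEXT, method TEXT, url TEXT, "
        "protocol TEXT, status INTEGER, size INTEGER, referrer TEXT, agent TEXT)"
    )
    rows = []
    for line in lines:
        record = extract_fields(line)
        if record is not None:
            rows.append(tuple(record[name] for name in FIELDS))
    conn.executemany(
        "INSERT INTO log_table VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Hypothetical sample: one well-formed line and one malformed line.
sample = [
    '10.0.0.1 - - [16/Jun/2012:11:24:00 +0530] '
    '"GET /index.html HTTP/1.1" 200 1043 "-" "Mozilla/5.0"',
    "not a log line",
]
conn = sqlite3.connect(":memory:")
inserted = load_into_db(sample, conn)
```

Once the fields are in a table, later steps can select or filter records with ordinary SQL instead of re-parsing the raw file.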

CONCLUSION

Data preprocessing is an important task in WUM
applications: data must be processed before
data mining techniques are applied to discover user access
patterns from the web log. The data preparation process is often
the most time-consuming step. This paper presents two
algorithms, one for field extraction and one for data cleaning.
Not every access to the content should be taken into
consideration, so the system removes accesses to irrelevant
items and failed requests during data cleaning.
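
A cleaning pass of the kind summarized here might be sketched as follows. This is an illustration, not the paper's algorithm: which file suffixes count as "irrelevant items" and treating only 2xx status codes as successful requests are both assumptions, and the names `is_relevant` and `clean` are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: these suffixes mark embedded resources (images, styles,
# scripts) rather than page views the user asked for explicitly.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def is_relevant(record):
    """Keep a record only if it is a successful request for a page."""
    # Ignore the query string and letter case when checking the suffix.
    path = urlsplit(record["url"]).path.lower()
    if path.endswith(IRRELEVANT_SUFFIXES):
        return False
    # Assumption: only 2xx responses count as successful requests.
    return 200 <= record["status"] < 300


def clean(records):
    """Return the records that survive data cleaning, in original order."""
    return [record for record in records if is_relevant(record)]


# Hypothetical sample records, already split into fields.
log = [
    {"url": "/index.html", "status": 200},
    {"url": "/images/logo.GIF", "status": 200},
    {"url": "/missing.html", "status": 404},
    {"url": "/style.css?v=2", "status": 200},
    {"url": "/papers.html", "status": 200},
]
cleaned = clean(log)
```

In this sample only the requests for `/index.html` and `/papers.html` remain; the image, the stylesheet and the failed request are dropped.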

**seminar tips** · 30-10-2012, 11:53 AM

For information about the topic "median filtering implementation in fpga" (full report, PPT and related topics), refer to the links below:

https://seminarproject.net/Thread-web-us...g-what-why

https://seminarproject.net/Thread-web-usage-mining

https://seminarproject.net/Thread-web-lo...e-patterns

https://seminarproject.net/Thread-a-web-...-web-sites

http://seminarprojectsshowthread.php?tid=6380&google_seo=n98b+++&pid=102972#pid102972

https://seminarproject.net/Thread-a-web-...c-web-site
