Probability Measure of Navigation Pattern Prediction using Poisson Distribution Analysis




Abstract

The World Wide Web has become one of the most important media for storing, sharing and distributing
information. The rapid expansion of the web has provided a great opportunity to study user and system behavior by
exploring web access logs. Web Usage Mining is the application of data mining techniques to large web data
repositories in order to extract usage patterns. Every web server keeps a log of all transactions between the server
and its clients. The log data collected by web servers contain information about every click a user makes on the
web documents of the site. This log information needs to be analyzed and interpreted in order to obtain
knowledge about actual user preferences in accessing web pages. In recent years several methods have been
proposed for mining web log data. This paper applies the statistical method of Poisson distribution analysis to find
the session sequences with higher probability, which are then used to test web application performance.
The analysis of large volumes of click stream data demands the employment of data mining methods.
Conducting data mining on web server logs involves determining frequently occurring access sequences.
A Poisson distribution gives the probability of a specific number of events occurring when the average rate of
occurrence is known. The Poisson distribution is a discrete function which is used in this paper to find the
probability with which a particular page is visited by the user.



Introduction

Quantitative assessment of navigational behavior is a fundamental task in understanding web
navigation. Quantitative measures of user behavior provide a better characterization of user navigation, and this
in turn suggests better ways of designing the structure of web sites. Information about web access patterns can
be generated from log files via a cleaning process, from which a set of navigation sessions or trails is identified.
Quantitative operations can then be performed on the session information to derive important characteristics of
navigation behavior. Complete web site usage statistics can be obtained by analysing the web site visitor profile and
access behavior.
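
As an illustration of the cleaning and session identification step described above, the following is a minimal sketch in Python. It is not the procedure used in the paper: the record layout (client address, timestamp, page), the 30-minute inactivity timeout and the list of file suffixes treated as non-page requests are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity timeout that ends a session (not taken from the paper).
SESSION_GAP = timedelta(minutes=30)
# Assumed suffixes of embedded resources that the cleaning step discards.
SKIP_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")


def build_sessions(records):
    """Group page requests into per-client sessions split by inactivity gaps.

    records: iterable of (client, timestamp, page) tuples, a hypothetical
    pre-parsed form of the server log.
    Returns a list of (client, [page, page, ...]) tuples.
    """
    by_client = defaultdict(list)
    for client, when, page in records:
        if page.lower().endswith(SKIP_SUFFIXES):
            continue  # cleaning: drop images, stylesheets and scripts
        by_client[client].append((when, page))

    sessions = []
    for client, hits in by_client.items():
        hits.sort()  # order each client's requests by time
        current, last_time = [], None
        for when, page in hits:
            # A long pause closes the current session and starts a new one.
            if last_time is not None and when - last_time > SESSION_GAP:
                sessions.append((client, current))
                current = []
            current.append(page)
            last_time = when
        if current:
            sessions.append((client, current))
    return sessions


# Made-up records, for illustration only.
example = [
    ("10.0.0.1", datetime(2024, 1, 1, 9, 0), "index.html"),
    ("10.0.0.1", datetime(2024, 1, 1, 9, 1), "logo.gif"),
    ("10.0.0.1", datetime(2024, 1, 1, 9, 5), "course.html"),
    ("10.0.0.1", datetime(2024, 1, 1, 11, 0), "index.html"),
    ("10.0.0.2", datetime(2024, 1, 1, 9, 2), "magazine.html"),
]
for client, pages in build_sessions(example):
    print(client, pages)
```

Each resulting session is an ordered list of pages, which is the form on which per-page visit counts can then be computed.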


Probability Evaluation of Log files using Poisson Distribution

A Poisson process is a stochastic process which consists of a collection of random points in time. An example
of a Poisson process is the set of points in time at which customers arrive in a shop. The concept of a Poisson process can be
generalised to processes with points in arbitrary sets (instead of points in time).
The Poisson distribution is a discrete probability distribution that expresses the probability of a number of
events occurring in a fixed period of time if these events occur with a known average rate and independently of the
time since the last event. It gives theoretical probabilities and theoretical frequencies of a discrete variable. This
distribution can be applied when each trial has one of two outcomes, such as success or failure, the number of
trials ‘n’ is very large and the probability of success in any single trial is small. Examples of events that may be modeled by a Poisson
distribution include the number of phone calls at a call centre per minute, the number of times a web server is
accessed per minute and the number of mutations in a given stretch of DNA after a certain amount of radiation.
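
The standard Poisson formula gives the probability of exactly k events in an interval as P(X = k) = λ^k e^(−λ) / k!, where λ is the average number of events per interval. The following Python sketch evaluates that formula for the web server example. The function names and the rate of 3 requests per minute are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the average count per interval is lam."""
    if k < 0 or lam < 0:
        raise ValueError("k and lam must be non-negative")
    # P(X = k) = lam^k * exp(-lam) / k!
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


def poisson_at_least_one(lam: float) -> float:
    """Probability of one or more events in the interval: 1 - P(X = 0)."""
    return 1.0 - poisson_pmf(0, lam)


# Hypothetical rate: a page is requested 3 times per minute on average.
rate = 3.0
for k in range(6):
    print(f"P(X = {k}) = {poisson_pmf(k, rate):.4f}")
print(f"P(X >= 1) = {poisson_at_least_one(rate):.4f}")
```

In a log analysis of this kind, λ for a page would typically be estimated as the observed number of visits to that page divided by the number of observation intervals.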

Conclusion
Appropriate metrics can provide useful characterizations of user web navigation behavior and can diagnose
a variety of problems. The ability to predict the chances of occurrences with precision would be extremely useful in
practice. This work proposes a probability analysis of web log files using the Poisson distribution. Four days of web log
transactions, from 14.10.07 to 17.10.07, of the Kongu Arts and Science College web server were collected for the
Poisson probability analysis. The approach finds the probability and frequency of viewing every page in the website.
Figure 4.1 shows that the web pages “magazine.html”, “course.html”, “biodept/bio.html”,
“aicte.html”, “phdhostory.html”, “mphilecon.html” and “cspgdept/mphilcs.html” have higher probability values. Hence
the probability of these pages being visited in the future is higher than that of the other pages in the web site.