Continuous online extraction of HTTP traces from packet traces
Introduction
To improve the performance of the network and of network protocols, it is important to characterize the
dominant applications [4, 8, 9, 12, 19, 22, 23]. Only by utilizing data about all events initiated by the
Web (including TCP and HTTP events) can one hope to understand the chain of performance problems that
current Web users face. Given the popularity of the Web, it is crucial to understand how its usage relates to
the performance of the network, the servers, and the clients. Such comprehensive information is only available
via packet monitoring. Unfortunately, extracting HTTP information from packet sniffer data is non-trivial
due to the huge volume of data, the line speed of the monitored links, the need for continuous monitoring,
and the need to preserve privacy. These needs translate into requirements for online processing and online
extraction of the relevant data, the topic of this paper.
The software described in this paper runs on the PacketScope monitor developed by AT&T Labs [1].
The PacketScope is deployed at several different locations within AT&T WorldNet, a production IP network,
and at AT&T Labs-Research. One PacketScope monitors T3 backbone links, another may monitor traffic
generated by a large set of modems on an FDDI ring or traffic on other FDDI rings, and a third monitors
traffic between AT&T Labs-Research and the Internet. First deployed in Spring 1997, the software has run
without interruption for weeks at a time, collecting and reconstructing detailed logs of millions of Web
downloads with a worst-case packet loss below 0.3%.
The rest of this paper is organized as follows. Section 2 discusses the advantages of packet sniffing and
Section 3 outlines some of the difficulties of extracting HTTP data from packet traces. The overall software
architecture is described in Section 4. Our solution is presented in Section 5 and finally Section 6 briefly
summarizes some of the lessons learned.
Strength of packet monitoring
There are many ways of gaining access to information about user accesses to the Web:
from users running modified Web browsers;
from Web content providers logging information about which data is retrieved from their Web servers;
from Web proxies logging information about which data is requested by their users;
from the wire via packet monitoring.
While each of these methods has its advantages, most have severe limitations regarding the detail of
information that can be logged. Distributing modified Web browsers to a representative sample of consumers
and having them agree to monitor their browsing behavior is problematic, especially since Microsoft Internet
Explorer and Netscape's browser became more popular than Mosaic and Lynx.
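Of the sources listed above, only monitoring on the wire exposes both the TCP events and the full HTTP
headers of a transfer. As a rough sketch (not the extraction code described in this paper; the sample payload
and helper name are purely illustrative), the following C fragment pulls the request line and Host header out
of an already reassembled client-to-server payload:

#include <stdio.h>
#include <string.h>

/* Sketch: given the client->server bytes of one HTTP request (already
 * reassembled from TCP segments), print the request line and Host header.
 * Real wire data is far messier; see the lessons in the Summary. */
static void log_request(const char *payload)
{
    const char *eol = strstr(payload, "\r\n");
    if (eol == NULL)
        return;                                 /* no complete request line yet */
    printf("request: %.*s\n", (int)(eol - payload), payload);

    const char *host = strstr(payload, "\r\nHost:");
    if (host != NULL) {
        host += 7;                              /* skip "\r\nHost:" */
        while (*host == ' ')
            host++;
        const char *end = strstr(host, "\r\n");
        if (end != NULL)
            printf("host: %.*s\n", (int)(end - host), host);
    }
}

int main(void)
{
    log_request("GET /index.html HTTP/1.0\r\nHost: www.example.com\r\n\r\n");
    return 0;
}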
Packet Monitoring Software
The hardware and software design of the monitoring system was driven by the desire to gather continuous
traces without downtime on a high-speed transmission medium. The monitor should be deployable even on
backbone links. Due to asymmetric routing, which is common in today's Internet, a backbone link may see
the packets of only one direction of a TCP connection.
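As a minimal sketch of the capture side (not the actual PacketScope software; the interface name fddi0,
the snap length, and the filter expression are assumptions for illustration), a libpcap-based monitor might
open the link, restrict capture to Web traffic, and run an endless capture loop:

#include <pcap.h>
#include <stdio.h>
#include <stdlib.h>

static void handle_packet(u_char *user, const struct pcap_pkthdr *hdr,
                          const u_char *bytes)
{
    /* With asymmetric routing, a backbone link may carry only one direction
     * of a connection, so the extractor cannot assume it will ever see the
     * matching request or response packets. */
    (void)user; (void)bytes;
    printf("%ld.%06ld captured %u of %u bytes\n",
           (long)hdr->ts.tv_sec, (long)hdr->ts.tv_usec,
           hdr->caplen, hdr->len);
}

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    struct bpf_program prog;

    /* Open the monitored interface: 1500-byte snaplen, promiscuous mode. */
    pcap_t *pc = pcap_open_live("fddi0", 1500, 1, 1000, errbuf);
    if (pc == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return EXIT_FAILURE;
    }

    /* Restrict capture to Web traffic. */
    if (pcap_compile(pc, &prog, "tcp port 80", 1, 0) == -1 ||
        pcap_setfilter(pc, &prog) == -1) {
        fprintf(stderr, "filter: %s\n", pcap_geterr(pc));
        return EXIT_FAILURE;
    }

    /* Run until interrupted: continuous, online processing. */
    pcap_loop(pc, -1, handle_packet, NULL);
    pcap_close(pc);
    return EXIT_SUCCESS;
}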
Hardware design: The AT&T PacketScope [1] consists of standard hardware components: a DEC Alpha
400 MHz workstation with an 8 GB RAID disk array and a 7-tape DLT tape robot. For more details on the
hardware architecture see Figure 1. Several security precautions have been taken, including using no IP
addresses and using read-only device drivers. The DEC Alpha platform was chosen because of the kernel
performance optimizations for packet sniffing by Mogul and Ramakrishnan [20].
Summary
The most important lesson is: expect the unexpected. It is crucial to avoid assumptions about how well-
behaved the clients, the servers, or the network might be. They aren't. Other common lessons from
the implementation include: don't try to do too much processing in the time-critical steps of the logfile
extraction; simplify wherever sensible and reasonable; reduce memory use and disk I/O.
But in the end the most crucial lesson was to never expect a perfect logfile. There will always be one
more exception or one more misbehaved client/server. Therefore, the analysis program should test every
assumption the data has to satisfy and should eliminate any data that violates it. If one spends enough care
looking into the possible reasons for exceptions, the number of requests discarded by this step is small.
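As a hedged sketch of that last lesson (the record format and the particular checks are invented for
illustration, not taken from the paper), such an assumption-testing pass might read one reconstructed
request per line, verify every property the later analysis relies on, and count and discard any record
that fails:

#include <stdio.h>
#include <stdlib.h>

/* Sketch of an assumption-checking pass over a reconstructed logfile.
 * Assumed record format (one request per line, invented for illustration):
 *   <request timestamp> <response timestamp> <status code> <bytes>
 * Records violating any assumption are discarded rather than repaired. */
int main(void)
{
    char line[4096];
    unsigned long kept = 0, dropped = 0;

    while (fgets(line, sizeof line, stdin) != NULL) {
        double t_req, t_rsp;
        long status, bytes;

        if (sscanf(line, "%lf %lf %ld %ld", &t_req, &t_rsp, &status, &bytes) != 4
            || t_rsp < t_req                /* response cannot precede request */
            || status < 100 || status > 599 /* valid HTTP status codes only    */
            || bytes < 0) {                 /* negative sizes indicate bugs    */
            dropped++;
            continue;
        }
        kept++;
        fputs(line, stdout);                /* record satisfies all assumptions */
    }

    fprintf(stderr, "kept %lu records, discarded %lu\n", kept, dropped);
    return 0;
}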