Seminar Topics & Project Ideas On Computer Science Electronics Electrical Mechanical Engineering Civil MBA Medicine Nursing Science Physics Mathematics Chemistry ppt pdf doc presentation downloads and Abstract

Full Version: probabilistic packet marking (PPM) FULL REPORT
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
[attachment=12866]
CHAPTER 1
INTRODUCTION

The probabilistic packet marking (PPM) algorithm by Savage has attracted the most attention in contributing the idea of IP trace back. The most interesting point of this IP trace back approach is that it allows routers to encode certain information on the attack packets based on a predetermined probability. Upon receiving a sufficient number of marked packets, the victim can obtain the locations of the attackers.
1.1 THE PROBABILISTIC PACKET MARKING ALGORITHM
PPM algorithm is used to obtain a constructed graph. The constructed graph is the same as the attack graph. An attack graph is the set of paths, the attack packets traversed. A constructed graph is a graph returned by the PPM algorithm. To full fill this method for encoding the information of the edges of the attack graph into the attack packets through the cooperation of the routers in the attack graph and the victim site. Specifically, the PPM algorithm is made up of two separated procedures:
1. The packet marking procedure: The Packet Marking Procedure is executed on the router side.
2. The graph reconstruction procedure: The graph reconstruction procedure is executed on the victim side.
The packet marking procedure is designed to randomly encode edges information on the packets arriving at the routers. Then, by using the information, the victim executes the graph reconstruction procedure to construct the attack graph.
1.1.1 A Brief Review of the Packet Marking Procedure
The packet marking procedure aims at encoding every edge of the attack graph, and the routers encode the information in three marking fields of an attack packet:
1) The start 2) The end 3) The distance fields
How a packet stores the information about an edge in the attack graph, and the pseudocode of the procedure is shown in the below pseudo code
When a packet arrives at a router, the router determines how the packet can be processed based on a random number x (line number 1 in the pseudocode). If x is less than the predefined marking probability the router chooses to start encoding an edge. The router sets the start field of the incoming packet to the router’s address and resets the distance field of that packet to zero. Then, the router forwards the packet to the next router. When the packet arrives at the next router, the router again chooses if it should start encoding another edge.
Then, the router will discover that the previous router has started marking an edge, because the distance field of the packet is zero. Eventually, the router sets the end field of the packet to the router’s address and resets the distance field of that packet to the next router. When the packet arrives at the next router, the router again chooses if it should start encoding another edge.
Now, the start and the end fields together encode an edge of the attack graph. For this encoded edge to be received by the victim, successive routers should choose not to start encoding an edge, that is, the case x<pm in the pseudocode, because a packet can encode only one edge. Furthermore, every successive router will increment the distance field by one so that the victim will know the distance of the encoded edge.
1.1.2 Termination of the PPM Algorithm
According to the above description of the packet marking procedure, although a packet has already encoded an edge, successive routers may choose to start encoding another edge randomly. As a result, when a packet arrives at the victim, the packet may encode any of the edges of the attack graph, or a packet may not encode any edges. Therefore, if the victim can collect a sufficiently large number of marked packets, the victim can successfully construct all the paths in the attack graph by using the graph reconstruction procedure.
When the graph reconstruction procedure returns a constructed graph, it implies the termination of the PPM algorithm. However, the termination condition has not thoroughly been investigated in the literature. It turns out that the termination condition is important, because it determines the correctness of the constructed graph: If it stops too early, the constructed graph will not contain enough edges of the attack graph and, thus, fails to fulfill the trace back purpose. Let X be the number of marked packets required for the victim to reconstruct a path.
Let d be the length of the reconstructed path. In addition, let pm be the marking probability of every router in the path. The upper-bound on the expectation E[X] is given in [8, Equation (1)], and we name this equation the upper-bound equation throughout this paper
E[X]< ………………….(1)
1.2 PROBLEMS WHEN USING THE UPPER-BOUND EQUATION AS THE TERMINATION CONDITION
Although there is no explicit definition of the termination condition of the PPM algorithm (1) is the termination condition in the single-attack environment. In a multiple-attacker environment: The number of packets needed to reconstruct each path is independent, so the number of packets needed to reconstruct all paths is a linear function of the number of attackers.
However, we have found that this is not the case in general. More specifically, (1) should not be treated as the termination condition of the PPM algorithm.
1.2.1 Failure in the Multiple-Attacker Environment
First failure is one cannot apply the termination condition to complex networks such that the reconstruction of one path is dependent on another.
Fig 2 : 14 router binary tree network
This is explained in Fig. 2, which is a binary-tree network with 14 routers. The leaf routers from R7 to R14 are connected to a pool of attackers. These attackers send out attack traffic toward the victim v, and this presents a multiple-attacker environment. In this graph, the attack packets traversed through eight paths that are identical in structure. However, there are “shared” edges among these paths. This implies that the reconstruction of one path is dependent on another. Therefore, one cannot treat (1) as the termination condition under this scenario, and this restricts the application of the PPM algorithm.
Second, failure is although every path in a given network is independent, we have found that the number of marked packets needed to reconstruct the network graph does not have a linear relationship with the number of paths.
Fig. 3. A 12-router tree network with four independent linear paths, which is another multiple-attacker environment.
We have carried out a set of simulations to show our finding and we start the description of our simulation setup from the network depicted in Fig. 3. The network contains four paths that are identical in structure and, more importantly, there are no shared edges between any two paths. We name these paths the independent paths. In addition, we assume that one independent path connects to one attacker and every attacker sends out a similar amount of attack traffic toward the victim.
Fig. 4. The relationship between the number of independent paths and the average number of marked packets
Next, we repeat this simulation, we add one more independent path to the network, and there are now five independent paths. Eventually, we perform a series of simulations for one to 50 independent paths. Fig. 4 shows the result of this set of simulations. One can observe that the average number of marked packets required to construct a correct constructed graph increases as the number of independent paths increases. In order to show whether the number of required marked packets linearly increases with the number of paths or not, we plot the rate of change in the number of required marked packets in Fig. 5. the graph shows an increasing trend in the rate of change in the number of required marked packets.
Fig. 5. An increasing trend in the rate of change in the number of marked packets required.
Theoretically, the packet collecting problem can be transformed into the “coupon-collecting problem with unequal probabilities” The solution to the coupon-collection problem with unequal probabilities is very complex and does not show a linear property with the number of the independent paths. In summary, the first problem of using (1) as the termination condition is that the relationship between the number of attack paths and E[X] is not known. Therefore, the PPM algorithm cannot guarantee the correctness under the multiple-attacker environment.
1.2.2 Another Problem
No matter how accurate the calculation of the expectation E[X] is, one should not use the expected number of required marked packets E[X] as the termination condition. Depending on the underlying probability distribution of the random variable X, when the mean is reached, there is a non zero probability that the constructed graph is still an incorrect one. For instance, if the probability distribution of X is a uniform distribution, then the probability that a correct attack graph is constructed is just 0.5. In summary, when X has high variance, the first moment estimation may not be accurate.
Based on the above two problems, we conclude that the upper-bound equation is not suitable to be the termination condition of the PPM algorithm.
1.3 CONTRIBUTIONS AND PAPER STRUCTURE
We modify the PPM algorithm so that the victim can obtain a correct constructed graph with a specified level of guarantee. The contributions of this work are listed as follows:
• We introduce the termination condition of the PPM algorithm, which is missing or is not explicitly defined in the literature.
• Through the new termination condition, the user of the new algorithm is free to determine the correct- ness of the constructed graph.
• The constructed graph is guaranteed to reach the correctness assigned by the user, independent of the marking probability and the structure of the under- lying network graph.
CHAPTER 2
RECTIFIED PROBABILISTIC PACKET MARKING ALGORITHM

The modification of PPM algorithm is RPPM(Rectified PPM). The termination condition of RPPM is again expressed in terms of the number of collected marked packets, but the number changed based on the size of the constructed graph. The number of termination packet number is TPN.
The RPPM algorithm is designed to automatically determine when the algorithm should terminate. We aim at achieving the following properties:
1. The algorithm does not require any prior knowledge about the network topology.
2. The algorithm determines the certainty that the constructed graph is the attack graph when the algorithm terminates.
P* is the traceback confidence level the graph reconstruction procedure of the original PPM algorithm is completely replaced, and we name the new procedure the rectified graph reconstruction procedure.
2.1 ASSUMPTIONS
2.1.1 Assumptions about the Router

For each router, we assume that it is equipped with the ability to mark packets as in the original PPM algorithm. We also assume that each router shares the same marking probability. , Routers are two types 1) transit router 2) leaf router. A transit router is a router that forwards traffic from upstream routers to its downstream routers (or the victim). A leaf router is a router whose upstream router is connected to client computers (not routers) and forwards the clients’ traffic to its down- stream routers (or the victim). In addition, we assume that all leaf routers in an attack graph are the sources of the attack packets, and each leaf router sends out a similar number of attack packets. Here we are considering a multiple-attacker environment.
Furthermore, we assume that every router has only one outgoing route toward the victim. For the ease of presentation, we name the “outgoing route toward the victim” the victim route. This assumption is also reflected in the structures of the constructed graph: every router in the constructed graph has only one outgoing edge.
Fig. 6. The failure of the router R1 causes the route tables of R2 , R3 , and R4 to change. This results in a constructed graph with routers that have multiple outgoing edges.
For example, in Fig. 6, the failure of the router R1 forces the routing table to completely change. Under such a scenario, the constructed attack graph may become the one shown in Fig. 6c.
2.1.2 Assumptions about the Victim
On the victim side, we assume that by the time that the victim starts collecting marked packets, all routers in the network have already invoked the packet marking procedure. In addition, we assume that the victim does not have any knowledge about the real network or the attack graph. However, the victim knows the marking probability that the routers are using.
2.2 FLOW OF THE RECTIFIED GRAPH RECONSTRUCTION PROCEDURE
The pseudo code of the rectified graph reconstruction procedure is shown in Fig. 7, and the procedure is started as soon as the victim starts collecting marked packets. When a marked packet arrives at the victim, the procedure first checks if this packet encodes a new edge. If so, the
Fig. 7. The pseudocode of the rectified graph reconstruction procedure of the RPPM algorithm.
Fig. 8. An execution diagram of the rectified graph reconstruction procedure of the RPPM algorithm that constructs a graph with n edges.
Procedure accordingly updates the constructed graph Gc . Next, if the constructed graph is connected, where connected means that every router can reach the victim, the procedure calculates the number of incoming packets required before the algorithm stops, and we name this number the TPN. The procedure then resets the counter for the incoming packets to zero and starts counting the number of incoming packets. In the meantime, the procedure checks if the number of collected packets is larger than the TPN. If so, the procedure claims that the constructed graph Gc is the attack graph, with probability P* . Otherwise, the victim receives a packet that encodes a new edge. Then, the procedure updates the constructed graph, revisits the TPN calculation subroutine, resets the counter for incoming packets, and waits until a packet that encodes a new edge arrives or the number of incoming packets is larger than the new TPN.
As suggested by the pseudocode, the termination condition of the RPPM algorithm is that “the counter for the incoming packets is larger than the TPN,” and this implies that the calculation of the TPN during each update of the constructed graph is the core of the RPPM algorithm.
2.3 EXECUTION DIAGRAM OF THE RECTIFIED PROBABILISTIC PACKET MARKING ALGORITHM
According to the previous section, it is observed that the TPN, the constructed graph, and the execution of the rectified graph reconstruction procedure are closely related. Such a relationship can be visualized by the construction of the execution diagram, as shown in Fig. 8. The execution diagram presents the dynamics of the execution of the rectified graph reconstruction procedure.
2.3.1 Types of States
There are two types of states in the diagram: the execution state and the termination state. When the procedure is running, we say that “the rectified graph reconstruction procedure is in an execution state.” Otherwise, we say that “the rectified graph reconstruction procedure is in the termination state.” The execution state also tells us the state of the constructed graph:
1) When the procedure is in the start state, labeled by “0,” it means that the procedure has started running, and there are no edges in the constructed graph.
2) When the procedure is in a connected state, it means that the constructed graph is connected. A connected state, labeled by Ci , means that the constructed graph is connected and contains i edges.
3) When the procedure is in a disconnected state, the constructed graph is disconnected. A disconnected state, labeled by Di , means that the constructed graph is disconnected and contains i edges. Note that both the connected and disconnected states, say, Ci and Di , respectively, refer to all the possible graphs that have i edges. Last, when the procedure is in the termination state, it means that the procedure has stopped.
2.3.2 Types of Transitions
There are two kinds of transitions in the execution diagram. When the procedure takes a growth transition, it means that a new edge is added to the constructed graph. When the procedure takes a termination transition, it means that the procedure is going to stop running.
The transition structure in Fig. 8 is derived from the pseudocode of the rectified graph reconstruction procedure in Fig. 7. We briefly describe the transition structure as follows:
1) If a packet that encodes a new edge arrives before the number of received packets is larger than the TPN, then the procedure takes a growth transition and proceeds to either a connected state or a disconnected state, depending on the connectivity of the updated constructed graph.
2) If the number of received packets is larger than the TPN, then the procedure takes the termination transition and proceeds to the termination state.
3) If the procedure is in one of the disconnected states, then it is meaningless to return such a graph as the correct constructed graph, and there is no transition that connects the disconnected states to the termination state.