21-01-2013, 02:13 PM
A General Model of Probabilistic Packet Marking for IP Traceback
1A General Model.pdf (Size: 278.14 KB / Downloads: 47)
ABSTRACT
In this paper, we model Probabilistic Packet Marking (PPM)
schemes for IP traceback as an identi¯cation problem of a
large number of markers. Each potential marker is asso-
ciated with a distribution on tags, which are short binary
strings. To mark a packet, a marker follows its associated
distribution in choosing the tag to write in the IP header.
Since there are a large number of (for example, over 4,000)
markers, what the victim receives are samples from a mix-
ture of distributions. Essentially, traceback aims to identify
individual distribution contributing to the mixture. Guided
by this model, we propose Random Packet Marking (RPM),
a scheme that uses a simple but e®ective approach. RPM
does not require sophisticated structure/relationship among
the tags, and employs a hop-by-hop reconstruction similar to
AMS [16]. Simulations show improved scalability and trace-
back accuracy over prior works. For example, in a large
network with over 100K nodes, 4,650 markers induce 63%
of false positives in terms of edges identi¯cation using the
AMS marking scheme; while RPM lowers it to 2%. The ef-
fectiveness of RPM demonstrates that with prior knowledge
of neighboring nodes, a simple and properly designed mark-
ing scheme su±ces in identifying large number of markers
with high accuracy.
INTRODUCTION
In Distributed Denial of Service (DDoS) attacks, many
compromised hosts °ood the victim with an overwhelming
amount of tra±c. The victim's resources are exhausted and
services to users become unavailable. DDoS attacks par-
alyzed high-pro¯le web sites, including Yahoo, CNN and
Amazon, for hours to days in February 2000 [8]. In January
2001, DDoS attack was launched against the DNS server of
Microsoft and in October 2002, DDoS attack brought down
eight root DNS servers [12]. Most recently, in February 2007,
a DDoS attack was launched against the DNS root servers
in two phases, lasting for more than 7 hours in total [2].
During a DDoS attack, attack nodes often perform ad-
dress spoo¯ng to avoid detection. An IP traceback mech-
anism aims to overcome address spoo¯ng and uncover the
attack paths or sources. While traceback is motivated by
DDoS attacks, it also bene¯ts analysis of legitimate tra±c.
Potential applications of traceback include tra±c account-
ing and network bottleneck identi¯cation. In Probabilistic
Packet Marking (PPM), routers probabilistically mark the
packets they transmit, so that the victim can trace the at-
tack paths up to their sources, based on the packets it re-
ceived[14]. A packet is marked by writing to the reusable
bits in the IP header. We call the strings written as tags.
RELATEDWORK
Existing traceback schemes can be classi¯ed into two cat-
egories. 1. Routers are queried on the tra±c they have
forwarded. The routers may not need to log packets. 2.
The receiver locally reconstructs the attack paths from a
collection of packets. Each packet carries partial path infor-
mation. The packets are either probabilistically marked by
routers or specially generated for traceback.
The ¯rst category includes online query and variations of
hash based logging schemes [15, 10]. The second category in-
cludes variants of probabilistic packet marking (PPM) [14],
ICMP traceback (iTrace) [5], and algebraic encoding [7]. In
iTrace, routers sample packets with a small probability. A
sampled packet is duplicated in an ICMP packet, plus in-
formation of the router's upstream or downstream neighbor
forming an edge with itself. Based on the ICMP packets, the
victim reconstructs the attack paths by linking up the edges.
Note that routers farther away generates fewer iTrace pack-
ets to the victim. A variant of iTrace, called intention-driven
iTrace [11], introduces an intension indicator to inform re-
mote routers to raise their probability in generating iTrace
packets.
Components of PPM
The operations of a PPM traceback scheme can generally
be divided into the following components: marking of pack-
ets by the routers, choice of tags used and reconstruction
using information from marked packets by the victim. In
the following, for each of these components, we ¯rst present
the general idea and then highlight the design of RPM.
Marking by Routers
In our model, each marker is associated with a distribu-
tion D on the L-bit tags. Such associations are pre-assigned
and ¯xed throughout the marking and identi¯cation process.
Consider a marker with identity i and its assigned distribu-
tion Di. When it receives a packet, the marker chooses with
probability ², an L-bit tag s according to the distribution
Di and mark the packet with s.
The probability ² is a parameter that is the same for every
marker. It is possible that some packets arrive at the victim
without being marked. We assume that the bits in those un-
marked packets are random and are uniformly distributed.
Since the marking process needs to be very e±cient, sam-
pling from the distribution Di must be a simple operation.
Thus, in RPM and other related work, only uniform distri-
bution on a ¯nite set is considered.
Entropy of Packet Marks
One measure of the quality of a chosen Di is the entropy
of the mixture of distributions received by the victim. In-
tuitively, uniformly random packet marks deliver highest
entropy. Higher entropy carries more bits of information,
which can reduce the false negatives ratio ® and the false
positives ratio ¯. Entropy measure provides a means to
evaluate the performance of PPM schemes. A good marking
scheme should strive to achieve high entropy packet marks.
Some existing schemes trade o® some entropy in the tags
for easy path reconstruction; but they underperform in ef-
fectiveness. In particular, we will show that the use of hop
count to indicate distance from markers to the victim results
in lower entropy than uniformly random bits.
Identification and Reconstruction Effort
From marked packets, the victim wants to reconstruct the
attack paths. This can be done by ¯rst identifying the mark-
ers, and then deriving the paths. Exhaustive enumeration of
all subsets of markers, and estimating their respective packet
rate is infeasible. To aid identi¯cation, structured informa-
tion can be embedded into the tags. This information facil-
itates association between packet marks and the marker, or
links di®erent packet marks generated by the same marker.
There are generally two assumptions made in the reconstruc-
tion, no network topology information available or available
of (partial) topology information.
CONCLUSION
IP traceback has been an actively researched DDoS de-
fence. For attack packets with spoofed source addresses, IP
traceback traces the paths they traverse up to the sources.
Traceback also bene¯ts tra±c accounting applications, such
as tracking clients' bandwidth utilization, or locating the
bottleneck links in the network.
In this paper, we present a general model for PPM schemes.
The general model provides a platform for PPM schemes
comparison and helps to identify the appropriate system pa-
rameters. We also show that entropy is a good predictor of
traceback accuracy and use of hop count information in the
tag reduces the entropy
We present a PPM scheme called RPM that has good
traceback accuracy and e±cient path reconstruction. Sim-
ulations show improved scalability and traceback accuracy
over prior works. For example, a thousand attack paths in-
duce 63% of false positives in terms of edges identi¯cation,
using AMS. RPM lowers the false positives to 2%. The e®ec-
tiveness of RPM demonstrated that imposing sophisticated
structures on tags is not necessary. If imposing the structure
reduces the randomness in tags, it should be avoided. The
improvement of RPM over prior schemes is mainly a result
of increasing the information carried in packet marks.