24-11-2012, 06:26 PM
On Flow Correlation Attacks and Countermeasures in Mix Networks
1.network.pdf (Size: 145.59 KB / Downloads: 64)
Abstract.
In this paper, we address issues related to flow correlation attacks and
the corresponding countermeasures in mix networks. Mixes have been used in
many anonymous communication systems and are supposed to provide countermeasures
that can defeat various traffic analysis attacks. In this paper, we focus
on a particular class of traffic analysis attack, flow correlation attacks, by which
an adversary attempts to analyze the network traffic and correlate the traffic of a
flow over an input link at a mix with that over an output link of the same mix.
Two classes of correlation methods are considered, namely time-domain methods
and frequency-domain methods. Based on our threat model and known strategies
in existing mix networks, we perform extensive experiments to analyze the performance
of mixes.We find that a mix with any known batching strategy may fail
against flow correlation attacks in the sense that for a given flow over an input
link, the adversary can correctly determine which output link is used by the same
flow.
Introduction
This paper studies flow correlation attacks and the corresponding countermeasures in
mix networks. With the rapid growth and public acceptance of the Internet as a means
of communication and information dissemination, concerns about privacy and security
on the Internet have grown. Although it can potentially be used for malicious purposes,
Anonymity is legitimate in many scenarios such as anonymous web browsing, E-Voting E-Banking, E-Commerce, and E-Auctions. In each of these scenarios, encryption alone
cannot achieve the anonymity required by participants [1, 2].
Since Chaum [3] proposed the mix network, researchers have developed various
anonymity systems for different applications. Although a significant amount of effort
has been put forth in researching anonymous communications, there has not been much
systematic study of the performance of mix networks in terms of anonymity degree provided
and quality-of-services maintained. This paper focuses on the quantitative evaluation
of mix performance. We are particularly interested in flow-based communication,
which is widely used in voice over IP, web browsing, FTP, etc. These applications may
have anonymity requirements, and the mixes are supposed to provide countermeasures
that can defeat traffic analysis attacks.
RelatedWork
Chaum [3] pioneered the idea of anonymity in 1981. Since then, researchers have applied
the idea to different applications such as message-based email and flow-based
low-latency communications, and they have invented new defense techniques as more
attacks have been proposed.
For anonymous email applications, Chaum [3] proposed to use relay servers, i.e.
mixes, rerouting messages, which are encrypted by public keys of mixes. An encrypted
message is analogous to an onion constructed by a sender, who sends the onion to the
first mix. Using its private key, the first mix peels off the first layer, which is encrypted
using the public key of the first mix. Inside the first layer is the second mix’s address and
the rest of the onion, which is encrypted with the second mix’s public key. After getting
the second mix’s address, the first mix sends the peeled onion. This process proceeds
in this recursive way. The core part of the onion is the receiver’s address and the real
message to be sent to the receiver by the last mix. Chaum also proposed return address
and digital pseudonyms for users to communicate with each other anonymously.
Helsingius [4] implemented the first Internet anonymous remailer, which is a single
application proxy that just replaces the original email’s source address with the
remailer’s address. It has no reply function and is subject to all the attacks mentioned
below. Eric Hughes and Hal Finney [5] built the cypherpunk remailer, a real distributed
mix network with reply functions that uses PGP to encrypt and decrypt messages. The
system is subject to a global passive attack and replay attack to its reply mechanism.
G¨ulc¨u and Tsudik [6] developed a relatively full-fledged anonymous email system, Babel.
Their reply technique does not need the sender to remember the secret seed to decrypt
the reply message, but it is subject to replay attack. They studied the threat from
the trickle attack, a powerful active attack. Another defect of Babel is that a mix itself
can differentiate the forwarding and replying messages. Cottrell [7] developed Mixmaster
which counters a global passive attack by using message padding and also counters
trickle and flood attacks [6, 8] by using a pool batching strategy. Mixmaster does not
have a reply function. Danezis, Dingledine and Mathewson [9] developed Mixminion.
Although Mixminion still has many problems, its design considers a relatively complete
set of attacks that researchers have found [8, 10–14]. The authors suggest a list of
research topics for future study.
Models
Mix and Mix Network
A mix is a relay device for anonymous communication.
Figure 1 shows the communication between users using one mix. A single mix can
achieve a certain level of communication anonymity: The sender of a message attaches
the receiver address to a packet and encrypts it using the mix’s public key. Upon receiving
a packet, a mix decodes the packet. Different from an ordinary router, a mix
usually will not relay the received packet immediately. Rather, it collects several packets
and then sends them out in a batch. The order of packets may be altered as well.
Techniques such as batching and reordering are considered necessary techniques for
mixes to prevent timing-based attacks. The main objective of this paper is to analyze
the effectiveness of mixes against a special class of timing-based attacks.
A mix network consists of multiple mixes that are inter-connected by a network.
A mix network may provide enhanced anonymity, as payload packets may go through
multiple mixes. Even in such a mix network, it is important that each individual mix
provides sufficient security and QoS so that the end-to-end performance can be guaranteed.
Thus, our analysis on a single mix provides a foundation for analyzing the
end-to-end performance of mix networks. We discuss in detail how to extend our work
to larger and complicated mix networks in [21]. In fact, if we view a mix network (for
example Onion routing [15]) as one super mix, the analytical techniques in this paper
can be directly applied.
Batching Strategies for a Mix
Batching strategies are designed to prevent not only simple timing analysis attacks but
also powerful trickle attacks, flood attacks, and many other forms of attacks ([9, 8]).
Serjantov [8] summarizes seven batching strategies that have been proposed. We will
evaluate each kind of these strategies. Our results show that these strategies may not
work under certain timing analysis attacks. These seven batching strategies are listed in
Table 1, in which batching strategies from S1 to S4 are denoted as simple mix, while
batching strategies from S5 to S7 are denoted as pool mix.
From Table 1, we can see that the sending of a batch of packets can be triggered
by certain events, e.g., queue length reaching a pre-defined threshold, a timer having a
time out, or some combination of these two.
Batching is typically accompanied by reordering. In this paper, the attacks focus on
the traffic characteristics.
Threat Model
In this paper, we assume that the adversary uses a classical timing analysis attack ([1,
22]), which we summarize as follows:
1. The adversary observes input and output links of a mix, collects the packet interarrival
times, and analyzes them. This type of attack is passive, since traffic is not
actively altered (by, say, dropping, inserting, and/or modifying packets during a
communication session), and is therefore often difficult to detect. This type of attack
can be easily staged on wired and wireless links [23] by a variety of agents,
such as malicious ISPs or governments ([24, 25]).
2. To maximize the power of the adversary, we assume that she makes observations
on all the links of the mix network.
3. The mix’s infrastructure and strategies are known to the adversary. This is a typical
assumption in the study of security systems. The above two assumptions create the
worst case in terms of security analysis.
4. The adversary cannot correlate (based on packet timing, content, or size) a packet
on a input link to another packet on the output link. Packet correlation based on
packet timing is prevented by batching, and correlation based on content and packet
size is prevented by encryption and packet padding, respectively.
Summary and FutureWork
We have analyzed mix networks in terms of their effectiveness in providing anonymity
and quality-of-service. Various methods used in mix networks were considered: seven
different packet batching strategies and two implementation schemes, namely the linkbased
batching scheme and mix-based batching scheme. We found that mix networks that use traditional batching strategies, regardless of the implementation scheme, are
vulnerable under flow correlation attacks. By using proper statistical analysis, an adversary
can accurately determine the output link used by traffic that comes to an input flow
of a mix. The detection rate can be as high as 100% as long as enough data is available.
This is true even if heavy cross traffic exists. The data collected in this paper should
give designers guidelines for the development and operation of mix networks.