Seminar Topics & Project Ideas On Computer Science Electronics Electrical Mechanical Engineering Civil MBA Medicine Nursing Science Physics Mathematics Chemistry ppt pdf doc presentation downloads and Abstract

Full Version: Improving Security and Performance in the Tor Network through Tunable Path Selection
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Improving Security and Performance in the Tor Network through Tunable Path Selection
[attachment=22718]
Abstract
The Tor anonymous communication network uses self-reported bandwidth values to select routers for building tunnels.
Since tunnels are allocated in proportion to this bandwidth, this allows a malicious router operator to attract tunnels for compromise.
Although Tor limits the self-reported bandwidth, it uses a high maximum value, effectively choosing performance over high anonymity
for all users. We propose a router selection algorithm that allows users to control the trade-off between performance and anonymity.
We also propose an opportunistic bandwidth measurement algorithm to replace self-reported values that is more sensitive to load and
more responsive to changing network conditions. Our mechanism effectively blends the traffic from users of different preferences,
making partitioning attacks difficult. We implemented the opportunistic measurement and tunable performance extensions and
examined their performance both through simulation and in the real Tor network. Our results show that users can get dramatic
increases in either performance or anonymity with little to no sacrifice in the other metric, or a more modest improvement in both. Our
mechanisms are also invulnerable to the previously published low-resource attacks on Tor.
Index Terms—Anonymous communication, bandwidth estimation, path selection.
Ç
1 INTRODUCTION
ANONYMOUS communication on the Internet seems finally
within reach. Though an initial commercial deployment
of Onion Routing [1], The Freedom Network [2], was
in the end shut down, a volunteer-run replacement network
using the second-generation onion routing design—Tor
[3]—has been operational for several years and has nearly
two thousand nodes [4] and several hundred thousand
users [5] as of late 2009. Tor is used by an increasing variety
of parties: reporters communicating with sources, dissidents
and embassies hiding their activities from local
governments [6], people trying to get around geographic
restrictions [7], and more. However, for the average user,
the performance penalty introduced by Tor is still prohibitively
high for everyday use. At the same time, the
popularity of Tor has lead to development of a number of
practical attacks on the system.
Efforts to improve the performance of the Tor network
can often decrease the anonymity, and vice versa. To
address this problem, we propose a user-tunable mechanism
for selecting routers based on their bandwidth capabilities.
Rather than trying to find a compromise that satisfies both
those users who desire strong anonymity protection and
those for whom performance is more of a priority, as is
done in the current Tor design, we suggest letting users
express a preference in the trade-off between anonymity
and performance and make router selections accordingly.
We design a mechanism that effectively blends the traffic of
users with different preferences, making partitioning
attacks difficult.
At the heart of our work is the Tor load-balancing
algorithm. Currently, Tor routers self-report their bandwidth
capabilities, and clients choose them in proportion to
their fraction of the overall Tor capacity. This enables a lowresource
attack, where routers misreport their bandwidth to
be the artificially high and thereby capture a large fraction
of tunnels [8]. Additionally, due to constantly changing
conditions, self-reported bandwidth is frequently an overestimate
of the actual node capacity, leading to unreliable
performance delivered to Tor users.
We propose to replace the Tor mechanism with an
opportunistic bandwidth measurement mechanism. Due to the
complete graph topology of the Tor network, each router
will have a chance to interact with most other routers and
thus observe their performance empirically. We show
through experiments that this mechanism is a suitable
replacement for self-reported bandwidth in that it accurately
predicts the performance of the routers and is
significantly less susceptible to low-resource attacks. Also,
since overutilized routers will show decreased performance,
it also helps reduce the long tail of the transfer
time distribution, making the worst case significantly better.
Our experiments with Tunable Tor show that users can
achieve great improvements in performance without
sacrificing much anonymity, or significantly increase
anonymity protection without any loss in performance.
They also allow for moderate improvements in both. This
improved flexibility should make Tor palatable to a wider
range of users, and thus increase anonymity for everyone
due to a larger community [9].
The remainder of this paper is structured as follows:
Section 2 examines the current implementation and points
out two important weaknesses. Section 3 analyzes these
weaknesses and proposes improvements to Tor to address
728 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 8, NO. 5, SEPTEMBER/OCTOBER 2011
. R. Snader is with Shook, Hardy & Bacon, L.L.P., 2555 Grand Blvd.,
Kansas City, MO 64108-2613. E-mail: rsnader[at]shb.com.
. N. Borisov is with the Department of Electrical and Computer
Engineering, University of Illinois at Urbana-Champaign, 460 Coordinated
Science Laboratory, 1308 West Main St., Urbana, IL 61822.
E-mail: nikita[at]illinois.edu.
Manuscript received 31 Mar. 2009; revised 24 Oct. 2009; accepted 1 Apr.
2010; published online 26 Aug. 2010.
For information on obtaining reprints of this article, please send e-mail to:
tdsc[at]computer.org, and reference IEEECS Log Number TDSC-2009-03-0045.
Digital Object Identifier no. 10.1109/2010.40.
1545-5971/11/$26.00  2011 IEEE Published by the IEEE Computer Society
them; it also evaluates these changes in isolation. Section 4
evaluates their performance in the real Tor network. Section 5
discusses some related work. Finally, Section 6 summarizes
our conclusions and examines future directions for this work.
2 WEAKNESSES IN THE IMPLEMENTATION OF TOR
We first present a high-level overview of the Tor network
design and then highlight two important problems in the
load-balancing algorithm. Interested readers can find more
details about the Tor protocol in [3].
2.1 Tor Design
The Tor network is based on an onion-routing [1] design,
where traffic is forwarded through several routers and
multiply encrypted, with each router removing one layer of
the encryption. The path through the network—a tunnel—is
constructed in a telescoping fashion so that each router
knows only the previous and the next router in the path. In
particular, the first (entry) router knows the source of the
tunnel, but not its destination, and the last (exit) router knows
the destination but not the source. However, if both routers
cooperate, they can use traffic analysis to link communication
over the same tunnel; hence there is little benefit to using long
paths and in practice Tor path length is set to 3.1
Tor routers are registered with a directory service. Each
router reports its IP address, public key, policies about what
traffic it will accept, and a bandwidth value that is
determined by monitoring the peak bandwidth achieved
by the router over a period of time. The directory service also
maintains statistics about the uptime of each router. The Tor
path construction algorithm, executed by the Tor client, will
first select all routers that have an acceptable forwarding
policy (e.g., many routers are unwilling to serve as exit
routers) and then choose a random router out of the list, with
the selection weighted by the reported bandwidth. This
way, traffic is roughly balanced across Tor nodes in
proportion to the bandwidth they have available. To prevent
a router from reporting an unreasonably high bandwidth, an
upper bound (currently 10 MB/s)2 is enforced.
To defend against the predecessor attack [10], recent
versions have introduced guard nodes, first described by
Wright et al. [11]. Each client picks a set of three nodes that
will be used as entry routers for all of its tunnels. Guard
nodes are chosen among stable nodes, i.e., nodes with a
high uptime that have a bandwidth higher than the median
bandwidth reported by all Tor nodes.
Fundamentally, Tor forms an overlay network for
forwarding traffic, and thus needs to address the performance
issues associated with this task. It also has an extra
requirement of preserving anonymity, making this task that
much more difficult. We next examine two shortcomings of
the Tor load-balancing scheme that motivate our work.
2.2 Advertised Bandwidth
The bandwidth values used in the load-balancing algorithm
are self-reported by each node and are not verified in
any way. This leaves the door open to attacks where
malicious nodes can report a higher-than-actual bandwidth
so that a larger fraction of tunnels are routed through them.
Despite the enforced upper bound, the attack can be quite
successful: Bauer et al. [8] report that a small fraction of
attacker nodes can attain the first and last node positions
(thus violating anonymity) on nearly half the tunnels,
despite using the older (and more secure) cap of 1.5 MB/s.
Even when nodes are honest, the reported values can be
a poor predictor of the available bandwidth at a node due to
changing network conditions and other factors. This makes
Tor performance highly variable. Recent studies of Tor (see
Fig. 1) show that, although the Tor network provides
reasonable bandwidth on most connections, the performance
curve has a long tail. In particular, while the median
bandwidth is 19 KB/s, the 90th percentile bandwidth is less
than a third of that, at 6 KB/s, and there is a significant
fraction of tunnels that perform worse still. This presents a
poor user experience, especially to users who are browsing
the web (the majority of connections in Tor [12]), with
connections frequently slowing down. Furthermore, comparing
these results with those from 2007 in Fig. 10 shows
that the situation is, if anything, getting worse over time
despite the influx of new routers.
2.3 User Heterogeneity
The Tor load-balancing algorithm provides a single, static
compromise between performance and anonymity. Users
who are highly anonymity sensitive (e.g., whistle blowers)
might wish to distribute all of the tunnels uniformly across
all routers, to prevent (purportedly) high-bandwidth
routers from having a higher chance of compromising their
traffic. Users who are less privacy-sensitive and are using
the network for casual web browsing (e.g., users who want
to hide their browsing activities from their neighbors) might
value performance more and would be more willing to use
high-bandwidth routers more often. By using the same path
selection algorithm for both of these, the Tor router
selection algorithm sacrifices the needs of both classes.3
SNADER AND BORISOV: IMPROVING SECURITY AND PERFORMANCE IN THE TOR NETWORK THROUGH TUNABLE PATH SELECTION 729
Fig. 1. Cumulative distribution function of the time required to transfer a
1 MB file over the Tor network in July 2009 (18,422 trials).
1. There are some small benefits to using 3 rather than 2, a full discussion
of which is beyond the scope of this paper.
2. On 30 August 2007, the Tor project released a version of Tor that
changed this upper bound to 10 MB/s from its previous value of 1.5 MB/s,
increasing network utilization at the cost of increased vulnerability to lowresource
routing attacks.
3. In fact, a recent discussion on the or-dev mailing list about raising the
upper bound of reported bandwidth hit precisely the stumbling block of not
being sure which of these two groups to serve [13]; the determination was
eventually made to significantly raise the limit.
3 PROPOSED IMPROVEMENTS
To address these issues, the fundamental questions of an
overlay network must be readdressed: first, how is the
performance of a router measured; and second, given a list
of measured routers, how is the route selected. In this work,
our performance metric is the bandwidth available to a Tor
tunnel rather than other performance characteristics such as
latency or jitter. Our reason for focusing on bandwidth is
threefold. First, bandwidth is already a key factor in Tor
design. Second, bandwidth is typically a property of a node
rather than a link between two nodes, since the bottleneck is
likely to be close to the node rather than in the intermediate
network [14], [15]. This makes measurements and optimizations
much more feasible than for link properties, since for
N nodes there are OðN2Þ links. Additionally, a scheme that
optimizes latency is bound to leak at least some information
about the starting point of a path, whereas it is possible to
optimize bandwidth without such information leaks.
Finally, the overwhelming majority of Tor traffic, by both
data volume and number of connections, is from web and
peer-to-peer traffic [12]—applications that are relatively
insensitive to jitter, and where latency can be treated simply
as a part of the total transfer time; when low bandwidth
makes this transfer time large, latency effects are negligible.
Finally, most latency in Tor comes from poor congestion
control handling; observed end-to-end latencies significantly
exceed the total network latency. Other work [16]
attempts to address this problem.
3.1 Router Measurement
A simple way to measure the available bandwidth at a router
is to perform a probe. Though crude, this mechanism is likely
to present a reasonably accurate picture of the performance of
a node; probing to determine node availability and therefore
reliability is done for high-latency anonymous-communication
networks by Echolot [17]. Fig. 2a shows the correlation
between probed router bandwidth and subsequently
achieved tunnel bandwidth in the real Tor network when
the probed router is the bottleneck router for that tunnel. The
probe results are a good predictor of tunnel performance;
however, probes themselves use up valuable bandwidth,
which is a scarce resource in the Tor network. In particular,
probes need to appear identical to real traffic, lest a node give
priority to probe traffic to enhance its performance, and thus
need to generate significant data transfers. Furthermore,
since it is not realistic for all nodes to probe all other nodes,
this task must be delegated to a small collection of probing
agents, which can act as a point of failure or compromise. For
these reasons, we consider bandwidth estimation via active
probing to be impractical.
We propose instead that opportunistic monitoring be
used to measure bandwidth capacity; that is, each router in
the Tor network keeps track of the bandwidth it has recently
seen for each of its peers. In practice, Tor routers communicate
with a large set of routers over a short period of
time—we observed up to 800 routers contacted within a
single day—and thus can accumulate a large set of statistics.
These statistics can then be aggregated by each router to a
single observation per peer and then uploaded to the
directory server (as the self-reported bandwidth is currently).
The directory server can in turn aggregate these
N2 observations into N router evaluations. The process for
these aggregations is discussed below in Sections 3.1.1 and
3.1.2. The final evaluations are then distributed by the
directory servers as the current self-reported evaluations are.
Fig. 2b shows the accuracy of bandwidth prediction for a
single router in the real Tor network; comparing this with
Fig. 2c, we can see that observation by a single client
approaches the accuracy of the advertised bandwidth figures
730 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 8, NO. 5, SEPTEMBER/OCTOBER 2011
Fig. 2. Accuracy of active probing, passive observation, and selfreporting
for bandwidth estimation in the real Tor network, as measured
by the log-log correlation ®. In all cases, the router under test is the
middle router, and the same entry and exit routers are used to minimize
confounding effects. (a) Accuracy of probing for bandwidth prediction in
the real Tor network ðr ¼ 0:629Þ. (b) Accuracy of passive observation for
bandwidth prediction in the real Tor network ðr ¼ 0:481Þ. © Accuracy of
advertised bandwidths for bandwidth prediction in the real Tor network
ðr ¼ 0:569Þ.
employed in the current Tor network; by aggregating
statistics across multiple Tor routers, a client can obtain an
even more accurate picture of the network, and at the same
time be less susceptible to attacks.
This approach has the additional advantage that, because
each peer considers the individual bandwidth it receives, an
overloaded router will produce lower measurements;
whereas the aggregate bandwidth observed by a router
will stay the same or even increase in the case of overload.
Using frequent updates of opportunistic measurements can
thus help balance the load by using overloaded routers less
frequently. In the extreme case, such dynamic load
balancing can cause route oscillations [18], but an update
frequency of about once a minute provides sufficient
damping that we do not anticipate that this will be a
problem; Figs. 3 and 4 support this claim.
We must also consider how to aggregate observations,
both multiple observations of a router by a single node, and
multiple observations of a router by multiple nodes. The
naı¨ve approach is for each node to use the maximum of its
own observations, but this approach has two flaws. First,
each router using its own view of the network creates the
possibility of partition attacks (described in [19]), where an
attacker focuses all of its bandwidth on nodes of interest.
Thus, these nodes, and only these nodes, are more likely to
select the attacking nodes when creating tunnels. Additionally,
aggregating observations via their maximum allows
“spotlight attacks” [20], where an attacker focuses all of its
bandwidth of one node at a time for a single measurement
interval, ignoring all other nodes. Assuming the maximum
age of measurements is long enough, the attacker can thus
convince the entire network that its bandwidth is many
times the actual value.
In the following sections, we discuss observation
aggregation and propose two novel methods of evaluating
node bandwidth before going on to evaluate their predictive
power.
3.1.1 Bandwidth Observation Aggregation
The first question to be addressed is that of aggregating
multiple observations of a single router by the same node. It
seems clear that we would like our aggregated value to be
as close to the actual bandwidth of that peer as possible;
taking the maximum observed value over a long interval
gives a high probability that there will be a measurement
when the measuring node is sole consumer of the router’s
bandwidth and therefore a fairly accurate value. However,
even for much shorter observation lifetimes such as the
current Tor value of one day, this approach is vulnerable to
spotlight attacks as described above.
Another possibility is using a moving average of recent
observations such as an exponentially weighted moving
average (EWMA):
Bnew ¼ ð1  ÞBold þ Bobs:
This way, if an attacker ignores a node for a sufficient period
of time, that node’s estimation of the attacker will drop and it
will be less likely to select that node. However, this approach
suffers from lessened accuracy in the face of bandwidth
fluctuations: a congested router that offers lower bandwidth
to its peers for a period of time will either have its reputation
fluctuate rapidly (for small values of ) or drop slowly and
recover slowly (for higher values of ). Neither of these
scenarios is good for the load balancing of the network.
We propose instead a variant on an EWMA that we call a
Min-Max Weighted Moving Average, or MWMA:
Bnew ¼ ð1  Þ maxðBold;BobsÞ þ  minðBold;BobsÞ:
SNADER AND BORISOV: IMPROVING SECURITY AND PERFORMANCE IN THE TOR NETWORK THROUGH TUNABLE PATH SELECTION 731
Fig. 3. The accuracy of various prediction mechanisms for a sample of
the trials. The x ¼ y line is included for reference. (a) Actual node
bandwidth versus achieved bandwidth (r ¼ 0:223). (b) Current Tor
bandwidth measurement versus achieved bandwidth (r ¼ 0:176).
© Median-of-five bandwidth measurement versus achieved bandwidth
(r ¼ 0:680). (d) EigenSpeed bandwidth measurement versus achieved
bandwidth (r ¼ 0:881).
This allows the bandwidth estimation to increase rapidly,
but still decay slowly if a router is providing poor service,4
combining the benefits of maximum-based aggregation
with those of EWMA-based aggregation.
3.1.2 Combining Measurements Across Hosts
As discussed previously, having each node rely solely on its
own observations to make router selection decisions leads to
the possibility of an attacker manipulating that node’s view
of the network and therefore their tunnel selection. In order
to prevent this, nodes can share their observations of their
peers. In this section, we present two methods of doing this.
1. For the first, we exploit the fact that, while an
attacker might be able to manipulate a node’s view
of the network, it must know which node to target in
order to do so. To prevent this foreknowledge, nodes
can ask a random peer for its view of the network.
Thus, an attacker might be able to affect the network
view of a node, but it will only affect the router
selection of a different, random node. By periodically
changing its source for network data, a node can be
reasonably certain that an attacker is not intentionally
altering its view of the network. However, there
remains the possibility that an attacker is blindly
presenting bogus data to anybody who asks or to
certain targets when they ask. More benignly, there
is the possibility that the router queried has recently
joined the network and has not yet had time to
gather observations for many of its peers. Happily,
the solution to both of these problems is the same:
instead of querying a single peer, a node queries
five5 and takes the median of the reported values.
This substantially reduces the probability of making
decisions on poor or malicious data. However, since
nodes are making routing decisions based on
different data, partition attacks remain theoretically
possible. This scheme also has the advantage of
being conceptually simple and simple to implement.
This is the method proposed in an earlier version of
this work [21].
2. A second approach is to use principal component
analysis to compute a consensus vector from the
N  N observation matrix. In our other work, we
have designed a secure bandwidth evaluation
system that accomplishes this task, called Eigen-
Speed [22]; we will present a summary of its
operation here. In EigenSpeed, a node updates its
own observation vector by incorporating the observations
of other nodes, weighted by their observed
bandwidth. The intuition behind this is that
nodes that have higher bandwidth capacity can
estimate other nodes’ capacities more accurately
since they are unlikely to be a bottleneck; furthermore,
weighting the observations from nodes that
have demonstrated a high bandwidth more highly
forces attackers to spend resources to attack the
system. This update process is iterated until the
process converges (to the principal eigenvector of
the observation matrix). We previously showed that
EigenSpeed requires a small number of iterations to
converge, and incorporated defenses from potential
attacks on the PCA algorithm by malicious nodes;
see [22] for more details.
In the next sections, we examine how these bandwidth
evaluation metrics perform and how they affect the
performance of the network as a whole.