09-08-2012, 04:18 PM
SPAF: Stateless FSA-based Packet Filters
SPAF Stateless FSA-based Packet Filters.pdf (Size: 1.04 MB / Downloads: 49)
Abstract
We propose a stateless packet filtering technique
based on Finite-State Automata (FSA). FSAs provide a comprehensive
framework with well-defined composition operations that
enable the generation of stateless filters from high-level specifications
and their compilation into efficient executable code without
resorting to various opportunistic optimization algorithms. In
contrast with most traditional approaches, memory safety and
termination can be enforced with minimal run-time overhead
even in cyclic filters, thus enabling full parsing of complex
protocols and supporting recursive encapsulation relationships.
Experimental evidence shows that this approach is viable and
improves the state of the art in terms of filter flexibility, performance
and scalability without incurring in the most common
FSA deficiencies, such as state space explosion.
I. INTRODUCTION
PACKET filters are a class of packet manipulation programs
used to classify network traffic in accordance to
a set of user-provided rules; they are a basic component
of many networking applications such as shapers, sniffers,
demultiplexers, firewalls and more.
The modern networking scenario imposes many requirements
on packet filters, mainly in terms of processing speed (to
keep up with network line rates) and resource consumption (to
run in constrained environments). Filtering techniques should
also support modern protocol formats that often include cyclic
or repeated structures (e.g. MPLS label stacks, IPv6 extension
headers). Finally it is also crucial that filters preserve the
integrity of their execution environment, both in terms of
memory access safety and termination enforcement, especially
when running as an operating system module or on the
bare hardware. Although at first sight this aspect might not
seem crucial, it is a fact that many of the limitations built
into existing packet filters derive directly from safety issues:
as an example, the impossibility of automatically proving
termination for a generic computer program led the BPF [1]
designers to generate acyclic filters only, thus preventing the
parsing of packets with multiple levels of encapsulation or
repeated field sequences.
Existing packet filters focus invariably on subsets of these
issues but, to the best of our knowledge, do not solve all of
them at the same time: as an example, two widely known
generators, BPF+ [2] and PathFinder [3], do not support
recursive encapsulation; NetVM-based filters [4], on the other
hand, have no provision for enforcing termination, either in
filtering code or in the underlying virtual machine.
This paper presents SPAF (Stateless PAcket Filter), a FSAbased
technique to generate fast and safe packet filters that
The authors are with Politecnico di Torino, Dipartimento di Automatica e
Informatica, e-mail: fpierluigi.rolando, riccardo.sisto, fulvio.rissog[at]polito.it
This work has been partially supported by The Cisco University Research
Program Fund, a corporate advised fund of Silicon Valley Community
Foundation.
are also flexible enough to fully support most layer 2 to
layer 4 protocols, including optional and variable headers and
recursive encapsulation. The proposed technique specifically
targets the lower layers of the protocol stack and does not
directly apply for deep packet inspection nor for stateful
filtering in general. Moreover, for the purpose of this paper
we consider only static situations where on-the-fly rule set
updates are not required. While these limitations exclude some
interesting use cases, SPAF filters are nevertheless useful for a
large class of applications, such as monitoring and traffic trace
filtering, and can serve as the initial stage for more complex
tools such as intrusion detection systems and firewalls.
A stateless packet filter can be expressed as a set of
predicates on packet fields, joined by boolean operators; often
these predicates are not completely independent from one
another and the evaluation of the whole set can be shortcircuited.
One of the most important questions in designing
generators for high-performance filters is therefore how to
efficiently organize the predicate set to reduce the amount of
processing required to come to a match/mismatch decision. By
considering packet filtering as a regular language recognition
problem and exploiting the related mathematical framework
to express and organize predicates as finite-state automata,
SPAF achieves by construction a reduction of the amount
of redundancy along any execution path in the resulting
program: any packet field is examined at most once. This
property emerges from the model and it always holds even
in cases that are hard to treat with conventional techniques,
such as large-scale boolean composition. Moreover, thanks
to their simple and regular structure, finite automata also
double as an internal representation directly translatable into
an optimized executable form without requiring a full-blown
compiler. Finally, safety (both in terms of termination and
memory access integrity) can be enforced with very low runtime
overhead.
The rest of this paper is structured as follows: Section II
presents an overview of the main related filtering approaches
developed to this date. Section III provides a brief introduction
to the FSAs used for filter representation and describes the
filter construction procedure. Section IV focuses on executable
code generation and on enforcing the formal properties of
interest, while Section V presents the experimental evidence
collected to evaluate the new approach and to support our
claims. Finally, Section VI reports conclusions and also highlights
possible future developments.
II. RELATED WORKS
Given their wide adoption and relatively long history, there
is a large corpus of literature on packet filters. A first class
of filters is based on the CFG paradigm; the best-known and
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
2
most widely employed one is probably BPF [1], the Berkeley
Packet Filter. BPF filters are created from protocol descriptions
hardcoded in the generator and are translated into a bytecode
listing for a simple, ad-hoc virtual machine. The bytecode
was originally interpreted, leading to a considerable run-time
overhead impact which can be reduced by employing JIT
techniques [5]. BPF disallows backward jumps in filters in
order to ensure termination, thus forgoing support for e.g.
IPv6 extension headers; memory protection is enforced by
checking each access at run-time. Multiple filter statements can
be composed together by boolean operators but in the original
BPF implementation only a small number of optimizations are
performed over predicates, leading to run-time inefficiencies
when dependent or repeated predicates are evaluated. Two
relevant BPF extensions are BPF+ and xPF. BPF+ [2] adds
local and global data-flow optimization algorithms that try to
remove redundant operations by altering the CFG structure.
xPF [6] relaxes control flow restrictions by allowing backward
jumps in the filter CFG; termination is enforced by limiting
the maximum number of executed instructions through a runtime
watchdog built into the interpreter but its overhead was
not measured and extending this approach to just-in-time code
emission has not been proposed and might prove difficult.
A further CFG-based approach, unrelated to BPF, is described
in [4]. Its main contribution is decoupling the protocol
database from the filter generator by employing an XMLbased
protocol description language, NetPDL [7]. Filtering
code is executed on the NetVM [8], a special-purpose virtual
machine targeting network applications that also provides an
optimizing JIT compiler that works both on filter structure and
low-level code. The introduction of a high-level description
language reportedly does not cause any performance penalties;
this approach, however, delegates all safety considerations to
the VM and does not provide an effective way to compose
multiple filters.
In general CFG-based generators benefit from their flexible
structure that does not impose any significant restriction on
predicate evaluation order; for the same reason, however, they
are prone to the introduction of hard-to-detect redundancies,
leading to multiple unnecessary evaluations if no further
precautions are taken. Even when optimizers are employed
and are experimentally shown to be useful, they work on an
opportunistic basis and seldom provide any hard guarantees
on the resulting code.
A second group of filter generators chooses tree-like structures
to organize predicates. PathFinder [3] transforms predicates
into template masks (atoms), ordered into decision trees.
Atoms are then matched through a linear packet scan until
a result is reached. Decision trees enable an optimization
based on merging prefixes that are shared across multiple
filters. PathFinder is shown to work well both in software
and hardware implementations, but it does not take protocol
database decoupling into consideration and no solution to
memory safety issues is proposed for the software implementation.
FSA-based filters share a degree of similarity
with PathFinder as packets are also scanned linearly from
the beginning to the end but predicate organization, filter
composition and safety considerations are handled differently.
DPF [9] improves over PathFinder by generating machine
code just-in-time and adding low-level optimizations such as a
flexible switch emission strategy. Moreover, DPF is capable
of aggregating bounds checks at the atom level by checking
the availability of the highest memory offset to be read instead
of considering each memory access in isolation; our technique,
described in Section IV-E, acts similarly but considers the filter
as a whole, thus further reducing run-time overhead.
While organizing predicates into regular structures makes it
easier to spot redundancies and other sources of overhead, it
also introduces different limitations: as an example, generators
restricted to the aforementioned acyclic structures do not fully
support tunneling or repeated protocol portions. Moreover, it
has been noted that performing prefix coalescing is not sufficient
to catch certain common patterns, resulting in redundant
predicate evaluation [2].
A third approach is to consider packet filtering as a language
recognition problem. Jayaram et al. [10] use a pushdown
automaton to perform packet demultiplexing; filters are expressed
as LALR(1) grammars and can be therefore effectively
composed using the appropriate rules. This solution improves
filter scalability but there are downsides related to the pushdown
automaton: a number of specific optimizations are
required to achieve good performance. It is also quite unwieldy
to express protocols and filter rules as formal grammars that
must be kept strictly unambiguous: the authors marginally note
that the simpler FSA model would be sufficient for the same
task.
The adoption of finite state automata in packet processing is
not new, as finite automata are already used to perform deep
payload inspection especially in intrusion detection systems
(IDS) [11] [12] [13] [14] [15] [16]. While the general approach
is similar, the practical issues involved in deep payload
inspection can be rather different from those encountered when
employing FSAs for layer 2 to layer 4 filtering. As an example,
payload regular expressions tend to be free-form and often
present patterns that negatively affect the size of the automata
resulting from composition, causing a so-called state space
explosion [16] [17]. This issue is arguably the limiting factor to
FSA adoption for payload inspection and it certainly deserves
investigation; however, our scenario might be different enough
(protocol headers often have a much more rigid structure) so
that its impact is mitigated. Among many FSA-based packet
processors the nearest match for SPAF is probably Ruler [18],
a packet rewriting system designed for anonymizing traffic
traces that can also be used for packet filtering. It is based
around an NFA extension that supports rewriting and its
engine is specifically tailored to Intel IXP network processors.
Being automata-based, Ruler shares a degree of similarity
with SPAF but its design goals are sufficiently different to
produce noticeably different final results. Ruler’s design aims
at supporting large free form regular expression sets and
the modifications required for rewriting; less attention has
been paid to aspects that are necessary for achieving good
layer-2 to layer-4 filtering performance, thus trailing SPAF
in experimental comparisons (Section V). Safety issues are
also not taken into account conveniently (memory bounds
are checked at each access) and its source language is not
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
3
general enough to specify complex filter statements or certain
commonly encountered protocol structures, such as the IPv6
extension headers: while it is possible to resort to regular
expressions that represent the entire packet structure, this
formulation presents an avoidable layer of complexity for the
final user.
Apart from the specialized solutions for fast packet filtering
mentioned above, one of the most widely used packet
filtering programs is the NetFilter framework1. NetFilter is a
component of the Linux kernel that performs packet filtering,
firewalling, mangling operations (e.g. network address translation)
and more, acting through a set of hooks and callbacks
that intercept packets as they traverse the networking stack. In
contrast with all the aforementioned approaches, NetFilter uses
the relatively simple method of applying all the specified rules
in sequence when performing packet filtering, leading to poor
performance and scalability; moreover it appears not possible
to specify an arbitrary predicate, filters being limited to preset
protocols and statements that are specialized by specifying
actual network addresses and ports.
Besides the generation technique, there have also been
improvements along other dimensions such as architectural
considerations, as demonstrated by xPF, FFPF [19] and
nCap [20], or dynamic rule sets support, as shown by the
SWIFT tool [21]. We consider these aspects out of scope
for the purpose of this paper, being either orthogonal to the
technique we present or object of future works.
III. FILTER GENERATION TECHNIQUE
The purpose of a stateless packet filter generator is to create
a program that, given a finite-length byte sequence (a packet)
as its input, returns a binary match/mismatch decision. The
input of the generator itself consists of a set of filter rules
provided by the user that specify the desired properties of
matching packets; each rule, in turn, consists of multiple
predicates expressed in a simple high-level language (where
header fields and protocols appear symbolically), combined
together with boolean operators. In older generators the set of
supported protocols was fixed; in modern ones protocol header
formats are kept into an external database that can be updated
without modifying the generator.
In order to develop a successful FSA-based filtering technique
it is first of all necessary to show that any filter of
interest can be expressed as a finite automaton, then provide a
method to transform a high-level filter statement and a protocol
database into FSA form; finally, the resulting automaton must
be translated into an efficiently executable form.
A. Finite-State Automata
A Finite-State Automaton (FSA) is a quintuple
(A; S; s0; t; F), where A is an alphabet of input symbols, S
is the set of states, s0 2 A an initial state, t S AS the
transition relation and F S the set of accepting states.
FSAs can be used to formally define regular sets of arbitrarily
long symbol strings; given a finite automaton and a
1NetFilter is available at http://www.netfilter
string, it is easy to decide whether the string belongs to the set
represented by the FSA or not. The parallelism with stateless
packet filters is immediate: each packet can be regarded as a
string of bytes and a filter statement defines a set of packets
that must be recognized. In order to adapt the FSA model
to our purposes, it is sufficient to define A as the set of all
the possible 8-bit strings plus the symbol , used to label
transitions that can be taken without consuming any input.
The only remaining requirement is to show that any interesting
set of packets is regular; this is immediately proved
by noting that any finite set is regular and that there are
only finitely many packets because they are limited in length
by technological considerations2. It is therefore theoretically
possible, for any conceivable stateless packet filter, to build
a corresponding FSA. Even recursive structures can be supported:
while in the general case a more powerful formalism
(such as a push-down automaton) is required, restricting the
scope of application to finite sets makes regular automata
sufficient.
B. Protocol database compilation
The first phase in the SPAF generation process consists of
parsing the protocol database and building template automata
that recognize all the correctly-formatted headers for a given
protocol. These automata will be reused and specialized in
later phases to create the final filter.
In order to decouple filter generation from the protocol
database, we have employed an XML-based protocol description
language (NetPDL [7]) designed to describe the on-thewire
structures of network protocols and their encapsulation
relationships. NetPDL descriptions are stored in external files
that can be freely edited without modifying the generator itself.
A precise description of NetPDL is beyond the scope of this
paper; nevertheless we shall provide a quick overview of the
features supported by the FSA generator. The language provides
a large number of primitives that enable the description
of header formats of layer 2 to 7 protocols, but for the scope of
this work we have restricted our support to those designed for
layer 2 to 4 decoding. The basic building block of a protocol
format is the header field, a sequence of bytes or bits that
can be either fixed or variable in size. Adjacent fields are
by default laid out in sequence but more complex structures
such as optional or repeated sections can be created using
conditional choices and loops; these statements are controlled
by expressions that can contain references to the values of
previously-encountered fields.
A second NetPDL portion contains a sequence of control
flow operations (if, switch) that predicate encapsulation
relationships: in general, the control flow is followed until a
nextproto tag is encountered, specifying which is the next
protocol to be found in the packet. A NetPDL database thus
describes an oriented encapsulation graph where the vertexes
are protocols and the edges are encapsulation relationships.
2This is not true in general because FSAs can be used to filter unlimitedlength
symbol strings as those encountered e.g. in payload inspection with
transport session reconstruction. Given the scope of this paper (layer 2 to 4
packet filtering) this is not a concern.
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
4
Fig. 1. IPv6 NetPDL excerpt
Currently the graph begins with a single, user-specified root
that usually represents the link layer protocol but an extension
to multiple ones would be trivial. Starting from this root, the
FSA generator follows the encapsulation graph and builds a
FSA for every reachable protocol using the method explained
later in this section.
As an example, a simplified NetPDL description of the
IPv6 header format is presented in Fig. 1. IPv6 starts with
a sequence of fixed-size fields; bitfields (such as ver) are
specified by the mask attribute. The initial portion is followed
by a set of extension headers, each one containing a “next
header” information (nexthdr). This sequence is of unspecified
(but implicitly finite, as any packet is finite) length and it
is described using a switch nested within a loop: at each
iteration the newly-read nexthdr field is evaluated and, if
no more extension headers are present, the loop terminates.
Encapsulation relationships are also specified in a similar
fashion, by jumping to the correct protocol depending on the
value of the last nexthdr encountered.
SPAF currently supports the full versions of the most common
layer 2-4 protocols in use nowadays, such as Ethernet,
MPLS, VLAN, PPPoE, ARP, IPv4, IPv6, TCP, UDP and
ICMP; this set can be easily extended as long as no stateful
capabilities are required.
An important point regarding FSA creation from NetPDL
descriptions is that, as long as it is correctly performed, it
is not a critical task for filter performance: any resulting
automaton ultimately will be determinized and minimised,
yielding a canonical representation of the filter that does not
depend on the generation procedure. For this reason, and
given the complexity involved, the NetPDL to FSA conversion
procedure is not fully described in this paper and it can be
regarded as an implementation detail. Nevertheless, in order
to exemplify how the conversion can be done, we report the
key steps for translating the NetPDL snippets of Fig. 2 into
the corresponding automata.
The purpose of this initial conversion step is not to generate
automata immediately suitable for filtering; on the contrary,
the results are templates for the following generation steps,
representing the “vanilla” version of protocol headers, with no
other conditions imposed, to be specialized according to the
filter rules. Since they are strictly related to header format, any
input-consuming transition in these templates can be related to
a specific portion of one3 header field; this information must
be preserved to accommodate the imposition of filtering rules.
For this reason template automata are augmented by marking
all the relevant transitions with the related field’s name4.
The simplest example is generating an automaton that parses
a fixed-length header field (Fig. 2a): it is sufficient to build
a FSA that skips an appropriate amount of bytes, resulting
in Fig. 2b. During the construction process header fields
are given well-defined start and end5 states that are used as
stitching points to join with any predecessors or successors by
-transitions, as required.
A more complex example involving a conditional choice
is shown in Fig. 2c. The generation procedure starts by
creating automata representations for all the initial fields
in the NetPDL description; upon encountering the switch
construct, however, the generator backtracks the transition
graph until it encounters the type field. Once found, all the
states/transitions that follow type (the A block in the figure)
are replicated. The original copy is left as-is while in the
replica the transitions for type are specialized to recognize
the bytes of interest for the switch, so the right path will be
taken depending on the actual input values. Finally the correct
trailing block (B or C) is joined in the right place.
The last example (Fig. 2e and Fig. 2f) shows the automata
generated for a header structure similar to the IPv6 extension
headers case. In this case a loop is interlocked with a switch
construct and a greater amount of block replication is required
to ensure that independent paths exist into the automaton for
every possible combination of the current nexth value (upon
which the outcome of the switch depends) and the next
nexth value, that might cause the loop to end.
Encapsulation relationships are handled in a similar fashion,
by spawning new paths in the automaton graph that end
with a special state marked with the protocol that should
follow. The exact usage of these marked states is explained
in Section III-D.
The generation procedure acts to counter the absence of
explicit storage locations in the FSA model; when it becomes
necessary to use the values of previously encountered fields
for subsequent computations the only solution is to spawn a
number of parallel branches within the automaton, each one
associated with a specific value of the field under consideration.