08-09-2016, 09:21 AM
1453684486-144.pdf (Size: 433.46 KB / Downloads: 5)
Abstract. Universal hash functions are commonly used primitives for fast and secure message
authentication in the form of Message Authentication Codes (MACs) or Authenticated Encryption
with Associated Data (AEAD) schemes. These schemes are widely used and standardised, the most
well known being McGrew and Viega’s Galois/Counter Mode (GCM). In this paper we identify some
properties of hash functions based on polynomial evaluation that arise from the underlying algebraic
structure. As a result we are able to describe a general forgery attack, of which Saarinen’s cycling
attack from FSE 2012 is a special case. Our attack removes the requirement for long messages and
applies regardless of the field in which the hash function is evaluated. Furthermore we provide a
common description of all published attacks against GCM, by showing that the existing attacks
are the result of these algebraic properties of the polynomial-based hash function. We also greatly
expand the number of known weak GCM keys and show that almost every subset of the keyspace is
a weak key class. Finally, we demonstrate that these algebraic properties and corresponding attacks
are highly relevant to GCM/2+, a variant of GCM designed to increase the efficiency in software.
Introduction
The study of information-theoretic message authentication codes and universal hashing was initiated
by Gilbert et al. [18] and Carter and Wegman [12,13,44,45]. Universal hash functions
can be used to construct message authentication codes in both the information-theoretically
secure and computationally secure settings (see [10,45]). Simmons [39] provides a general summary
of the theory of unconditionally secure message authentication. Bernstein [3,5] provides
a thorough description of the genealogy and more recent literature of unconditionally secure
message authentication, including a description of the contributions of Bierbrauer et al. [6], den
Boer [14], and Taylor [42] to polynomial-based hashing. Bernstein [4] also gives an interesting
overview of the security of universal hash function based MACs in the computationally secure
setting. Shoup [38] describes several methods for realising universal hash function families that
are related to polynomials including the evaluation hash [6,14,42] which is a variant of the division
hash or cryptographic CRC of Krawczyk [25] (itself a variant of Rabin’s fingerprinting
codes [33]).
In this paper, we focus on message authentication codes constructed from universal hash
functions that are realised by polynomial evaluation. These are widely used and standardised;
for examples see [5,15,21,24,26,37]. McGrew and Viega’s Galois/Counter Mode (GCM) [30] is the
most widely deployed polynomial-based scheme. The algorithm is generally assumed to be secure,
with a small number of papers containing attacks against the authentication component via the
universal hash function: Ferguson’s attack against truncated GCM tags [17], demonstrating
that the security of short tags is significantly lower than would be expected; Joux’s ‘forbidden
attack’ [23], illustrating the brittleness of GCM under nonce reuse; Handschuh and Preneel’s
extension to Joux’s attack [20]; and Saarinen’s cycling attacks [36], which highlight a weakness
due to the underlying algebraic structure of a hash function based on polynomial evaluation.
Both Handschuh and Preneel [20] and Saarinen [36] have described classes of weak keys for
polynomial evaluation based universal hash functions, with Saarinen particularly focusing on
GCM.
Contributions. A motivation of this work was the observation that all existing attacks against
GCM are algebraic in nature, and in fact seem to exploit a fundamental underlying algebraic
structure of the polynomial-based hash function. The contributions of this paper are to identify
and study some of the properties of hash functions based on polynomial evaluation that are the
result of this underlying algebraic structure. As a result, we are able to describe a general forgery
attack, of which Saarinen’s cycling attack is a special case; our attack can, however, be used
with short messages, applies regardless of the field in which the hash is evaluated, and facilitates
length extension attacks against GCM. Furthermore, we provide a common description of all
published attacks against GCM by showing that the existing attacks are the result of these
algebraic properties of the polynomial-based hash function. We also greatly expand the number
of known weak GCM keys, and show that almost every subset of the keyspace is a weak key
class. Finally, we demonstrate that these algebraic properties and corresponding attacks are
highly relevant to GCM/2+, a variant of GCM designed to increase the efficiency in software.
We note that the attacks presented in this paper do not in any way contradict the security
bounds for GCM given by McGrew and Viega [31]. However, the algebraic properties (and
related attacks) discussed in this paper appear to be an inherent feature of polynomial-based
authentication schemes and therefore should be considered in the security assessment of new
schemes and extensions of existing ones. Additionally, we consider the consequences of a related
property on another polynomial based scheme in Section 9.
A preliminary version of this paper was presented at Fast Software Encryption 2013. This
paper includes additional explanation of our results and further details on their relationship to
the previously known results; Section 9 has been added and we extend our results on weak keys
to include all two element subsets of the keyspace in a specific case.
Structure. This paper is structured as follows. In Section 2 we introduce the notation that
will be used throughout this paper and provide a brief description of the syntax and security
of message authentication codes. In Section 3 we give a basic overview of four schemes that use
hash functions based on polynomial evaluation for message authentication, including GCM and
SGCM. In Section 4, we briefly describe the existing results on the security of polynomial-based
MACs. In Section 5 we describe the main technique used in this paper for the cryptanalysis of
polynomial-based authentication schemes and discuss some features of the resulting attack that
make it more interesting than cycling attacks. Section 6 explains the relationship between the
properties described in this paper and the known results on the security of polynomial-based
MACs. In Section 7 we show that there are many more weak key classes for hash functions based
on polynomial evaluation than have previously been described and suggest a method to realise
a key recovery attack against polynomial-based hash function schemes. In Section 8 we apply
the attacks described in this paper to GCM/2+. In 9 we identify similar techniques that can
be applied to Square Hash, another universal hash function family based on a different form of
polynomial evaluation. Section 10 contains a discussion of the consequences of this paper.
2 Preliminaries
2.1 Notation
We consider a message M parsed as M1
| . . . |Mm, where each Mi
is n bits long and | represents
concatenation of strings. In the syntax of authenticated encryption with associated data [34],
this message consists of associated data A ∈ A that is authenticated but not encrypted and
plaintext P ∈ P that will be encrypted and authenticated.
A family of hash functions will be denoted H = {hH : {0, 1}
? → {0, 1}
n
| H ∈ KH} with each
hash function hH indexed by a key H ∈ KH. A block cipher E is a family of permutations on
{0, 1}
n
, with each permutation indexed by a key k ∈ KE. The application of a block cipher to
input x ∈ {0, 1}
n using key k will be denoted by Ek(x). A nonce will be denoted by N.
A finite field will be denoted by K unless the order of the field has particular relevance, in
which case it will be denoted by Fp
r with |Fp
r | = p
r
. The multiplicative group of a field K will
be denoted by K?
.
2.2 Universal hash functions
A family of hash functions is said to be –almost XOR universal if for every M, M0 ∈ {0, 1}
?
with M 6= M0 and for every c ∈ {0, 1}
n
, PrH∈KH [hH(M) ⊕ hH(M0
) = c] < . Throughout this
paper –almost XOR universal will be abbreviated to –AXU. This condition was introduced
by Krawczyk [25] under the name –OTP–Secure as it is a necessary and sufficient condition
for unconditional MAC security when the output of the hash function is encrypted with the
one time pad in a field of characteristic 2. In this paper we will generally refer to –AXU hash
function families; however any remark made that requires an –AXU hash function family in
characteristic 2 will also hold for an –almost strongly universal [41] or –almost ∆ universal
[40] hash function family in any finite field.
A polynomial based hash function family is a common way to realise an –AXU hash function
family. Shoup [38] describes several examples of this type of construction; the main example of
interest to this paper is the evaluation hash. In the case of the evaluation hash the message M
determines a polynomial gM =
Pm
i=1 Mix
i ∈ K[x], where M = M1
| . . . |Mm with each Mi ∈ K.
The hash key is an element H ∈ K and we define the hash function by hH(M) = gM(H).
There are several methods for turning a universal hash function into a message authentication
code (see [10,45] for early examples). The two most common methods are Ek(N) + hH(M) and
Ek(hH(M)).
2.3 Syntax
We will follow Black et al. [7] for a description of the syntax of nonce-based message authentication
schemes. A message authentication scheme is a pair of algorithms, Gen and MAC, with
four associated sets: K, the set of possible keys; M, the message space; N , the set of nonces and
T , the set of possible authentication tags.
The key generation algorithm Gen takes as input the security parameter and probabilistically
outputs the shared key k ∈ K. The algorithm MAC takes as input a key k ∈ K, a nonce N ∈ N ,
and a message M ∈ M and outputs a tag T ∈ T . The authenticity of a tuple (N, M, T) is verified
by computing MAC(k, N, M): if T = MAC(k, N, M) then the tag is valid, otherwise it is invalid.
2.4 Security
An adversary attacking a message authentication scheme is given access to two oracles: a tag
generation oracle S and a verification oracle V. At the beginning of the experiment Gen is run to
obtain k, then S takes queries (N, M) and returns MAC(k, N, M). The verification oracle takes
queries (N, M, T) and returns 1 if T = MAC(k, N, M) or 0 otherwise. An adversary is said to
successfully forge an authentication tag if they can produce a verification query (N, M, T) so
that V returns 1 when (N, M) was not previously queried to S.
A common restriction of this security notion is to nonce-respecting adversaries where, although
the adversary can control the nonce, they never query S for (N, M0
) if they have previously
queried S for (N, M).
For polynomial-based MACs, McGrew and Viega [31], Ferguson [17], and Handschuh and
Preneel [20] all assert that the probability of creating a valid (non-truncated) tag having seen a
single valid (message, tag) pair is approximately m/|K| where the polynomial is evaluated in K
and m is the length of message that the construction operates on. It is worth emphasising that
in this context, m is the maximum permissible message length. This is included in the original
paper [31] but is not made explicitly clear in the later papers [17,20]. In this paper we will
demonstrate the importance of this distinction via a method of forging GCM tags using a longer
message than the one that was given in the valid (message, tag) pair from the tag generation
oracle.
Throughout this paper we will focus on GCM for concreteness however the majority of the
comments apply equally to any other hash function based on polynomial evaluation. Most of the
results in this paper apply equally to both common constructions of MACs from universal hash
functions, either T = Ek(N) + hH(M) or T = Ek(hH(M)), as our results are based on collisions
in the hash function. Where necessary it will be made clear that a remark is dependent on one
of these general constructions or the specific structure of GCM.
2.5 Weak Keys
For any cryptographic algorithm, a relevant question for its security assessment is whether it
contains weak keys. Handschuh and Preneel [20, Sect. 3.1] give the following definition of weak
keys:
In symmetric cryptology, a class of keys [D] is called a weak key class if for the members of
that class the algorithm behaves in an unexpected way and if it is easy to detect whether
a particular unknown key belongs to this class. For a MAC algorithm, the unexpected
behavior can be that the forgery probability for this key is substantially larger than
average. Moreover, if a weak key class [D] is of size C, one requires that identifying that
a key belongs to this class requires testing fewer than C keys by exhaustive search and
fewer than C verification queries.
Handschuh and Preneel [20] and Saarinen [36] have identified weak key classes for GCM and
other polynomial-based hashes; we discuss and extend these classes in Section 7.
3 Polynomial-based Authentication Schemes
We present below a brief description of some of the main authentication schemes based on
polynomial evaluation hash functions that are of relevance to our work. We will also discuss
Square Hash in Section 9; Square Hash is another family of universal hash functions also based
on polynomials.
3.1 Galois/Counter Mode
Galois/Counter Mode (GCM) is an AEAD scheme submitted to NIST by McGrew and Viega
in 2004, with the specification slightly revised in 2005 [30] (although the revision contained ‘no
normative changes [from the 2004 specification]’). GCM combines counter mode encryption with
a polynomial evaluation based MAC following the Encrypt–then–MAC paradigm, although the
authentication key is derived from the block cipher key.