[cryptography] Implementing constant-time string comparison

D. J. Bernstein djb at cr.yp.to
Wed Jun 18 17:18:17 EDT 2014


John Gilmore writes, on a semi-moderated mailing list:
> A bugfree C compiler

Bwahahaha. That's funny.

A large part of the game here is to envision the screwups that people
will make and build systems that survive those screwups. For example,
it's common to have C code such as

   x ? MACRO_A : MACRO_B

where the two macros happen, in this compilation, to be the same:

   x ? "hello" : "hello"

So it's perfectly reasonable to see a C compiler producing just

   "hello"

by folding the code across the branch---the compiler writer saw this
pattern often enough and realized that it wouldn't be hard to optimize.
Of course, this optimization is valid only if the two branch results are
identical compile-time constants (after constant propagation etc.), so
the compiler writer does a compile-time comparison of those constants:
== for int, strcmp() for strings, etc.

What I've just described isn't exactly right---it's a compiler bug, a
bug that really occurred in gcc some years ago. Did you notice the bug
as I was describing it? Would you have produced test cases that catch
the bug?

> would be unable to shortcut the loop if the
> arguments were merely declared as pointers to volatile storage

The compiler would be required to access the storage but would still be
allowed to skip the intermediate calculations. For example, it could
convert

   int result = 0;
   int iszero;
   for (i = 0;i < n;++i) result |= (x[i] ^ y[i]);
   iszero = (result == 0);
   return iszero - 1;

into

   int iszero = 1;
   for (i = 0;i < n;++i) if (x[i] ^ y[i]) iszero = 0;
   return iszero - 1;

or into

   int iszero = 1;
   for (i = 0;i < n;++i) if (x[i] != y[i]) iszero = 0;
   return iszero - 1;

or into

   for (i = 0;i < n;++i) if (x[i] != y[i]) goto shortcut;
   return 0;
   shortcut: while (++i < n) { x[i]; y[i]; }
   return -1;

without violating the C language specification. You're deluding yourself
if you think that the guarantees made by the C specification are
adequate for writing constant-time code.

What's the chance of a compiler screwing things up in this way? This
isn't a question of language lawyering; it's a question of what the
compiler writer is thinking. Has the compiler writer seen examples where
it might seem useful to replace

   result = 0; result |= ...; result |= ...; result == 0

with

   iszero = 1; if (...) iszero = 0; if (...) iszero = 0; iszero

which would then hook nicely into early exits? Sure, the early exits
should check for volatile memory accesses in the skipped calculations,
but this doesn't mean that the replacement has to check for volatile.

NaCl has no data flow from secret data to array indices, and no data
flow from secret data to branch conditions---including "==" inputs. In
particular, to screw up the crypto_verify() timing, the compiler writer
would have to do two things wrong: not just the type of replacement
described above, but also converting NaCl's arithmetic to branching in
the first place. This isn't a common type of arithmetic, and so far we
haven't heard any reports of compilers screwing it up.

Of course, to properly assign blame for screwups, and maybe to prevent
the screwups from happening in the first place, it would be useful to
have the code written in a language whose semantics _do_ include timing.
To some extent NaCl is written in asm, and to some extent asm semantics
do include timing, although the situation could be improved:

   http://blog.cr.yp.to/20140517-insns.html

---Dan


More information about the cryptography mailing list