[cryptography] "folded" SHA1 vs HMAC for entropy extraction

Thor Lancelot Simon tls at panix.com
Wed Jan 4 12:47:36 EST 2012


I'm working on the entropy-pool code in NetBSD, which began its life
many years ago as a simplified implementation of the same ideas behind
the Linux /dev/random implementation.

The NetBSD implementation now keys a stream generator from the pool
rather than directly outputting pool bits, but the underlying entropy
gathering and mixing model is basically the same -- a very large LFSR
using SHA1 to "distill" entropy on output.

Eventually I will replace it with a multi-pool implementation like
Fortuna.  However, I'm trying to make incremental improvements while
waiting for that mythical great extent of free time to appear.

One thing that's always bothered me has been the use of an odd
"folded" SHA1 construct to generate output bits.  What is done is
this:
	The pool is 4096 bits long.  It is hashed with SHA1, producing S1.

	S1 is split in half, the first 80 bits and last 80 bits,
		producing H1 and H2.

	H1 xor H2 is computed to produce R, which is returned to the caller.

	S1 is mixed back into the entropy pool as input.

The Linux code had a weakness in this area which is described by
Gutterman, Pinkas, and Reinman's paper from 2006.  I don't believe the
NetBSD code has this problem.  However, while looking at it I have
been wondering why something simpler and better analyzed than the "folded"
SHA should not be used.

In particular, I do not see why HMAC with different, known keys should not
be used:

	The pool is 4096 bits long.  HMAC(K1, pool) is computed,
		producing R, which is returned to the caller.

	HMAC(K2, pool) is computed and mixed back into the entropy
		pool as input.

I would appreciate comments on this general idea.

-- 
Thor Lancelot Simon	                               tls at panix.com
  "All of my opinions are consistent, but I cannot present them all
   at once."	-Jean-Jacques Rousseau, On The Social Contract



More information about the cryptography mailing list