[cryptography] Intel RNG
marsh at extendedsubset.com
Tue Jun 19 03:48:16 EDT 2012
On 06/19/2012 01:36 AM, Jon Callas wrote:
> On Jun 18, 2012, at 4:12 PM, Marsh Ray wrote:
>> 150 clocks (Intel's figure) implies 18.75 clocks per byte.
> That's not bad at all.
Right, 500 MB/s of random numbers ought to be enough for anybody.
My main point in running the perf numbers was to figure out the
justification for this RNG not being vulnerable to entropy depletion
attacks in shared hosting environments.
Still, 150 clocks is a crazy long time for an instruction that doesn't
involve a cache miss or a TLB flush or the like.
> It's in the neighborhood of what I remember my
> DRBG running at with AES-NI. Faster, but not by a lot. However, I
> was getting the full 16 bytes out of the AES operation and RDRAND is
> doing 64 bits at a time, right?
~150 clocks gets you 64 bits, up to some internal bandwidth limit hit at
around 8 threads.
>> Note that Skein 512 in pure software costs only about 6.25 clocks
>> per byte. Three times faster! If RDRAND were entered in the SHA-3
>> contest, it would rank in the bottom third of the remaining
>> contestants. http://bench.cr.yp.to/results-sha3.html
> As much as it warms my heart to hear you say that, it's not a fair
> comparison.
It's an apples/oranges comparison in some ways - but in some it's not.
For example, skein1.1.pdf says "Skein can be used as a PRNG with the
same security properties as the SP 800-90 PRNGs" and that "it can
produce random data at the same speed that it hashes data."
On the other hand, Skein is pure software, so the hardware has little
excuse for being slower.
> A DRBG has to do a lot of other stuff, too. The DRBG is
> an interesting beast and a subject of a whole different discussion.
So the CR report says "The DRBG is based on AES in counter mode, per
NIST SP 800-90A". Flipping to this nice overview of NIST CTR mode
DRBGs, it says (p. 46) "One encryption per blocksize bits".
So something is causing AES-NI to take 300 clocks/block to run this
DRBG. Again, more than 3x slower than the benchmarks I see for the
hardware primitive. My interpretation is that either RdRand is blocking
due to "entropy depletion", there's some internal data pipe bottleneck,
or maybe some of both.
If in reality there's no way RDRAND can ever fail to return 64 bits of
random data, then Intel could document that fact and we could save the
world from yet another untested exceptional code path that only had a
moderate chance of working the first time it's really needed anyway.