[cryptography] Enranda: 4MB/s Userspace TRNG
pkejjy at gmail.com
Tue May 26 21:12:47 EDT 2015
"if your proposed method comes with a complex extractor, it is bullshit"
OK, point well taken. I should offer a raw mode. The timedeltasave utility
("make timedeltasave", then run "temp/timedeltasave" for the help text)
already does this, but I should build it into Enranda directly. Obviously,
the timedelta stream is extremely sparse in entropy, but yes, anything can
be encrypted. (timedelta = timestamp(t1) - timestamp(t0)).
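For anyone who wants to see what a raw timedelta stream looks like without building anything, here is a minimal Python sketch. It is not Enranda's code: it assumes time.perf_counter_ns() as a stand-in for the CPU timestamp counter, and capture_timedeltas is a name made up for this illustration. The interpreter adds its own overhead, so the values will be much coarser than a tight loop in C.

```python
import time

def capture_timedeltas(count):
    """Return `count` differences between back-to-back timer reads.

    Assumption: time.perf_counter_ns() stands in for the CPU timestamp
    counter. This is an illustration, not Enranda's capture loop.
    """
    deltas = []
    t0 = time.perf_counter_ns()
    for _ in range(count):
        t1 = time.perf_counter_ns()
        deltas.append(t1 - t0)  # timedelta = timestamp(t1) - timestamp(t0)
        t0 = t1
    return deltas
```

Calling capture_timedeltas(16) and printing the result should show mostly repeated small values with occasional outliers, which is the sparseness referred to above.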
"if your method comes without a detailed analysis and measurements on the
entropy content of the raw data, it is bullshit"
I agree. So first of all, essentially all the periodic content is removed
from the timedelta stream, which in the worst case comes from a tightloop
read of the timestamp:
What this leaves behind is the aperiodic residue. Or more specifically,
((the hashes (of all sequences)) that have not been seen in the last 2^16
such hashes). I realize that this isn't hard proof (as nothing in physical
hardware can be proven), but if you run timedeltasave, you can see for
yourself just how unpredictable the aperiodic hash residue of the timedelta
stream is, even if you capture it in a tight loop.
I can provide details on how to observe this directly from the source, if
anyone wants it.
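To make the "not seen in the last 2^16 such hashes" rule concrete, here is a rough Python sketch of that one filtering step. It assumes the sequence hashes have already been computed somehow (how Enranda forms them is not shown here), and aperiodic_residue is a hypothetical name, not a function from the Enranda source.

```python
from collections import Counter, deque

WINDOW = 1 << 16  # 2^16 most recent hashes, per the description above

def aperiodic_residue(hashes, window=WINDOW):
    """Yield each hash that does not occur among the previous `window` hashes.

    Illustration only: the input hashes are assumed to be given, and this
    is not Enranda's actual implementation.
    """
    recent = deque()    # the last `window` hashes, oldest first
    counts = Counter()  # how many times each hash occurs inside `recent`
    for h in hashes:
        if counts[h] == 0:
            yield h  # unseen within the window: part of the residue
        recent.append(h)
        counts[h] += 1
        if len(recent) > window:
            old = recent.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
```

A strictly periodic input such as [5, 6, 5, 6, ...] yields only its first period, which is the point of the mechanism: periodic content contributes almost nothing.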
The protoentropy content (which is provably less rich in actual entropy
than the timedelta stream itself) is measured in units of unique sequence
hashes, where "unique" essentially means "I haven't seen this hash in at
least the last (2^16) sequence hashes". The hashes are thus valued at 16
bits each. But we don't use them directly because they are ever-so-slightly
biased. Instead, we throw them into another 16-bit hash, which is an
order-sensitive hash-of-unique-sequence-hashes called history_hash. This is
the "real" protoentropy which is used to permute the trapdoor. This is all
"for start, where your entropy is coming from?"
IRQs are actually a negligible contribution; this is a fundamental
divergence from Jytter. Just like most of the information in a photo is
contained in the pixel noise, not the subject of the photo itself, so too
is the case with the timedelta stream. It's the tiny timing fluctuations
due to cache misses, pipeline stalls, CPU circuit clock gating, etc. that
provide the majority of the protoentropy. If this sounds unbelievable, then
do a timedeltasave, then 7zip it. It's quite repeat-y, yet much richer than
you might expect. For example, I used:
temp/timedeltasave 1 16 temp/timedelta16.bin
xxd -i -c8 temp/timedelta16.bin | head
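If you would rather script the compression check than run 7zip by hand, a minimal Python sketch follows. It assumes the file written by timedeltasave above and uses the standard lzma module, which is in the same compression family that 7-Zip uses by default; the function names are made up for this illustration. Keep in mind that compressed size only bounds the entropy from above, so this is a sanity check and not a measurement.

```python
import lzma

def compression_ratio(data):
    """Return compressed size divided by original size (lower = more repetitive)."""
    if not data:
        raise ValueError("no data to compress")
    return len(lzma.compress(data)) / len(data)

def file_compression_ratio(path):
    """Compression ratio of a whole file, e.g. a saved timedelta stream."""
    with open(path, "rb") as f:
        return compression_ratio(f.read())
```

For example, file_compression_ratio("temp/timedelta16.bin") gives a single number to compare across machines or capture modes.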
Even if you have 2 Enrandas (or an Enranda and a process attempting to spy
on it) running on the same core (say, on each of the hyperthreaded X86
peers on the same physical die), you should get radically different
aperiodic residues -- even if you translate or rebase them optimally in
order to attempt to correlate them. If you don't, there's a bug somewhere!
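One crude way to try the correlation yourself is sketched below in Python. It only tries translations (shifting one residue against the other) and does not attempt any rebasing of values; best_shift_match is a hypothetical helper, and the two inputs are assumed to be lists of residue values captured from the two instances.

```python
def best_shift_match(a, b, max_shift):
    """Return (shift, fraction) for the alignment of b against a that agrees most.

    A positive shift drops the first `shift` items of a; a negative shift
    drops the first `-shift` items of b. Translation only, no rebasing.
    """
    best = (0, 0.0)
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            pairs = list(zip(a[shift:], b))
        else:
            pairs = list(zip(a, b[-shift:]))
        if not pairs:
            continue  # no overlap at this shift
        fraction = sum(x == y for x, y in pairs) / len(pairs)
        if fraction > best[1]:
            best = (shift, fraction)
    return best
```

With two independent captures, the best fraction should stay close to what chance alone would give. Very short overlaps can produce spuriously high fractions, so use long captures and modest shifts.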
"otherwise the CPU runs quite predictably."
Absolutely true. Hence the antiperiodic mechanism described above. To be
clear, it ignores periodic timedelta sequences (_almost_ completely); it
does _not_ keep them and try to hide them behind whitening bullshit.
"it is already fishy to say that you can collect 4Mbit/s from IRQ alone."
Yeah, IRQs are quite poor in entropy, actually, even poorer than most
people realize (on the order of the square root of their intra-service
time). But as I mentioned above, the timedelta stream contains all kinds of
microscale entropy, which provides for the stated bandwidth. Even a
simulation of the CPU is quite useless due to the unpredictability of the
state of the electronics at runtime.
"also it is very different on different platforms."
IRQs, yes. Microscale entropy, not so much. I've seen everything from 5 to
8 MB/s on commodity X86/X64 platforms, hence my claim of 4.
"embedded systems without user interaction tend to have less IRQ noise."
Either way, I don't need any IRQs at all.
"where are the estimates? where are the calculations?"
Obviously this is a question of paramount importance, so I'm not going to
blow it off. But I'm forced to answer it indirectly: Enranda throws away
absolutely everything that it considers to be predictable. The rest is
unpredictable (to Enranda). So in other words, it's attempting to do the
entropy calculations itself (and conservatively so) by analyzing
periodicity in the microscale timedelta stream as it happens in situ, in
real time. Why do this? Why not start with a grand theory of X86 entropy?
Because this cannot possibly be created in 2015, and maybe not even in
1995. Networked systems (and even nonnetworked ones, frankly) are simply
too complicated to model. So it must be calculated in realtime using the
limits of what memory space and practicality can provide, by way of history
recording; hashes enhance this capacity, in effect.
As to the mathematics behind the measurements, it's all spelled out in
and the numbers are here (buried in lots of demo-y stuff):
Granted, those "measurements" were performed postwhitening. Had I performed
them prewhitening (i.e. on the protoentropy itself), the entropy density
(_not_ entropy _content_) would of course have been lower (but not by 50%,
which is what the trapdoor process implicitly assumes). So honestly, I
think the timedelta stream is worth more like 10MB/s at full tilt (but I
won't go there).
Now that you mention it, a few incarnations ago, Enranda actually had a
builtin realtime logfreedom (
measurement engine. That is, it would measure the logfreedom of the
aperiodic timedelta sequence hash residue in real time. But I dumped it
when I realized it was unnecessary, because a unit of such residue was
almost always worth well more than 16 bits, so I could just cut it off at
16, and be at least as safe.
On Tue, May 26, 2015 at 10:25 PM, Krisztián Pintér <pinterkr at gmail.com> wrote:
> i call bullshit on this one, just as i called bullshit on havege. a
> proper hwrng always outputs the raw, unfiltered random bits. and an
> estimate of the entropy content. whitening is easy, and can be
> done various ways, it is not interesting. many times we don't even
> want whitening, because we already have an entropy accumulator
> arrangement, like linux /dev/random (whatever crap it is).
> 1, if your proposed method comes with a complex extractor, it is
> bullshit
> 2, if your method comes without a detailed analysis and measurements
> on the entropy content of the raw data, it is bullshit
> for start, where your entropy is coming from? it all comes from IRQ-s,
> otherwise the CPU runs quite predictably. it is already fishy to say
> that you can collect 4Mbit/s from IRQ alone. also it is very different
> on different platforms. embedded systems without user interaction tend
> to have less IRQ noise. where are the estimates? where are the
> calculations?
> > Russell Leidich (at Tuesday, May 26, 2015, 5:01:20 AM):
> > Enranda is a cryptographically secure (in the postquantum sense)
> > true random number generator requiring nothing but a timer (ideally,
> > the CPU timestamp counter).
> > http://enranda.blogspot.com