[cryptography] Weak random data XOR good enough random data = better random data?

Natanael natanael.l at gmail.com
Mon Jul 28 14:41:31 EDT 2014

Den 28 jul 2014 18:23 skrev "Lodewijk andré de la porte" <l at odewijk.nl>:
> Hey everyone,
> If I XOR probably random data with good enough random data, does that
result in at least good enough random data?
> I'm working on some Javascript client side crypto. There's a
cryptographic quality random generator present in modern browsers, but not
in older ones. I also don't trust browsers' random generators' quality.
> I'd like to ship a few KB (enough) of random data and XOR it with
whatever the best-available RNG comes up with. That way the user can still
verify that I didn't mess with the randomness, no MITM attacks can mess
with the randomness, but given a good transport layer I can still
supplement usually bad randomness.
> I don't see how it could reduce the randomness to XOR with patterned
data. If someone knows better of this, let me know. If I'm correct that
also means it should be okay to reuse the few KB's should they ever run out
(in this system), at worst it no longer improves the randomness. I don't
expect that to ever happen, and I'd prefer requesting new KB's, but it's
still interesting.
> Could someone confirm this whole thought-train for me? That means, is it
a good idea to (over HTTPS) send some randomness*, XOR it with the
best-available RNG for better randomness? I actually feel pretty confident
about it, just asking for (a few) second opinion(s).

With a mixer function that isn't shitty and lossy/non-reversable, you are
unlikely to lose entropy.

With XOR specifically, then AS LONG AS THE INPUTS ARE UNCORRELATED you will
not lose entropy (you always has at least as much entropy as your most
entropic input). XOR practically anything with random and you still have
random. But if you bitflip the XOR key and XOR that string with the key,
you get all 0's. Not good. There are tons of methods by which they can be
correlated that leak data.
If you know why/how one time pads are mathematically proven uncrackable,
you should know why this is true.

Also, because of how XOR works, you need to make sure at least one of the
inputs have high entropy density if you need to use the output for
something that needs good randomness. Two books of random sentences have
high entropy, but low entropy density (on average one bit of entropy per
character, while each character takes about 7 or 8 bits of space to
represent on the hard drive). Taking 256 bits of XOR'ed random words as a
key for something is NOT good enough. This is when you need strong mixing
like hashing or encryption of the entire input, so that each bit in the
output represents close enough to a full bit of entropy.

Any reversible method can not lose entropy. Otherwise you have found a
magical compression function. Practically any encryption algorithm (all
reversible) that is considered secure can be used to mix your inputs, and
you will not lose entropy (even insecure encryption algorithms also
shouldn't cause entropy loss, unless what you used as a key held most of
the entropy and the encryption algorithm fails to mix it properly with the
plaintext - in which case your plaintext still sets the minimum entropy of
the output).

Most good hashing algorithms will also preserve most of the entropy (for
equivalent length inputs, you can't preserve 500 bits of entropy in a 100
bit hash). If you need 200 bits of entropy, it should be safe to use a
modem hash function with a 256 bit strength to hash all of your inputs.

One example: RC4 produces a biased key stream that you can recover the
original key from if you can get a pure enough version with enough bits of
the raw key stream. Use RC4 to encrypt standard ASCII plaintext (the
literal version) and the attacker can use statistical attacks to remove
enough of the plaintext from the ciphertext to get parts of the key stream,
then the key can be cracked to decrypt everything.
However, that attack fails if you only encrypt high entropy plaintexts as
they then can't be separated from the key stream (same argument for why as
the one time pad security). Also, there's no known easy of cracking
ChaCha20's key stream, it looks random (no known detectable bias) and the
key can't be recovered, so it would resist the above attack.

Using a well defined existing mixing function with your own predefined
random data, like a salt, is fine. It doesn't make it weaker. Even XOR'ing
it in will never make it weaker.

But however, if it is publicly known shared randomness then it only
prevents cross-service rainbow tables and similar batch bruteforce attacks
(adds work to crack them all linearly with the number of services rather
than linearly with number of total users (assuming reused randomness)). It
doesn't help against targeted attacks.

Note that a MITM that can replace that randomness you send out imply that
they can inject malicious code as well, in which case the security is
non-existent. Then they can explicitly attack the RNG to be bad, or
backdoor everything.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.randombit.net/pipermail/cryptography/attachments/20140728/d7f8d02b/attachment-0001.html>

More information about the cryptography mailing list