[cryptography] Current state of brute-forcing random keys?
marsh at extendedsubset.com
Thu Jun 9 18:49:38 EDT 2011
On 06/09/2011 04:53 PM, Solar Designer wrote:
> On Thu, Jun 09, 2011 at 04:22:16PM -0500, Marsh Ray wrote:
>> Which neatly fits 2^32 trials per watt-second. A real engineer would
>> probably design the chips to minimize energy-per-trial, but I think our
>> estimate is probably still within an order of magnitude or two.
>> Last I checked, in the US electric power is around $0.07 per kWh in
>> areas considered "cheap". So each trial costs $4.53e???18 in electric power.
>> For an 64-bit key, you expect the adversary to need 2^63 trials for
>> which he might pay a power bill of $597.
> That's scary. Even more so if you actually multiply by $0.07, which
> gives $42.
Hmm, I thought I had included that.
I left my trusty old HP RPN calculator at home today and while a good
craftsman never blames his tools, apparently I'm not good enough to get
the same answer twice out of Ubuntu's. So double-checking my numbers and
suggestions on an RPN calculator app for Linux are welcome.
> This means that for relatively small key size like this, the primary
> cost for using an ASIC implementation is in creating that implementation
> and in making it solve the specific task, not in electricity.
... and we all know folks who would do that sort of thing just for the
fun learning experience.
But what if the attacker doesn't want to pay the $100K minimum foundry
cost? How much of a penalty does he pay for using FPGAs, or even a
heavily optimized CPU+GPU codebase? Perhaps it's a huge hit, but even a
100x loss amounts to less than 7 bits of key length.
> It also means that a good KDF should not be limited to using iterations,
> but should also consume much die area - like scrypt does (requiring RAM)
> or/and through lots of parallelism (in one KDF instance). If
> implemented in software (on CPU or GPU), it should more fully use the
> resources of that computer system (CPU/GPU execution units, RAM).
I think Scrypt has the right idea with consuming RAM or, more precisely
RAM nonlocal latency. CPU frequencies have gone from MHz to GHz, memory
has gone from MB to GB, even bus speeds have increased a few times, but
the time required to fetch a cold address from DRAM has changed least of
I suspect an ideal work-factor function would thus look something like
RC4 (which is actually well defined over any state size) but with a few
GB of state rather than the 256 bytes that it's normally used with. Due
to some small patterns in the RC4 keystream it would still use a proper
hash function for distilling the final output hash value.
A few GB of state would hopefully put it in that size range where it's
too large to fit on any attacker's ASIC, while at the same time being of
a size for which a defender's commodity hardware is heavily optimized.
More information about the cryptography