[cryptography] Master Password

Marsh Ray marsh at extendedsubset.com
Fri Jun 1 02:44:43 EDT 2012


On 05/31/2012 04:08 PM, Nico Williams wrote:
> On Thu, May 31, 2012 at 2:03 PM, Marsh Ray <marsh at extendedsubset.com> wrote:
>> On 05/31/2012 11:28 AM, Nico Williams wrote:
>>> Yes, but note that one could address that with some assumptions, and
>>> with some techniques that one would reject when making a better hash
>>> -- the point is to be slow,
>
>> [...]
>> Starting with:
>>
>> Ep = password entropy in bits (chosen by the user)
>>
>> N =  iteration count (chosen by the defender)
>
> For memory-hard PBKDFs you also need a memory size parameter, though
> you might derive that from N.

That could be useful.

We could also model the memory access latency and bandwidth of the 
defender's servers and the number of CPU cores trying to do other useful 
stuff over the same main memory bus and so on. I figured I'd quit while 
I was ahead. :-) The defender can think about his own systems in as much 
detail as he wishes, but the biggest unknowns in the equations are still 
capabilities of his future attackers.

Another big thing I left out is the cost of time to the attacker. Does 
he have a deadline? What is the value over time of a user's password? We 
used to think of password change policies as putting a hard expiration 
limit on the value of this data, but now we know that users typically 
share passwords across systems and just put a counter digit at the end 
to share them across time. Also, a password often represents the only 
entropy securing certain Wi-Fi and VPN protocols, so the attacker may have 
obtained a packet capture that will remain useful to decrypt long into 
the future.

>> * If individual users can be shown to present a different Ep to the
>> attacker, it could be beneficial to adjust N independently per user. For
>> example a website might say to a user: "Want to log in 2 seconds faster next
>> time? Pick a longer password!"
>
> Nice.  But again, in practice this won't work.  You might think that
> for a single master password users could get into the 48 bits of
> entropy range, but probably not.
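
To make that trade-off concrete: the attacker's expected work is roughly 
2^Ep * N, so each extra bit of (estimated) entropy would let the site 
halve N without making the attacker's life any easier. A toy sketch in 
Python -- the entropy estimator here is purely hypothetical and stands 
in for the genuinely hard part:

import hashlib, os

BASELINE_BITS = 28         # assumed entropy of a typical password (a guess)
BASELINE_ITERS = 1000000   # the N we would pick for that baseline user

def estimate_entropy_bits(password):
    # Purely hypothetical stand-in; a real estimator (dictionaries,
    # keyboard patterns, etc.) is the hard part and isn't shown here.
    return len(set(password)) * 3

def per_user_iterations(password):
    # Each bit of entropy above the baseline halves N (and the user's
    # login delay) while keeping the attacker's expected work,
    # roughly 2^Ep * N, about constant.  Floor it so N never collapses.
    ep = estimate_entropy_bits(password)
    n = int(BASELINE_ITERS * 2.0 ** (BASELINE_BITS - ep))
    return max(n, 10000)

def enroll(password):
    salt = os.urandom(16)
    n = per_user_iterations(password)
    dk = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, n)
    return salt, n, dk     # N is stored per-record, just like the salt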

Well what about this (just thinking out loud):

Current password change policies can only occasionally recover the 
former level of security after a silent breach, but they make it harder 
for users to remember entropic passwords all the time.

What if, instead of asking users to throw out their old password and 
choose a brand-new, completely independent strong one every 90 days, we 
asked them to append 12 more characters to their old password? This way 
they could get better at remembering longer sequences incrementally, 
with more practice over time.

Of course, once it becomes too long (which seems like a good problem to 
have) we could start dropping the same number of characters from the 
front. Could we, in just three password change periods, have users on 
rolling 36-character passwords?

OK, so maybe we don't get the same catastrophic, fully-reseeding 
property of a traditional password replacement policy, but if the string 
the user appends is subjected to the same length and complexity 
requirements in place now, then it seems like a rolling long password 
could be no worse.
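
Mechanically it's nothing more than this (a toy sketch, using the 
12-character increment and 36-character cap from above):

INCREMENT = 12   # characters the user appends each change period
CAP = 36         # rolling window: three periods' worth

def next_password(old, new_suffix):
    # Append this period's suffix; once the password exceeds the cap,
    # retire the same number of characters from the front.
    if len(new_suffix) != INCREMENT:
        raise ValueError("suffix must be %d characters" % INCREMENT)
    rolled = old + new_suffix
    return rolled[-CAP:] if len(rolled) > CAP else rolled

# After the third change period the user is typing a 36-character
# rolling password, and every later change adds 12 fresh characters
# while retiring the oldest 12.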

>>> Can PBKDF2' be weaker than PBKDF2?
>>
>> Yes.
>
> Mine or any PBKDF2' in principle?

In principle.

> I don't see how my PBKDF2' construction ends up being more
> parallelizable than PBKDF2 such that PBKDF2' can run faster than
> PBKDF2 -- you need to compute PBKDF2 before you can compute PBKDF2'
> and I see no way around that.  If memory_hard() is not nearly as
> memory-hard (and/or slow) as initially thought then PBKDF2' will not
> help much, but it should still not hinder either.

Say memory_hard() was designed with DES as the fundamental primitive 
(for slowness, of course) back when servers could only spare 128 MiB of 
memory at a time.

Today's attacker would be able to parallelize the DES with a bitslice 
implementation on a video card with 4 GB of RAM and a GPU that can hide 
random memory latencies with single-clock context switching between 64 
threads.

The defender cannot benefit from that parallelism; he has to hash the 
passwords one at a time as users authenticate.

So it doesn't matter how high you turn up parameter N, or even, to a 
degree, M; it's not (just) a question of scaling parameters to track 
Moore's law. It's that a hash function that looked as-hard-as-practical 
a few years ago turns out to be a bit more efficient in parallel on the 
attacker's hardware than in serial on the defender's.
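
For reference, the shape of the composition we're discussing is roughly 
the following. This is my paraphrase -- Nico's actual PBKDF2' 
construction isn't reproduced in this thread -- and scrypt is standing 
in for memory_hard() purely for illustration:

import hashlib, os

def pbkdf2_prime(password, salt, n_iters=100000,
                 scrypt_n=2**14, scrypt_r=8, scrypt_p=1):
    # Step 1: plain PBKDF2, exactly as deployed today.
    k1 = hashlib.pbkdf2_hmac('sha256', password, salt, n_iters)
    # Step 2: a memory-hard pass over its output (scrypt as a stand-in).
    # Step 2 cannot begin until step 1 has finished, which is why the
    # result should be no easier to brute-force than PBKDF2 alone --
    # provided nobody tunes N down on the theory that the memory-hard
    # pass adds strength.
    return hashlib.scrypt(k1, salt=salt, n=scrypt_n, r=scrypt_r,
                          p=scrypt_p, dklen=32)

salt = os.urandom(16)
tag = pbkdf2_prime(b"correct horse battery staple", salt)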

Another example:

On 05/31/2012 10:43 AM, Adam Back wrote:
> That
> particular one has a disadvantage that it requires a few megs of pre-baked
> numbers to be shipped with the library.

A few MB of static constants sounds to me like a resource that may be 
shareable in the attacker's parallel synchronous implementation, but 
maybe even demand-paged in the defender's system.

> Unless... unless N
> is tuned down on the theory that memory_hard() adds strength -- if it
> turns out not to then having turned N down will greatly help the
> attacker.


>> It could also be that memory_hard(P, S, p) introduces
>> impractical-to-mitigate timing or other side channel attacks that leaked p.
>
> Sure.  But let's assume that memory_hard() isn't awful.  It just isn't
> yet accepted, and it might have some weaknesses.  I'd hope that by now
> everyone pays attention to timing side channels (but power side
> channels? not so much).

No kidding. Did you see http://www.cl.cam.ac.uk/~sps32/qvl_proj.html ?

> We can test for timing leaks.  We can't test
> for as-yet undiscovered optimizations.

You've proposed a nice little construction with which to build a 
memory-hard function from components, some of which are well understood 
and some of which are new. You haven't accompanied it with a 
mathematical proof of security, and even if you had, many of us would 
still be skeptical.

That's because the value of memory_hard() is built on a lot of 
assumptions about future hardware availability, and the framework for 
analyzing its security benefit seems really quite new (if it's ever 
been formally described at all).

So it comes down to a lot of specifics that will have to be stared at 
and analyzed by a lot of smart (and probably busy) people for a long 
time before any such algorithm could be given the same level of official 
blessing as plain PBKDF2.

> Sure, but I think you're making my point: there are things we could do
> to make a PBKDF2' that is at least as strong as PBKDF2.

I think you're right about that.

> As long as we
> don't respond to PBKDF2' by tuning down N I think that's possible,
> trivially possible.  But the risk is probably quite high that N will
> get turned down ("it used to take 1s to login, now it takes 2s!"->tune
> down N so it takes 1s with the new PBKDF2').

Maybe that's even the default assumption? That's what we get if we think 
of users having a fixed tolerance for authentication latency. As M goes 
up, N must go down.
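
A back-of-the-envelope sketch of that constraint (the 200 ms budget and 
the per-iteration cost are made-up figures):

LATENCY_BUDGET_S = 0.200   # what users will tolerate per login (assumed)
ITER_COST_S = 2e-7         # hypothetical 200 ns per PBKDF2 iteration

def affordable_n(memory_hard_seconds):
    # Whatever time the memory-hard pass (M) consumes comes straight
    # out of the time available for PBKDF2 iterations (N).
    remaining = max(LATENCY_BUDGET_S - memory_hard_seconds, 0.0)
    return int(remaining / ITER_COST_S)

print(affordable_n(0.0))     # all 200 ms on PBKDF2:            N = 1,000,000
print(affordable_n(0.150))   # after a 150 ms memory-hard pass: N =   250,000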

But how does (M, N) get chosen in practice? This is probably the 
hardest-to-quantify variable on the defender's side. I just put it down 
as a 
cost to the defender that's some nonlinear function of N.

I can only relate my own experience. Not having worked with a wide 
variety of high-scale websites in my career as a software developer, I 
may not be the best person to describe Dd(N). Nevertheless, here's the 
evolution of my relationship with the cost of N. (Names and dates 
omitted to protect the guilty.)

* I read the bcrypt paper while reading about the design choices in 
OpenBSD. This is probably the first time I heard about a work factor.

* I'm a software developer on a product and see a place in the design 
where an additional work factor would be beneficial. Based on the 
reaction I got when I introduced the lead developers to the concept of a 
"state machine", I keep my mouth shut and do not propose any more 
radical ideas. (This company had password security already figured out 
anyway. They simply mandated that all passwords be all upper case 
because "that's the last thing password cracking programs check for".)

* I suggest we include some degree of extra computational load in a 
credential validation function. The boss thinks I am joking, does the 
Amadeus laugh, and walks off to a meeting.

* I tell the head operations person of a website we are developing that 
we really ought to include a work factor in the credentials 
validation. He says flatly "No way. There is no way we are doing any 
extra work on our web servers."

* The next time I need to implement a credentials validation routine, I 
just go ahead and hard code the largest work factor I think I can get 
away with. I choose N such that it produces a brief but noticeable delay 
on my development box, probably a few hundred ms. This results in an N 
that is quite generous by published practices. No one ever notices or 
complains about the CPU cost.

(Again this covers many years and several employers and some details are 
synthesized, etc.)
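
For what it's worth, that last "calibration" was nothing more 
sophisticated than a loop like this (a sketch; the 200 ms target is just 
the sort of number I remember aiming for):

import hashlib, os, time

TARGET_S = 0.2   # "brief but noticeable" on the development box

def calibrate_iterations(target=TARGET_S):
    # Double N until a single PBKDF2 computation takes at least
    # `target` seconds on this machine, then hard-code the result.
    salt = os.urandom(16)
    n = 1000
    while True:
        start = time.perf_counter()
        hashlib.pbkdf2_hmac('sha256', b'not-a-real-password', salt, n)
        if time.perf_counter() - start >= target:
            return n
        n *= 2

print(calibrate_iterations())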

I'm sharing my story not to try to hold myself up as some kind of 
pioneering software developer in a world of laggards, but just to say here's 
how I've seen N get chosen in practice. I imagine many companies will 
take great delight in using their formal process to convene a committee 
of stakeholders to select N. Others will choose the minimum value 
required by their auditors and industry standards. Others will choose 
N=1 (yes, this happens). Others will test different values systematically 
to select the largest N their systems can tolerate acceptably.

Regardless, the process by which N is chosen seems almost as much of a 
security factor as N itself.

- Marsh


