[cryptography] Explaining crypto to engineers (was: Duplicate primes in lots of RSA moduli)

Kevin W. Wall kevin.w.wall at gmail.com
Sat Feb 25 22:47:14 EST 2012


Ondrej,

Thanks for your well thought out response. Some more comments below.

-kevin

On Sat, Feb 25, 2012 at 2:22 PM, Ondrej Mikle <ondrej.mikle at nic.cz> wrote:
> Hi,
>
> here is an attempt to summarize view of crypto from engineers' point of view.
> It's based on discussing the points raised in the "Duplicate primes..." thread
> with couple of HW/SW engineers and past experience with colleagues.
>
> Sorry for the length, this post grew quite a bit.

Well, it is a summary after all, otherwise I'd expect it to be much
longer. Oh, wait, that's not what you meant, is it? No problem.
Better too much than not enough.

> Hopefully I caught the main common points on cryptographer-engineer axis that
> cause crypto to get broken. (I have mostly SW engineering background with bits
> of HW engineering and crypto that stuck along the way.)
>
> Suggestions welcome for making it easier to explain to engineers, please
> correct my misconceptions/mistakes when you see them. This could be a good
> chance to do a first step in avoiding disasters like the shared primes.
>
>
> Due to length, here's a short summary of engineer's view (expanded on below):

My former boss used to tell me, "Kevin, you know you are in trouble
when your summary needs a summary." But I can relate. He used to
say that asking me a question was like trying to drink from a fire
hose.

> ----- short summary of typical SW/HW engineer's view on crypto and bugs -----
>
> A) "I don't understand why are you interested in crypto, you put some number in
> and get some number out, never really sure if there's an error in the program.
> In 3D graphics or physics simulation I see the bug almost instantly."
>
> (A friend's more than decade old quote, but still quite a spot on explanation
> why crypto bugs go often unnoticed for long time.)

So true.

> B) "cryptography, n.": about 15-20 general principles and plethora of special
> cases, whose count maybe could be less than Graham's number.
>
> C) things that cause bugs: cooperation paradox, pushing engineer to code/build
> insanely fast (almost a guarantee of security bug) or changing specification
> often (mostly covered in previous messages in thread)
>
> D) implementation of crypto-protocol is hard: engineer must understand both the
> protocol and the platform where it will run on, platform's quirks and quirks of
> other endpoints' implementations the communication will lead to. High
> probability in introducing side-channels.
>
> E) often there is no one with sufficient experience/expertise to ask (the
> mentoring part mentioned in this thread)
>
> F) understanding randomness/entropy is very hard, implementing correctly
> extremely hard
>
> ----- end short summary -----
>
>
> Some answers/opinions on Kevin's points, couple of suggestions and links that
> could help engineers understand crypto pitfalls:
>
> On 02/22/2012 03:31 AM, Kevin W. Wall wrote:
>> From where I sit, I see the following things that the development
>> community in general are lacking when it comes to things crypto:
>>
>> 1) They think that key size is the paramount thing; the bigger the better.
>
> I'm not sure this is really that common (except really newbie crypto enthusiasts).

Well, I wish it were uncommon, but I've seen plenty of evidence of this.
For example:
1) In ESAPI 2.0RC2 (release candidate 2), the security gurus insisted on
   making the default crypto 256-bit AES, but in ECB mode.
2) At work, the Information Security team's "Encryption Technical Standards"
   only mentions which algorithms are acceptable and what the minimum key
   size is (128-bits) for symmetric block ciphers. It makes NO mention of
   cipher modes, padding, choosing IVs, or anything else.
3) Later when my company was acquired, the new company had a standard of a
   minimum key size of 256 bits for symmetric encryption, but said little
   else. After much argumentation on my part, I finally convinced them that
   128-bits was more than sufficient.
4) Why do snake oil crypto salesmen almost always refer to some obsessively
   long key size, as if that is the only thing that matters? Could it be that
   their development teams really think that that's true and their sales teams
   also know that's what most other development teams believe and so that is
   what gets emphasized?

It could be that they treat it as paramount because for symmetric encryption
at least, it's probably the simplest thing to understand. That may explain
the obsession with it in the general development community, but it doesn't
make it right.

> One recent good answer I got relating keysizes:
>
> http://lists.opendnssec.org/pipermail/opendnssec-user/2012-January/001619.html
>
> See last paragraph about ECDSA keys. Someone might be interested in reading the
> whole "Default ZSK sizes" thread - for example, we found that registrars shared
> ZSKs for tens of thousands of domains (also tons of 512 bit RSA keys). The
> registrars are slowly rolling them over ATM.

Great link; thanks. Some have been urging that we should be moving from
1024-bit RSA keys to 2048-bit ones just because of the NIST push and all
the commercial CAs saluting the NIST flag. This might help push against
that trend. They apparently think that the extra key size comes for free.

> And with ECDSA you have to avoid weak RNGs when generating signatures...
>
> Estimating RSA key size: it's more an educated guess/magic given how the sizes
> are derived than anything else. And if you base your estimate for given time
> window on Lenstra or ECRYPT II keysize recommendations you might get reprimanded
> for suggesting too conservative values :-)

When someone asks, I first generally point them to http://www.keylength.com/
and once they've read that, I tell them we can talk. At least that way,
they aren't arguing with me. :)

>
>> 2) They have no clue as to what cipher modes are. It's ECB by default.
>
> Following image in the wiki article worked like charm in explaining why ECB is
> usually not what the developer wants:
>
> https://secure.wikimedia.org/wikipedia/en/wiki/Block_cipher_modes_of_operation#Electronic_codebook_.28ECB.29

I know of it well. That's one way that I convinced the ESAPI development
leaders that we had to switch.
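
For anyone who would rather see the effect in code than in the penguin
picture, here is a minimal sketch. It is only an illustration: the
"cipher" is a toy 4-round Feistel permutation built on SHA-256 that I
made up to stand in for AES so the example needs nothing beyond the
Python standard library. It is NOT secure, and all of the function names
are hypothetical. The only point is that ECB maps equal plaintext blocks
to equal ciphertext blocks, while a chained mode such as CBC does not.

```python
import hashlib

BLOCK = 16          # block size in bytes, same as AES
HALF = BLOCK // 2


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """TOY 4-round Feistel permutation standing in for AES. NOT secure."""
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:HALF]
        left, right = right, xor(left, f)
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> list:
    """ECB: every block is encrypted independently of the others.

    The plaintext length must be a multiple of BLOCK (no padding here).
    """
    return [toy_encrypt_block(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> list:
    """CBC: each block is XORed with the previous ciphertext block first."""
    blocks, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        blocks.append(prev)
    return blocks


if __name__ == "__main__":
    import os
    key, iv = os.urandom(16), os.urandom(16)
    message = b"ATTACK AT DAWN!!" * 3   # three identical 16-byte blocks
    print(len(set(ecb_encrypt(key, message))))      # 1: the repetition leaks
    print(len(set(cbc_encrypt(key, iv, message))))  # 3, barring a 2^-128 fluke
```

With a real cipher the same thing happens: under ECB the repeated
structure of the plaintext shows straight through, which is exactly
what the Wikipedia image makes visible.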

>> 3) More importantly, they don't know how to choose a cipher mode (not
>>     surprising, given #2). They need to understand the trade-offs.
>> 4) They have no idea about how to generate keys, derived keys, IVs,
>
> A good article explaining the Vaudenay padding oracle (mostly leaves imprint on
> developers if they dedicate the 30-60 min to read it thoroughly), explains why
> modes are important along the way and gives insight into "engineer's mind":
>
> http://chargen.matasano.com/chargen/2009/7/22/if-youre-typing-the-letters-a-e-s-into-your-code-youre-doing.html

Thanks for the link. It took me a LONG time to convince the ESAPI team
of this because I was the newb to them and I came in and said we
at least need to add a MAC over the IV+ciphertext. But it
took me a really long time to convince them because I could not remember
Vaudenay's name (so sorry if you are out there reading this!) and neither
could I recall the details of how it was done. I finally stumbled upon
it while Googling for cryptographic attacks against IPSec, which I remembered
was one of the things originally affected.
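
To make "a MAC over the IV+ciphertext" concrete, here is a rough sketch
of the encrypt-then-MAC idea using only Python's hmac and hashlib
modules. This is not the ESAPI code; the function names, the 16-byte IV
length, and the choice of HMAC-SHA256 are my assumptions for the sake of
illustration. The ciphertext is treated as opaque bytes produced by
whatever cipher you use, and the MAC key is assumed to be separate from
the encryption key.

```python
import hashlib
import hmac

IV_LEN = 16    # assumed IV size (one AES block)
TAG_LEN = 32   # HMAC-SHA256 output size in bytes


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over IV || ciphertext."""
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, sealed: bytes):
    """Verify the tag BEFORE anything is decrypted; return (iv, ciphertext)."""
    if len(sealed) < IV_LEN + TAG_LEN:
        raise ValueError("message too short")
    body, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison, so the check itself is not a timing oracle.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed; refusing to decrypt")
    return body[:IV_LEN], body[IV_LEN:]
```

Because a tampered message is rejected before the padding is ever
examined, the attacker no longer gets the "good padding / bad padding"
signal that a padding oracle attack depends on.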

Unfortunately, they didn't want to use something like GCM or CCM, which
were NIST-approved, because they were not in the standard SunJCE
implementation and given that we already had something like 30 jars
as dependencies, they didn't want to add another like Bouncy Castle's.

Thanks to feedback from a few of you on this mailing list (well, actually,
it was Perry Metzger's list at metzdowd.com, but basically the same
cast), I think we got it right...at least if we can believe the NSA.
(Me, I'm still a bit skeptical: something that seems obvious in
retrospect just turned up, and the NSA did not mention it. Can't say
more until we get it patched.)

> There was even better article from Matasano that showed the Vaudenay's attack
> nicely step-by-step, included commentary about IV selection (can't find it
> right now).

If you do find it, please let me know. Usually I just point developers at
the YouTube video by Duong and Rizzo using POET to crack ASP.NET's
encryption and that convinces most of them.

> NaCl library - should have sane enough defaults to make it hard for a
> non-crypto-devoted developer to screw things up:
>
> http://nacl.cr.yp.to/

I've looked at it. Unfortunately, there are only C, C++, and Python
implementations. We desperately need something like this for Java, C#,
Ruby, etc.

> (Also the already mentioned PBKDF2 - for some reason people seem to fear the
> five-letter acronym, but once you explain how simple it really is and that it's
> present in every serious crypto library, they catch up quickly.)

OK in some cases, but in security audits, I've found that the few developers
who do use it are still picking simple 8-character passwords that are prone
to dictionary attacks.
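
Here is a small sketch of both halves of that point, using
hashlib.pbkdf2_hmac from the Python standard library. The iteration
count, the helper names, and the tiny word list are assumptions chosen
for illustration, not recommendations. PBKDF2 really is only a few
lines, but it merely slows each guess down; if the password is on the
attacker's list, it still falls.

```python
import hashlib
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def derive_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=32)


def dictionary_attack(target: bytes, salt: bytes, wordlist,
                      iterations: int = ITERATIONS):
    """Return the first candidate that derives `target`, or None."""
    for candidate in wordlist:
        if derive_key(candidate, salt, iterations) == target:
            return candidate
    return None


if __name__ == "__main__":
    salt = os.urandom(16)                   # random per-password salt
    key = derive_key("password1", salt)     # a typical weak choice
    guesses = ["letmein", "qwerty12", "password1"]
    print(dictionary_attack(key, salt, guesses))   # password1
```

The salt stops precomputed tables and the iteration count raises the
cost per guess, but neither helps much against a password that appears
near the top of every cracker's dictionary.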

> Question regarding the IVs: is there a cryptographically-secure pseudo-random
> permutation generation function for that? I've seen notices in this thread but
> can't remember if one was named explicitly. (Sure one could code one up from a
> PRNG, but I wouldn't want to be the person that would guarantee its safety.)
> Such function is generally useful in non-crypto ways, too.

At work, I've seen *WAY TOO MANY* uses of crypto where the developers have
used a fixed IV simply so they would not have to store the IV with the
ciphertext in a DB. They claim that they need to do this because "we are
storing millions of records". Right.
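
For what it's worth, the cost of doing it properly is easy to put a
number on. The sketch below assumes a 16-byte IV (one AES block) stored
as a prefix of the ciphertext in the same column; the function names are
hypothetical and the ciphertext is treated as opaque bytes from whatever
cipher is in use.

```python
import os

IV_LEN = 16  # assumed IV size: one AES block, in bytes


def new_iv() -> bytes:
    """Fresh random IV for every record, from the OS CSPRNG."""
    return os.urandom(IV_LEN)


def pack(iv: bytes, ciphertext: bytes) -> bytes:
    """Store the IV as a prefix of the ciphertext; the IV is not secret."""
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    return iv + ciphertext


def unpack(blob: bytes):
    """Split a stored blob back into (iv, ciphertext)."""
    if len(blob) < IV_LEN:
        raise ValueError("blob too short to contain an IV")
    return blob[:IV_LEN], blob[IV_LEN:]


def iv_overhead_bytes(records: int) -> int:
    """Extra storage needed to keep one IV per record."""
    return records * IV_LEN


if __name__ == "__main__":
    print(iv_overhead_bytes(10_000_000))   # 160000000
```

Ten million records cost 160,000,000 bytes of IVs, which is a small
price next to encrypting every record under the same IV.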

> RC4: I personally wouldn't touch it even in HAZMAT suit, because I know I don't
> know enough to use it securely. There's been many papers about theoretical
> weaknesses (like being able to distinguish from random). AFAIK it should be OK
> in TLS cipher suites, but I have to take someone's word for it. When Langley and
> Laurie decide to use it for Google services in ECDHE-RSA-RC4-SHA, they probably
> know what they are doing, but you seriously can't expect a generic engineer to
> know this.

I've rarely seen RC4 used outside of SSL/TLS and some encryption that Microsoft
Office did (and did incorrectly) back in the early 2000s.

>> 5) They don't know what padding is, or when/why to use it.
>
> I vaguely remember some past attacks on (I think) PKCS#1 padding, it was long
> time ago (I'm guessing it's fixed in PKCS#1-1.5, right?). What about OAEP? I
> also have vague notion of a past paper that appeared to poke holes in it (maybe
> I'm confusing it with something else?)

IIRC, there were some attacks on PKCS#1 padding with RSA. I generally
just say if you are using padding with asymmetric encryption, use
OAEPWithSHA-256AndMGF1Padding. Not sure that is valid with ciphers
other than RSA though. Is it safe for others too?

> IIRC NaCl library should do this "right by default". Padding can be tricky.

To the majority of developers that I work with "right by default" means
that it had better have a Java or C#/.NET implementation.

> Question about low RSA exponent when making signature: how common do you expect
> this implementation bug (a variation on Bleichenbacher's attack on PKCS#1) be
> present in various RSA-signature-verifying software?:
> http://www.cdc.informatik.tu-darmstadt.de/reports/reports/sigflaw.pdf
>
> That is exactly the kind of bug that a common developer is very likely to make.

Well, even the Java and .NET implementations have code to generate RSA key
pairs. I never really checked it, but hopefully they got it right. But
regardless, that's no longer the type of code that I'd expect most
developers to be writing. If I found someone doing that, I'd surely
be rapping their knuckles with a ruler. ;-)

> IIRC DNSSEC RRSIGs uses exactly the same PKCS#1 v1.5 padding. Did anyone check
> BIND, unbound, nsd for that? (I have it in TODO, didn't get to it yet.)

Never looked and it's not something that's on my radar. Honestly, I think
hell will freeze over before most companies make the switch to DNSSEC.
It surely is not on our radar at work, except perhaps in a few cases
where we have contracts with the federal government and FISMA compliance
issues.

>> 6) They have a very naive concept of entropy...where/when to use it and
>>     from where and how to obtain it.
>
> I admit that I don't fully understand entropy in crypto sense (though I'm fine
> with its physical meaning). No offense to any of you, but my impression from the
> entropy discussion is that no one can understand it fully, it's simply damn
> complex subject to understand correctly (taking in account that we don't really
> have a precise definition).

Entropy in information theory surely is a much more difficult concept than
it is in physics. It doesn't help that to know whether you are doing
it correctly or not you have to have a fairly deep understanding of
statistics which most developers don't have and don't even want to
know about.

> Few days ago I've shown the Syllable /dev/random code
> (http://syllable.cvs.sourceforge.net/viewvc/syllable/syllable/system/sys/kernel/drivers/misc/random/random.c?revision=1.4&view=markup)
> to about 10-15 SW/HW guys and asked them to find the mistake in random_read.
> One guy got it right (a crypto enthusiast), one was totally lost, third was
> arguing that it was ok, because kernel rand() uses Mersenne Twister instead of
> the age-old LCG rand(), but didn't catch that the seed has really low entropy
> (seconds since last boot). Rest of them probably didn't care or it wasn't
> interesting enough.

I thought most *nix kernels with /dev/random save some entropy when they
shut down and reinitialize from that upon booting. Not so?
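
The low-entropy seed problem described above is easy to demonstrate.
The sketch below is NOT the Syllable code; it uses Python's random
module (which is documented to be a Mersenne Twister) as a stand-in, and
the function names and the uptime bound are assumptions. If the seed is
"seconds since boot", an attacker who sees a few output bytes just tries
every plausible uptime, no matter how good the generator is.

```python
import random


def output_bytes(seed: int, n: int = 16) -> bytes:
    """First n bytes from a Mersenne Twister seeded with `seed`."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def recover_seed(observed: bytes, max_uptime_seconds: int):
    """Brute-force the seed by trying every uptime from 0 to the bound."""
    for guess in range(max_uptime_seconds + 1):
        if output_bytes(guess, len(observed)) == observed:
            return guess
    return None


if __name__ == "__main__":
    uptime = 1234                        # the "secret": seconds since boot
    leaked = output_bytes(uptime)        # what the attacker gets to see
    print(recover_seed(leaked, 3600))    # 1234
```

Even 30 days of uptime is only about 2.6 million candidate seeds,
roughly 21 bits, so switching from an LCG to Mersenne Twister does
nothing for you here.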

> Quoting an interesting point from Marsh Ray
> (http://lists.randombit.net/pipermail/cryptography/2012-February/002435.html):
>> Of course the "unpredictability" in this definition of entropy refers to
>> a theoretical model that is the basis of proofs and so we can't argue
>> with it in the company of mathematicians. But it translates very easily
>> into an intuitive, practical model - which is actually a wrong one.
>
> Obviously no one can tell this to a developer and expect the developer to
> automagically do things right.

Obviously, it depends a lot on experience. Sure, most developers will
Google for something that they don't know, but it's not always the reputable
links that are highest on the hit list. So instead, you end up with
developers sharing ignorance with each other on some developer forum.
Part of it is that they just don't know any authoritative sources and
part of it is that no one thinks to try something like Google Scholar for
their searches.

>
> My point: expecting those SW/HW devs to get it right might be setting the bar
> way too high. Maybe explaining that "it's really hard, go ask a crypto
> professional who sleeps with RNG under pillow" might yield better results.

That's why I haunt these lists. Mostly lurking and trying to learn from
experts.

Maybe this will motivate more of you to participate for free in the
open source community...amongst the OWASP community and where I work,
I am considered a "subject matter expert" in applied cryptography.

I compare that with someone equating a witch doctor with someone who
has an MD. And I'm not being modest. Seriously, about the only thing
that separates me from my peers is that I (mostly) know what I don't
know so I'm one level of ignorance higher than those who don't know
what they don't know. ;-)

[snip]

> --- End of talking on Kevin's points; other points from the thread
>
> - Dunning-Kruger effect:
>
> Disagree. Not common from my experience. Developers usually google, ask
> questions if not sure. Except maybe for the class of "code-pasters". Or it's
> possible I just got lucky in each company (well except one, but thankfully I got
> fired after 4 days after enrollment for crushing their dreams of "analyzer of
> every malware" and other insanities by mentioning halting problem).

I once had a manager who asked something similar and we pointed out
that unless he had proven that P == NP, the problem he was asking us
to solve in polynomial time was considered NP-hard. But at least we
weren't fired for it.

> - Mentors:
>
> It's extremely hard to find the right mentor for crypto, mostly because there
> are like 4 per 10M population and each of them specializes in a very narrow
> field (e.g. "pure" algebra, side channels). Thus you get maybe 1-2 per 10M
> population that could assist a SW/HW engineer in implementing crypto correctly.

It can't be THAT bad. I might be remembering wrong, but I thought that I
recently read that according to the Bureau of Labor Statistics there were
only about 4M IT workers in the USA. And I'm pretty sure there are more
than 4 professional cryptographers subscribed to this list. But I get your
point.

That means that it's important for you to get the biggest bang for the
buck. I'm not sure how helpful it will be because it hasn't started yet,
but I am really looking forward to Dan Boneh's cryptography course
that Stanford is allowing the public to take. I've never had the luxury
of a formal class in it so I'm really looking forward to it starting.

> I don't envy the debian guy introducing the OpenSSL bug. He tried to ask,
> there was misunderstanding, boom.
>
> In the debian case, maybe some people lost few thousands of dollars. In Tor
> case, people die or even worse - get tortured for years if developer makes
> mistake. Obfsproxy could definitely use few extra peeks from "crypto-eyes"
> that understand randomness.

There's an opportunity for someone to help. If egregious crypto errors
are made in this like they were in (say) WEP, as you say, people's lives
may be at stake.

> - Special cases in crypto:
>
> It's really hard to keep up with developments in crypto if it's not one's
> primary field. The publications often look like an assortment of special cases.

Not to mention that there are few out there who can translate these papers
into terms that developers can understand. A substantial number of the
papers over at IACR make my head hurt, and there are quite a few of them
that I can't understand at all even if I concentrate through the
headaches. ;-) And I don't think that most developers even know
about IACR, and they probably wouldn't read the papers (because of the
level of difficulty) if they did.

So someone needs to produce the developer's _Reader's Digest_
version of all things crypto that a developer would need to know.
Maybe there are already such things out there. I mentioned Nate Lawson's
blog; he writes about crypto fairly regularly. And Schneier of course.

But if I could point to something that was about 5-8 pages about something
like "Ten Things Every Developer Should Know About Cryptography", that
would be great for starters. Does such a thing exist? Maybe it can't
be distilled to only 10, but you get my point.

[big snip]

Again, thanks for your response. Looking forward to the follow-up.

-kevin
-- 
Blog: http://off-the-wall-security.blogspot.com/
"The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We *cause* accidents."        -- Nathaniel Borenstein


