[cryptography] Oddity in common bcrypt implementation

Sampo Syreeni decoy at iki.fi
Tue Jun 28 15:09:46 EDT 2011

On 2011-06-28, Marsh Ray wrote:

> Yes, but in most actual systems the strings are going to get handled.

Is this really necessarily true, or just an artifact of how things are 
implemented now? Or even of a simple-minded implementation?

Take the case of passwords and usernames. It might make some sense to do 
a case-insensitive username comparison. It wouldn't hurt security much, 
it might help usability and interoperability, and it constitutes 
desirable eye-candy.

But a case-insensitive password compare?!? I don't think anybody would 
want to go there. Almost everybody would rather have the system fail 
safe than do anything but pass around (type-tagged) bits. I mean, would 
anybody really like a spell checker in their ATM?
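
To make the distinction concrete, here is a minimal Python sketch. It 
assumes usernames are Unicode strings and passwords arrive as raw 
octets. The function names are hypothetical, and a real system would 
compare password hashes rather than the passwords themselves.

```python
import hmac

def usernames_match(a: str, b: str) -> bool:
    # Case-insensitive comparison: a usability nicety for usernames.
    return a.casefold() == b.casefold()

def secrets_match(a: bytes, b: bytes) -> bool:
    # Byte-exact, constant-time comparison: no folding, no normalization.
    return hmac.compare_digest(a, b)

print(usernames_match("Root", "rOOT"))
print(secrets_match(b"Secret", b"secret"))
```

The username check forgives case. The secret check accepts only the 
identical octets.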

> It's more a question of whether or not your protocol specification 
> defines the format it's expecting.

It's not all about that. There's also the issue of implementability, 
testability and ability to promote (and then withstand) analysis. Any 
system that dedicates forty pages' worth of text to string comparison 
doesn't have those attributes. It doesn't promote security proper, but 
rather bloated software, difficulties with interoperability, insecure 
workarounds and even plain security through obscurity.

As a case in point, the Unicode normalization tables have changed 
numerous times in the past, and they aren't even the whole story. True, 
after some pressure from crypto folks they finally fixed the 
normalization target at something like v3.2 or what have you. But then 
that too will in time lead to a whole bulk of special cases and other 
nastiness, which then promotes versioning difficulties, code that is too 
lengthy to debug properly, and diversion of resources from sound 
security engineering towards what I'm tempted to call "politically 
correct software engineering". I mean, you've certainly already seen 
what happened in the IETF IDN WG wrt DNS phishing... If I ever saw 
a kluge, attempts at homograph elimination (a form of normalization) is 

> Humans tend to not define text very precisely and computers don't work 
> with it directly anyway, they only work with encoded representations 
> of text as character data.

Passwords aren't "text" in the normal sense, precisely because they 
should be the only thing human-keyed crypto depends on for security. 
As for the rest of the text... Tag it and bag it as-is. At 
least the original intent can then be uncovered forensically, if need 
be. Unlike if you go around twiddling your bits on the way.
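
A small Python sketch of what "tag it and bag it" could look like, 
next to what normalization does to the octets. The TaggedBytes type is 
a hypothetical illustration, not something from any particular 
library, and the sample string is made up.

```python
import unicodedata
from typing import NamedTuple

class TaggedBytes(NamedTuple):
    encoding: str  # type tag, e.g. "utf-8"
    octets: bytes  # exactly what was received, untouched

# 'a' followed by a combining diaeresis, i.e. the decomposed form.
typed = "pa\u0308ssword"

# Tag it and bag it: keep the octets as they arrived.
record = TaggedBytes("utf-8", typed.encode("utf-8"))

# Twiddle the bits on the way: normalize to NFC before storing.
normalized = unicodedata.normalize("NFC", typed).encode("utf-8")

# Same text to a human reader, different octets to a hash function.
print(record.octets != normalized)
```

The tagged record can always be decoded back to exactly what was 
typed. The normalized octets alone cannot tell you which form the user 
actually entered.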

> Many devs (particularly Unixers :-) in the US, AU, and NZ have gotten 
> away with the "7 bit ASCII" assumption for a long time, but most of 
> the rest of the world has to deal with locales, code pages, and 
> multi-byte encodings.

Finnish people don't, and never have.

> Let's say you're writing a piece of code like:
> if (username == "root")
> {
> 	// avoid doing something insecure with root privs
> }
> The logic of this example is probably broken in important ways but the 
> point remains: sometimes we need to compare usernames for equality in 
> contexts that have security implications.

Then you write it out as "root" and it matches "root" because you wrote 
"root" the first time around. Plus, that is also why those security and 
interoperability sensitive things have been pared down to a minimal, 
common character set in the first place.
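
As a rough sketch of that idea in Python: restrict security-sensitive 
names to a small ASCII repertoire up front, and plain equality becomes 
unambiguous. The exact policy here (lowercase letters, digits, '_' and 
'-', at most 32 characters) is an assumption for illustration only.

```python
import re

# Hypothetical policy: lowercase ASCII letters, digits, '_' and '-'.
SAFE_NAME = re.compile(r"[a-z0-9_-]{1,32}")

def is_safe_name(name: str) -> bool:
    # fullmatch rejects anything outside the restricted repertoire.
    return SAFE_NAME.fullmatch(name) is not None

def is_root(name: str) -> bool:
    # Fail safe: a name outside the repertoire never gets compared at all.
    if not is_safe_name(name):
        raise ValueError("name outside the permitted character set")
    return name == "root"

print(is_root("root"))
print(is_safe_name("r\u043e\u043et"))  # Cyrillic 'o' look-alikes
```

With the repertoire pinned down, there is no case folding or 
normalization left to argue about, and look-alike names are rejected 
before any comparison happens.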

> You can only claim "bytes are bytes" up until the point that the 
> customer says they have a directory server which compares usernames 
> "case insensitively".

If there's a security implication, you should then probably fail safe 
and wait for the software vendor to fix the possible interoperability 

> The first RFC http://tools.ietf.org/html/rfc2058#section-5.2 says 
> nothing about the encoding of the character data of the password 
> field, it just treats it as a series of octets.

Yeah. That's sloppy, compared to today's standards and environments. 
I've in fact often wondered why language/encoding/etc considerations 
aren't a mandatory section in an RFC, like security is. Even when 
dealing with manifestly user-input character data.

> Consequently, we can hardly blame users for not using special 
> characters in their passwords.

Can you really blame the user for anything?

Sampo Syreeni, aka decoy - decoy at iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
