[cryptography] Oddity in common bcrypt implementation
James A. Donald
jamesd at echeque.com
Mon Jun 20 17:24:04 EDT 2011
On 2011-06-21 5:09 AM, Marsh Ray wrote:
> There are certainly more bugs lurking where the complex rules of
> international character data "collide" with password hashing. How does a
> password login application work from a UTF-8 terminal (or web page) when
> the host is using a single-byte code page?
C, and the existing C libraries, were created when all the characters
that anyone could ever want fitted in seven bits.
When we ran out of space, each hardware manufacturer and each programmer
implemented his own incompatible solution ad hoc.
The solution, of course, is more bits. The world is now standardizing
on Unicode. Anything that is more than seven bits, and less than
Unicode, is asking for endless compatibility crises.
Eight-bit ASCII is a compatibility bug.
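To make the mismatch concrete, here is a minimal Python sketch. The
sample password is made up, and SHA-256 from the standard library stands
in for bcrypt purely as an assumption to keep the example self-contained.
Any hash that operates on bytes should behave the same way.

```python
import hashlib

# Hypothetical password containing one non-ASCII character (a-umlaut).
password = "p\u00e4ssword"

# The same text as seen by a UTF-8 terminal and by a Latin-1 host.
utf8_bytes = password.encode("utf-8")      # a-umlaut becomes two bytes
latin1_bytes = password.encode("latin-1")  # a-umlaut becomes one byte

# SHA-256 is only a stand-in for bcrypt here (assumption, stdlib only).
utf8_digest = hashlib.sha256(utf8_bytes).hexdigest()
latin1_digest = hashlib.sha256(latin1_bytes).hexdigest()

print(len(utf8_bytes), len(latin1_bytes))  # 9 8
print(utf8_digest == latin1_digest)        # False
```

The user typed the same password both times, but the hash function saw
two different byte strings, so the login fails unless both sides agree
on one encoding before hashing.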
> I once looked up the Unicode algorithm for some basic "case insensitive"
> string comparison... 40 pages!
When one goes truly international, case insensitivity is an AI-hard
problem. Only someone with an intimate knowledge of the culture can
tell you if two text strings are in some sense the same, when they are
not exactly alike.
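A small Python sketch of how quickly the simple approach breaks. The
helper names are hypothetical, and folded_equal is only a simplified
approximation of Unicode caseless matching, not the full algorithm.

```python
import unicodedata

def naive_equal(a: str, b: str) -> bool:
    """Compare by lowercasing, the usual ASCII-era habit."""
    return a.lower() == b.lower()

def folded_equal(a: str, b: str) -> bool:
    """Compare after Unicode case folding plus NFC normalization."""
    def fold(s: str) -> str:
        folded = unicodedata.normalize("NFC", s).casefold()
        return unicodedata.normalize("NFC", folded)
    return fold(a) == fold(b)

# German sharp s: lowercasing leaves it alone, case folding expands it.
print(naive_equal("Stra\u00dfe", "STRASSE"))   # False
print(folded_equal("Stra\u00dfe", "STRASSE"))  # True

# Precomposed versus combining accent: same text, different code points.
print(folded_equal("caf\u00e9", "CAFE\u0301"))  # True
```

Even the folded version is locale-blind. "I".lower() gives "i"
everywhere, while a Turkish reader expects the dotless form, so no
single rule is right for every culture.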
Humans are so good at judging that two things are almost the same, or
very similar, that we tend to overlook small differences, whereas
computers are incapable of noticing the similarity. This is apt to
create insoluble UI issues, for example the difference between
egold.com (all alphabetic) and ego1d.com (letters and numbers). One has
to design around such problems. Don't go there!
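To illustrate how blind the machine is, a minimal Python sketch. The
LOOKALIKES table and both function names are hypothetical and cover
only two digit-for-letter swaps. This is nowhere near a real confusables
list.

```python
# Tiny, hand-picked table of digits that can pass for letters
# (assumption: a real list would be far longer and font-dependent).
LOOKALIKES = str.maketrans({"1": "l", "0": "o"})

def skeleton(name: str) -> str:
    """Reduce a name to a rough 'what the eye sees' form."""
    return name.lower().translate(LOOKALIKES)

def looks_alike(a: str, b: str) -> bool:
    """True when two different strings collapse to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)

print("egold.com" == "ego1d.com")             # False
print(looks_alike("egold.com", "ego1d.com"))  # True
```

The check only catches what is already in the table, which is the
problem: every new font and every new script adds pairs that nobody
listed.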