On 2011-06-20, Marsh Ray wrote:

> I once looked up the Unicode algorithm for some basic "case 
> insensitive" string comparison... 40 pages!

Isn't that precisely why e.g. Peter Gutmann once wrote against the 
canonicalization (in the Unicode context, "normalization") that 
ISO-derived crypto protocols do, in favour of the "bytes are bytes" 
approach that PGP/GPG takes?

If you want to do crypto, just do crypto on the bits/bytes. If you 
really have to, you can tag the intended format for forensic purposes 
and sign your intent. But don't meddle with your given bits. 
Canonicalization/normalization is simply too hard to do right, or even 
to analyse, to have much place in protocol design.
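
A rough Python sketch of the point, as an illustration only. It takes two strings that render identically but differ in Unicode normalization form, and authenticates each exactly as given, with a declared format label bound into the authenticated message. The `tag` helper, the demo key, and the format label are hypothetical, and HMAC-SHA256 is only a stand-in for whatever signature scheme a real protocol would use.

```python
import hashlib
import hmac
import unicodedata

# Two strings that render the same but are different code point sequences.
composed = "caf\u00e9"     # 'é' as a single code point (NFC form)
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (NFD form)

composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")


def tag(key: bytes, declared_format: bytes, payload: bytes) -> str:
    """Hypothetical helper: authenticate the payload bytes untouched.

    The declared format is length-prefixed and bound into the message, so
    the intent is covered without ever rewriting the payload itself.
    """
    message = len(declared_format).to_bytes(2, "big") + declared_format + payload
    return hmac.new(key, message, hashlib.sha256).hexdigest()


key = b"demo key, not a real secret"    # assumption: illustration only
fmt = b"text/plain; charset=utf-8"      # assumption: illustrative label

tag_composed = tag(key, fmt, composed_bytes)
tag_decomposed = tag(key, fmt, decomposed_bytes)

# A normalizing protocol would have to treat these two inputs as equal,
# which drags the whole normalization algorithm into the trusted code.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(composed_bytes != decomposed_bytes)  # the byte strings differ
print(tag_composed != tag_decomposed)      # so the tags differ too
print(same_after_nfc)                      # yet NFC considers them the same text
```

Under "bytes are bytes" the verifier only has to recompute the tag over the bytes it received. Whether two byte strings "mean" the same text is left to the application, outside the crypto.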
