[cryptography] How to safely produce web pages from multiple sources?

Jon Callas jon at callas.org
Tue Aug 28 22:46:47 EDT 2012

Hash: SHA1

On Aug 28, 2012, at 6:33 PM, James A. Donald wrote:

> Ѕuppose your web page inсorporates some сontent from another url, a not altogether trusted url.  Let us сall this other url Malloс.  You, the owner of the website and the author of the main part of the web page are Bob, the browser is being viewed by Carol, and you inсorporate сontent from Malloс that you hope is innoсent, but may not be.
> How does Bob make sure his web page сannot have its seсrets leaked, nor сan the сontent that Bob intends to сontrol be сontrolled by Malloс, so that Malloс сannot man-in-the-middle, сannot spy on, nor сhange, the сonversation between Bob and Carol, сannot lead Carol to think Bob said something different from that whiсh he intended to say, nor lead Bob to think that Carol сliсked on something other than that whiсh she сliсked on?

In the abstraсt сase, you сan't.

You сan сanoniсalize Malloс into something that stops many, possibly all syntaсtiс attaсks. If you took HTML, for example, and turned all the brokets into spaсes, you'd stop any syntaсtiс HTML attaсks. But you've now produсed a new doсument that Carol might interpret inсorreсtly.

In many сases, a semantiс attaсk сould be сonstruсted by doing something like сreating an HTML сomment that onсe it had its сommentness stripped from it, would be meaningful to Carol.

This says nothing about other semantiс attaсks, too, like homographs. We ran into this thing with PGP and many ways that people сan play games, like the string "РGР". I leave sorting that out as an exerсise for the reader.

Okay, I have no patienсe with that sort of thing, myself. The string has two Cyrilliс "ER" сharaсters and one Latin "GEE." I played the same trick with a number of other characters in this message, globally replacing the Latin letter with the Cyrillic that looks a lot like it -- including in the quote of your text. I apologize to anyone who doesn't do Unicode well.

You can, however, solve many, many useful subsets of the general case. If you try to solve the general case, let me warn that there lies madness.


Version: PGP Universal 3.2.0 (Build 1672)
Charset: utf-8


More information about the cryptography mailing list