[cryptography] [ramble] [tldr] Layered security where encryption is used?

Ben Lincoln F70C92E3 at beneaththewaves.net
Sun Jul 21 16:34:00 EDT 2013


Hi everyone.

This is very possibly a newb question (or series of questions), and if 
so I apologize in advance. I scoured everywhere I could think of for the 
last couple of days trying to find information on this and came up 
empty, but maybe I just didn't know the right terms to search for.

--- Background ---

I was reverse-engineering a system recently and came across an issue 
that I know from experience and training is pretty widespread when 
developers without a strong cryptography background use cryptography 
without thinking things through: although strong cryptography (AES) was 
used, and the key was stored securely, the system itself unintentionally 
provided a means for an attacker to decrypt arbitrary data without ever 
knowing the key. I have a set of recommendations in mind for how to 
avoid this type of vulnerability, but I'd like to sanity-check them with 
people who actually do have a cryptography background.

What I'm hoping to avoid is being the security guy who makes a 
recommendation for improving something, but unintentionally introduces a 
different vulnerability as a result. I have gotten pretty good at 
exploiting common mistakes in software that uses cryptography, but I am 
not a cryptography expert - not even a cryptography adept.

The system in question uses a fairly common mechanism where the state of 
certain non-sensitive variables is maintained on the client by means of 
encrypted data which the client doesn't have the key to decrypt. The 
only reason for the encryption is to prevent the client from tampering 
with the data. This allows multiple different load-balanced nodes on the 
back-end to respond to requests from the same client without having to 
sync their state. Think of ASP.NET's ViewState, except that here the 
variables are broken out into individual components instead of there 
being one giant encrypted blob that contains all of the data.
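
To make that concrete, here's a minimal Python sketch of the sort of 
helpers such a system might use. I'm assuming AES-CBC with PKCS#7 
padding via the pyca/cryptography package, and the names 
encrypt_field() and decrypt_field() are made up for illustration:

import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY = os.urandom(32)  # one key shared by every field - the root problem

def encrypt_field(plaintext):
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(plaintext) + padder.finalize()
    enc = Cipher(algorithms.AES(KEY), modes.CBC(iv)).encryptor()
    return iv + enc.update(padded) + enc.finalize()

def decrypt_field(blob):
    iv, ct = blob[:16], blob[16:]
    dec = Cipher(algorithms.AES(KEY), modes.CBC(iv)).decryptor()
    padded = dec.update(ct) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()

# Each state variable travels as its own encrypted cookie:
cookies = {"customThemeName": encrypt_field(b"Autumn")}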

Like many systems that use this model, this one has a flaw that would 
(probably?) be harmless in the absence of other factors: some of those 
non-sensitive values are displayed back to the user after being 
decrypted. In other words, as I mentioned above, the system 
unintentionally gives users the ability to decrypt arbitrary data, as 
long as that data was encrypted with the same key as the data it 
actually expects.

Unfortunately, there is other - sensitive - data in the system which is 
also encrypted using the same key. It's data that must be stored in a 
reversibly-encrypted format, but which end users should not be able to 
retrieve. For the sake of argument, let's say it's the password for a 
service account that the system uses to execute batch jobs, or a stored 
credit card number used to make purchases by a customer. In both cases, 
the system needs the ability to obtain the original value, but end users 
do not - they just refer to the value abstractly, such as "use this 
service account to execute this task", or "I want to make a purchase 
using the card whose number ends in 1234". I'm using examples here from 
other systems I've looked at in the past, not the current one, so 
please don't get stuck on those two specific cases. Just assume there 
is a requirement that the system be able to decrypt the data, but that 
end users should not be able to retrieve it after it's originally 
entered.

The combination of those two aspects of the system means that if a user 
can obtain the encrypted version of the second type of data, they can 
feed it into their cookie, and the system will happily display to them 
the decrypted value, because it doesn't know any better and because the 
same key is used for both types of data.
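
With the hypothetical helpers from the sketch above, the whole attack 
is a copy-and-paste (the field contents are of course invented):

# Sensitive data encrypted elsewhere in the system - with the same KEY:
stored_password_blob = encrypt_field(b"s3rvice-acc0unt-pw")

# The attacker drops the stolen blob into a field the server echoes back:
cookies["customThemeName"] = stored_password_blob

# The display code has no way to tell the difference:
print(decrypt_field(cookies["customThemeName"]))  # b's3rvice-acc0unt-pw'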

Now, normally users can't actually obtain this sensitive data, even in 
encrypted format - there are OS- and database-level permissions that are 
supposed to prevent that - but over time, people have a tendency to 
forget why certain things were configured the way they were, someone 
makes a configuration change, and people who shouldn't be able to get to 
the encrypted data are suddenly able to.

--- Proposal/Question ---

Of course, one of my main recommendations is going to be "don't use the 
same key for multiple types of data!!", but because my background is in 
systems engineering, one of my interests is building redundant safety 
features into a system design so that any one failure or human error 
won't completely compromise the system.

Part 1 of my proposal is that encrypted values should be wrapped in 
some kind of metadata that identifies their type and delimits where 
the plaintext value starts and ends (to help prevent block-shuffling 
attacks that change the length of the desired plaintext, e.g. if 
someone makes a mistake and uses ECB mode instead of CBC). Some really 
basic examples of the plaintext might be:

<password>12345? That's the same combination as on my luggage!</password>
versus
<customThemeName>Autumn</customThemeName>

...or...

[value&&type::password&&length::52]12345? That's the same combination as 
on my luggage![/value]
versus
[value&&type::customThemeName&&length::6]Autumn[/value]

This is obviously going to involve an increase in storage size. For 
example, using the "Autumn" value and the XML-style wrapper, with a 
block size of 128 bits, the ciphertext balloons from (size of IV + 16 
bytes) to (size of IV + 48 bytes). The benefit I see is that it allows 
the application to check that the type of data it has just decrypted 
is actually the type it expects, to refuse to return other types of 
data to the user, and possibly to generate an alert if it was 
expecting e.g. the name of a custom webpage theme but found a service 
account password instead. There is a whole side-topic here about 
making sure that mechanism isn't itself exploitable, but I will set 
that aside because this email is already long enough.
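
As a rough sketch of part 1, using the bracket-style wrapper from the 
second example above (the exact format, and the names wrap() and 
unwrap(), are just for illustration):

def wrap(value, value_type):
    header = "[value&&type::%s&&length::%d]" % (value_type, len(value))
    return header.encode() + value + b"[/value]"

def unwrap(plaintext, expected_type):
    head, sep, rest = plaintext.partition(b"]")
    if not sep or not head.startswith(b"[value&&"):
        raise ValueError("not a wrapped value")
    fields = dict(f.split(b"::", 1) for f in head[1:].split(b"&&")[1:])
    if fields[b"type"].decode() != expected_type:
        # The "output validation" hook: refuse to display, raise an alert.
        raise ValueError("expected %s, decrypted %s"
                         % (expected_type, fields[b"type"].decode()))
    length = int(fields[b"length"])
    value, trailer = rest[:length], rest[length:]
    if trailer != b"[/value]":
        raise ValueError("length or trailer mismatch")
    # (Escaping delimiters that appear inside values is part of the
    # side-topic I set aside above.)
    return value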

As I said, the application I'm asking about uses strong encryption 
(AES), for which no practical known-plaintext attack is known. 
However, as soon as I thought of the above concept, I realized that if 
such an attack were ever discovered for AES, this scheme would be 
setting the system up for compromise, because all values of a given 
type would have at least a highly predictable first block of plaintext.

So part 2 of my proposal is that the plaintext include a throwaway 
section *before* the actual data of concern, which has a length of one 
block, and is filled with random (or at least pseudo-random) data that 
is uniquely-generated for each encrypted value. As long as CBC mode was 
used, it seems to me that it would be sort of like a second IV (a 
"reinitialization vector", I guess? :)), except that it would never be 
stored outside of the ciphertext, would be immediately discarded upon 
decryption, and never intentionally reused. In other words, while I see 
it as serving a purpose somewhat related to an IV, I also see them as 
being complementary to each other instead of redundant - the IV helps 
ensure that identical plaintext encrypts to different ciphertext, and 
the "RIV" helps guard against future known-plaintext attacks when used 
with CBC encryption mode.

This is probably stating the obvious, but in the case of one of the 
examples above, if the encryption used were AES or another algorithm 
with a block size of 128 bits, the plaintext modified according to both 
parts 1 and 2 of my proposal would look like this:

XXXXXXXXXXXXXXXX[value&&type::password&&length::52]12345? That's the 
same combination as on my luggage![/value]

...where XXXXXXXXXXXXXXXX represents 16 bytes of random/pseudorandom 
values from 0-255. This whole long set of plaintext would then be 
encrypted, appended to the IV, and finally stored.
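
Putting parts 1 and 2 together, again reusing the made-up helpers 
above (protect() and reveal() are invented names too):

BLOCK = 16  # AES block size in bytes

def protect(value, value_type):
    # One throwaway block of fresh random data sits ahead of the real
    # plaintext, so the first ciphertext block never covers
    # predictable bytes.
    return encrypt_field(os.urandom(BLOCK) + wrap(value, value_type))

def reveal(blob, expected_type):
    plaintext = decrypt_field(blob)
    return unwrap(plaintext[BLOCK:], expected_type)  # discard the "RIV"

token = protect(b"Autumn", "customThemeName")
assert reveal(token, "customThemeName") == b"Autumn"
# reveal(token, "password") raises ValueError instead of displaying it.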

To hammer home the storage downside: what was originally an 80-byte 
value (16-byte IV + 64 bytes of ciphertext covering the 52-byte 
password) swells to 128 bytes (16-byte IV + 112 bytes of ciphertext, 
since the 111-byte wrapped plaintext pads out to 112). And because the 
password in this example is unusually long, that understates the cost 
- for typical short values, the scheme roughly doubles or triples the 
size of the stored data - and of course it increases CPU time for 
encryption and decryption.
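
For anyone who wants to check that arithmetic:

password = b"12345? That's the same combination as on my luggage!"
header = b"[value&&type::password&&length::52]"
wrapped = 16 + len(header) + len(password) + len(b"[/value]")
print(wrapped)  # 111 -> PKCS#7 pads to 112; +16-byte IV = 128 bytes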

However, at least superficially, I think it greatly reduces the 
likelihood of sensitive data being obtained by people who shouldn't have 
it, because it provides a means of allowing the application to perform 
"output validation" before displaying values to the user, and (unless 
I'm mistaken) it guards against future known-plaintext attacks on the 
encryption algorithm. In combination with using different encryption 
keys for different types of data (and of course using unique IVs for 
each encrypted value), it seems to me that it makes it much less likely 
for any one mistake to compromise the system.

--- Wrapping up ---

I can definitely see an argument that this is a bunch of 
over-engineering, but the type of flaw I'm describing is ridiculously 
widespread in commercial software. I'd already run into it myself, and 
then when I went to a SANS advanced web pen-testing course there was an 
entire day dedicated to it and related defects.

I feel like I need to be able to make some recommendations to developers 
who aren't cryptography experts that will let them design and build 
systems that have a degree of built-in redundancy so that the failure of 
any one design element related to the encrypted data won't result in a 
complete compromise of that system. I need to be able to come up with a 
simple recipe for that, and it can't be any one silver bullet (like "use 
different keys for different types of data"), because single mechanisms 
will always fail at some point. It also can't be something unrealistic 
like "become an awesome cryptographer before you design any system that 
uses cryptographic algorithms", because I know that's not going to 
happen and I have to account for the reality of the situation. I feel 
like it needs to be 3-5 overlapping design philosophies/patterns that 
are easy to remember, in addition to the ones that are well-known like 
"use existing, well-vetted cryptographic algorithms instead of writing 
your own".

From a cryptography perspective, is this a stupid idea? Are there 
better ways to achieve my goal? Am I introducing any new weaknesses into 
the system? Has any element of this topic been done to death and I just 
didn't know what to search for?

In any case, if anyone got to the end of this rambling email, thank you.

- Ben Lincoln

