[cryptography] ZFS dedup? hashes (Re: [zfs] SHA-3 winner announced)

David McGrew (mcgrew) mcgrew at cisco.com
Thu Oct 4 08:19:55 EDT 2012



On 10/3/12 4:29 PM, "Eugen Leitl" <eugen at leitl.org> wrote:

>----- Forwarded message from Sašo Kiselkov <skiselkov.ml at gmail.com> -----
>
>From: Sašo Kiselkov <skiselkov.ml at gmail.com>
>Date: Wed, 03 Oct 2012 22:19:19 +0200
>To: Dr Adam Back <adam at cypherspace.org>
>CC: Eugen Leitl <eugen at leitl.org>, cryptography at randombit.net
>Subject: Re: ZFS dedup? hashes (Re: [cryptography] [zfs] SHA-3 winner
>announced)
>User-Agent: Mozilla/5.0 (X11; Linux i686; rv:7.0.1) Gecko/20110929
>Thunderbird/7.0.1
>
>On 10/03/2012 04:19 PM, Dr Adam Back wrote:
>> I infer from your comments that you are focusing on the ZFS use of a hash
>> for dedup?  (The forward did not include the full context).
>
>Correct, though not complete. ZFS uses the hash also for data integrity.
>
>> A forged
>> collision for dedup can translate into a DoS (deletion) so 2nd pre-image
>> collision resistance would still be important.
>> 
>> However 2nd pre-image collision resistance is typically offered at higher
>> assurance than chosen pairs of collisions (because you can use birthday
>> effect to roughly square root the search space with pairs).  So to that
>> extent I agree your security reliance on hash properties is weaker than for
>> integrity protection.  And SHA1 is still secure against 2nd pre-image
>> whereas its collision resistance has been demonstrated being below design
>> strength.
>
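
To make the birthday-effect point above concrete, here is a toy sketch in
Python. It assumes a deliberately tiny 16-bit hash (SHA-256 truncated to two
bytes) so both searches finish quickly; the function names are made up for
illustration and have nothing to do with ZFS. A collision between any two
messages is expected after roughly 2^8 attempts, while a second pre-image
for one fixed message is expected to take on the order of 2^16. The same
generic bounds for a 160-bit hash are about 2^80 and 2^160.

```python
import hashlib
from itertools import count

# Toy parameter (an assumption for illustration): keep only 16 bits of SHA-256.
TRUNCATED_BYTES = 2


def toy_hash(data: bytes) -> int:
    """Return the first 16 bits of SHA-256 as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest()[:TRUNCATED_BYTES], "big")


def collision_search():
    """Find ANY two distinct messages with the same toy hash (birthday search)."""
    seen = {}
    for n in count():
        msg = b"c%d" % n
        digest = toy_hash(msg)
        if digest in seen:
            return seen[digest], msg, n + 1  # two messages, attempts used
        seen[digest] = msg


def second_preimage_search(target: bytes):
    """Find a different message that hashes to the same value as `target`."""
    wanted = toy_hash(target)
    for n in count():
        msg = b"p%d" % n
        if msg != target and toy_hash(msg) == wanted:
            return msg, n + 1  # forged message, attempts used


if __name__ == "__main__":
    a, b, tries = collision_search()
    print("collision after", tries, "attempts")
    forged, tries = second_preimage_search(b"original block")
    print("second pre-image after", tries, "attempts")
```

The attempt counts vary with the inputs chosen, so only the orders of
magnitude are meaningful.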
>Due to the design of ZFS, it's fairly hard to pull off a successful
>collision even if one has a hash function that's widely broken. Also,
>the mitigation to this kind of problem is fairly simple: turn on data
>verification (many dedup deployments already use this). So while this
>clearly doesn't amount to a good analysis, by my estimates, this attack
>is highly improbable.
>
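
The verification mitigation mentioned above can be sketched as follows. This
is a toy model, not ZFS code: the DedupStore class, the one-byte
weak_checksum and the find_collision helper are all hypothetical names, and
the checksum is weakened on purpose so that a collision is trivial to
produce. With verification off, a colliding block is silently replaced by
the block already stored; with verification on, the bytes are compared and
the new block is stored separately.

```python
import hashlib


def weak_checksum(data: bytes) -> bytes:
    # Deliberately weak (assumption for illustration): one byte of SHA-256.
    return hashlib.sha256(data).digest()[:1]


class DedupStore:
    """Toy dedup table keyed by checksum, with optional byte-for-byte verify."""

    def __init__(self, checksum=weak_checksum, verify=True):
        self.checksum = checksum
        self.verify = verify
        self.table = {}  # checksum -> list of stored blocks sharing it

    def write(self, data: bytes):
        key = self.checksum(data)
        bucket = self.table.setdefault(key, [])
        for index, existing in enumerate(bucket):
            # Without verify, any checksum match is treated as a duplicate.
            if not self.verify or existing == data:
                return key, index
        bucket.append(data)
        return key, len(bucket) - 1

    def read(self, ref) -> bytes:
        key, index = ref
        return self.table[key][index]


def find_collision(checksum):
    """Return two distinct blocks with the same checksum."""
    seen = {}
    i = 0
    while True:
        block = b"block-%d" % i
        key = checksum(block)
        if key in seen:
            return seen[key], block
        seen[key] = block
        i += 1


if __name__ == "__main__":
    first, second = find_collision(weak_checksum)
    for verify in (False, True):
        store = DedupStore(verify=verify)
        store.write(first)
        ref = store.write(second)
        print("verify =", verify, "-> data intact:", store.read(ref) == second)
```

The cost of the safe variant is an extra read and compare on every checksum
match, which is the trade-off a deployment accepts when it turns
verification on.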
>> Incidentally a somewhat related problem with dedup (probably more in cloud
>> storage than local dedup of storage) is that the dedup function itself can
>> lead to the "confirmation" or even "decryption" of documents with
>> sufficiently low entropy as the attacker can induce you to "store" or
>> directly query the dedup service looking for all possible documents.  eg say
>> a form letter where the only blanks to fill in are the name (known
>> suspected) and a figure (<1,000,000 possible values).
>
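
The form-letter example above can be illustrated with a short sketch. It
assumes the attacker has somehow observed an unkeyed dedup token (here plain
SHA-256 of the whole document) and knows the template and the suspected
name; the template text and the function names are invented for
illustration. Enumerating fewer than 1,000,000 candidate figures is then
enough to confirm the document.

```python
import hashlib

# Hypothetical form letter; only the name and the figure vary.
TEMPLATE = "Dear {name}, your settlement amount is ${amount}."


def dedup_token(document: bytes) -> bytes:
    """Unkeyed dedup token: anyone can recompute it for a guessed document."""
    return hashlib.sha256(document).digest()


def recover_amount(observed_token: bytes, name: str, max_amount: int = 1_000_000):
    """Try every candidate figure and return the one matching the token."""
    for amount in range(max_amount):
        candidate = TEMPLATE.format(name=name, amount=amount).encode()
        if dedup_token(candidate) == observed_token:
            return amount
    return None  # the guess about the name or template was wrong


if __name__ == "__main__":
    secret = TEMPLATE.format(name="Alice", amount=4242).encode()
    token = dedup_token(secret)
    print(recover_amount(token, "Alice"))
```

Whether an attacker can actually observe such a token or probe the dedup
table depends on the storage system, which is the objection raised in the
reply that follows.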
>This would require the user to be able to inject specific blocks into
>the dedup machinery and to have intimate knowledge of and access to the
>storage system anyway.
>
>> Also if there is encryption there are privacy and security leaks arising
>> from doing dedup based on plaintext.
>> 
>> And if you are doing dedup on ciphertext (or the data is not encrypted), you
>> could follow David's suggestion of HMAC-SHA1 or the various AES-MACs.  In
>> fact I would suggest for encrypted data, you really NEED to base dedup on
>> MACs and NOT hashes or you leak and risk bruteforce "decryption" of
>> plaintext by hash brute-forcing the non-encrypted dedup tokens.
>
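
A minimal sketch of the difference between hash-based and MAC-based dedup
tokens, using Python's standard hmac and hashlib modules. The function names
and the key handling are assumptions for illustration and do not describe
any ZFS implementation. With a keyed token, identical blocks under the same
key still deduplicate, but someone without the key cannot recompute the
token for a guessed plaintext.

```python
import hashlib
import hmac
import secrets


def plain_token(block: bytes) -> bytes:
    """Unkeyed token: reproducible by anyone who can guess the block."""
    return hashlib.sha256(block).digest()


def keyed_token(key: bytes, block: bytes) -> bytes:
    """Keyed token (HMAC-SHA256): reproducible only with the secret key."""
    return hmac.new(key, block, hashlib.sha256).digest()


if __name__ == "__main__":
    pool_key = secrets.token_bytes(32)      # hypothetical per-pool secret
    attacker_key = secrets.token_bytes(32)  # attacker has to guess a key
    block = b"low-entropy plaintext block"

    # Dedup still works: same key and same block give the same token.
    print(hmac.compare_digest(keyed_token(pool_key, block),
                              keyed_token(pool_key, block)))
    # Brute-forcing guesses no longer works without the key.
    print(hmac.compare_digest(keyed_token(pool_key, block),
                              keyed_token(attacker_key, block)))
    # The unkeyed token, by contrast, is the same for everyone.
    print(plain_token(block) == hashlib.sha256(block).digest())
```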
>As noted before, ZFS (in Illumos) currently doesn't support encryption.
>Oracle's implementation does and once you enable encryption, it
>automatically switches to HMAC-SHA256.

It would be redundant to use HMAC-SHA256 in conjunction with authenticated
encryption modes like those listed on the Oracle webpage that I mentioned
(AES-GCM and AES-CCM). Perhaps what you meant to say is that, when those
modes are used, SHA256 is used as the ZFS data-integrity checksum? Or is it
the case that the data-integrity checksum can use a keyed message
authentication code?

>If we get around to implementing
>encryption in Illumos, we would most likely go the same route. Thanks
>for your insights, though, they are certainly valuable.

Is there any public specification for how cryptography is used in either
the Sun/Oracle version or the Illumos version of ZFS?

Thanks and regards,

David

>
>Cheers,
>--
>Saso
>
>----- End forwarded message -----
>-- 
>Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
>______________________________________________________________
>ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
>8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
>_______________________________________________
>cryptography mailing list
>cryptography at randombit.net
>http://lists.randombit.net/mailman/listinfo/cryptography