Compress and then encrypt, or vice-versa?

Solution 1

If the encryption is done properly, the result is basically random data. Most compression schemes work by finding patterns in your data that can be factored out in some way; after encryption there are none left, so the data is effectively incompressible.

Compress before you encrypt.
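
The effect is easy to check with zlib. The sketch below is only an illustration under stated assumptions: `os.urandom` output stands in for properly encrypted data (no real cipher is run), and the helper name `compressed_ratio` and the sample sentence are made up for this example.

```python
import os
import zlib


def compressed_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Redundant plaintext: the same sentence repeated many times.
plaintext = b"The quick brown fox jumps over the lazy dog. " * 200

# Assumption: random bytes stand in for the output of a good cipher,
# which should be indistinguishable from random data.
ciphertext_like = os.urandom(len(plaintext))

print(f"plaintext ratio:       {compressed_ratio(plaintext):.3f}")
print(f"ciphertext-like ratio: {compressed_ratio(ciphertext_like):.3f}")
```

The plaintext shrinks to a small fraction of its size, while the random stand-in does not shrink at all and typically grows by a few bytes of zlib framing.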

Solution 2

Compress before encryption. Compressed data can vary considerably for small changes in the source data, therefore making it very difficult to perform differential cryptanalysis.

Also, as Mr.Alpha points out, if you encrypt first, the result is very difficult to compress.

Solution 3

Although it depends on the specific use case, I would advise Encrypt-then-Compress. Otherwise an attacker could infer information from the number of encrypted blocks.

Assume a user sends a message to a server, and an attacker can append text to the user's message before it is sent (via JavaScript, for example). The user wants to send some sensitive data to the server, and the attacker wants to obtain it, so he tries appending different guesses to the data the user sends. The user's side then compresses the message together with the attacker's appended text.

Assume DEFLATE compression, whose LZ77 stage replaces repeated data with a pointer to its first appearance. If the attacker's appended text reproduces the whole plaintext, the compression function shrinks the combined message to roughly the size of the original plaintext plus a pointer. After encryption, the attacker can count the cipher blocks and so tell whether his appended data matched the data the user sent.

Even if this case sounds a little contrived, it is a serious security issue in TLS. An attack called CRIME uses this idea to leak cookies from a TLS connection and steal sessions.

source: http://www.ekoparty.org/archive/2012/CRIME_ekoparty2012.pdf
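
The length leak described above can be sketched with zlib alone. This is a simplified illustration, not the actual CRIME attack (which recovers a secret piece by piece): the cookie value, the `observed_length` helper, and both guesses are hypothetical, and the encryption step is left out on the assumption that the cipher preserves length or only pads it to a block boundary.

```python
import zlib

# Hypothetical secret that the user's side puts in every message.
SECRET = b"Cookie: session=7f3a9c21d4e8b605"


def observed_length(attacker_text: bytes) -> int:
    """Size of the compressed message, as seen on the wire.

    Assumption: encryption is omitted because it does not hide length.
    A stream cipher keeps the length exactly; a block cipher only
    rounds it up to a multiple of the block size.
    """
    message = SECRET + b"\r\n" + attacker_text
    return len(zlib.compress(message, 9))


# The attacker appends a guess and watches the resulting size.
right_guess = b"Cookie: session=7f3a9c21d4e8b605"
wrong_guess = b"Cookie: session=Qx!Zw#Kp@Lm$Vn%R"

print("right guess:", observed_length(right_guess))
print("wrong guess:", observed_length(wrong_guess))
```

The matching guess is replaced by a single back-reference to the secret, so its message comes out shorter than the one carrying the wrong guess. That size difference is the signal the attacker looks for.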

Solution 4

My view is that when you compress a message you project it onto a lower dimension, and therefore there are fewer bits. The compressed message (assuming lossless compression) carries the same information in fewer bits (the ones you got rid of were redundant!). So you have more information per bit, and consequently more entropy per bit, but the same total entropy as before the message was compressed. Randomness is another matter, and that is where the patterns in compression can throw a monkey wrench.
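
A rough way to see the "more entropy per bit" point is to estimate entropy from byte frequencies before and after compression. The sketch below is an assumption-laden illustration: the byte-frequency figure is only an order-0 estimate of Shannon entropy, and the helper `entropy_bits_per_byte`, the word list, and the seed are invented for the example.

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Order-0 Shannon entropy estimate from byte frequencies (0 to 8 bits)."""
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumption: pseudo-text built from a small vocabulary stands in for a message.
words = ["compress", "encrypt", "packet", "cipher", "block", "stream",
         "entropy", "pattern", "random", "data", "key", "the", "of", "before"]
rng = random.Random(0)
raw = " ".join(rng.choice(words) for _ in range(2000)).encode("ascii")

packed = zlib.compress(raw)

print(f"raw:    {len(raw):6d} bytes, {entropy_bits_per_byte(raw):.2f} bits/byte")
print(f"packed: {len(packed):6d} bytes, {entropy_bits_per_byte(packed):.2f} bits/byte")

# Lossless: the same message comes back, so no information was lost.
restored = zlib.decompress(packed)
```

The compressed form is shorter and has a higher estimated entropy per byte, while decompression recovers the original message exactly.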

Solution 5

Compression should be done before encryption. A user doesn't want to spend time waiting for the data transfer; he or she needs it done immediately, without wasting any time.

Author by

FJ de Brienne

Updated on September 17, 2022

Comments

  • FJ de Brienne
    FJ de Brienne almost 2 years

    I am writing a VPN system which encrypts (AES256) its traffic across the net (Why write my own when there are 1,000,001 others already out there? Well, mine is a special one for a specific task that none of the others fit).

    Basically I want to run my thinking past you to make sure I'm doing this in the right order.

    At the moment packets are just encrypted before being sent out, but I want to add some level of compression to them to optimize the transfer of data a little. Not heavy compression - I don't want to max out the CPU all the time, but I want to make sure the compression is going to be as efficient as possible.

    So, my thinking is, I should compress the packets before encrypting as an unencrypted packet will compress better than an encrypted one? Or the other way around?

    I will probably be using zlib for the compression.


    • Eric Kittell
      Eric Kittell over 13 years
      Writing as "programming"? Would be better suited for Stack Overflow then.
    • FJ de Brienne
      FJ de Brienne over 13 years
      If I were asking about the programming of it, yes, but I'm not. This is a general compress then encrypt or encrypt then compress question which could apply to just working with plain files if you wanted. The programming side is just context for why I am asking the question.
    • BlueRaja - Danny Pflughoeft
      BlueRaja - Danny Pflughoeft over 13 years
    • FJ de Brienne
      FJ de Brienne over 13 years
      They know about compression there do they?
    • Everett
      Everett over 11 years
      @Majenko - They know about encryption, and most of them would know the answer is compress then encrypt. Of course they'd ask the question why you're using a block cipher instead of a stream cipher and point out that this will come at a price of speed (and that you should reconsider unless you already thought about it), and that maybe an elliptic curve cipher (eprints.usm.my/9413/1/…) would better suit. But I digress.
    • Pacerier
      Pacerier about 9 years
      @JeffFerland, crypto.stackexchange.com
    • Nicolas Roard
      Nicolas Roard about 9 years
      @Pacerier: Crypto.SE didn't exist at the time this question was asked.
  • Olli
    Olli over 13 years
    More important: compression adds entropy. Adding entropy is good for your encryption (harder to break with known-plaintext attacks).
  • GAThrawn
    GAThrawn over 13 years
    Also, encrypting costs resources, and encrypting a smaller file will take fewer resources. So compress before you encrypt.
  • Mitch
    Mitch over 13 years
    Aren't, conceptually, encryption and compression the same thing? Or rather, if encryption is done properly, (and compression is impossible) then you've really ended up compressing the data. (I guess it depends on one's definition of 'properly')
  • Konerak
    Konerak over 13 years
    Well, this is correct, but was posted 2 hours before you posted... Entropy
  • FJ de Brienne
    FJ de Brienne over 13 years
    No. Compression reduces the file size and can be undone by anyone with the decompression program. Encryption changes the content so that it can only be read by someone with the decryption key - the file size may stay the same, or maybe grow or shrink.
  • Martin Beckett
    Martin Beckett over 13 years
    @Olli - not necessarily, if the compression scheme adds known text. In the worst case, imagine it put a known 512-byte header on the front of the data and you were using a block-mode encryption.
  • Olli
    Olli over 13 years
    @Martin: yes, that's true, it's not always a good idea. I assumed "when doing it properly".
  • BlueRaja - Danny Pflughoeft
    BlueRaja - Danny Pflughoeft over 13 years
    I'm not sure why @Olli's comment would get upvoted, as it is incorrect; not only is it significantly less important, for any half-decent encryption it should be not important at all. That is, the strength of the encryption should be completely unrelated to the entropy of the message.
  • Edward Kmett
    Edward Kmett over 13 years
    If you compress at all, it can only really be done before encrypting the message. Bear in mind, though, that this may leak information about the 'compressibility' of the original message, so you'll want to consider whether this side channel has any consequences. Consider a fixed-size file that is either all 0s or a message. The all-0 file will result in a smaller payload under any reasonable compression scheme. Not likely an issue in this particular use case though.
  • peyman khalili
    peyman khalili over 13 years
    @Olli: Compression doesn't add entropy. But it does reduce non-entropy.
  • AbiusX
    AbiusX over 13 years
    Nope, less entropy = more patterns. Randomness has the most entropy.
  • Zan Lynx
    Zan Lynx over 13 years
    But it is information entropy so it is all about meaning. Randomness doesn't mean anything so it doesn't apply. An English sentence can have letters changed and still mean the same thing so it has low entropy. A compressed English sentence might be unreadable if a single bit changes so it has the most. Or so I think.
  • AbiusX
    AbiusX over 13 years
    Entropy is not about sense or the ability to read or understand; it's all about patterns. Compressed files are full of patterns.
  • Zan Lynx
    Zan Lynx over 13 years
    @AbiusX: Right. Patterns. And the fewer patterns, the more entropy. Which means that compression which replaces all repeated patterns with a single copy increases entropy.
  • AbiusX
    AbiusX over 13 years
    No, it's not about quantity. Lots of patterns is not good. Quantity increases entropy. It's all about quality.
  • Pacerier
    Pacerier about 9 years
    @Olli, your orange comment there is going to mislead a lot of people. It's better to delete it.
  • fixer1234
    fixer1234 about 9 years
    This appears to be a repeat of the accepted and second answers. Each answer should contribute a substantively new solution to the question.
  • galaxis
    galaxis over 6 years
    @Olli, replace "entropy" with "obfuscation" and you may have something :).