How to decrypt OpenSSL AES-encrypted files in Python?

101,271

Solution 1

Given the popularity of Python, at first I was disappointed that there was no complete answer to this question to be found. It took me a fair amount of reading different answers on this board, as well as other resources, to get it right. I thought I might share the result for future reference and perhaps review; I'm by no means a cryptography expert! However, the code below appears to work seamlessly:

from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = ''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = ord(chunk[-1])
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)

Usage:

with open(in_filename, 'rb') as in_file, open(out_filename, 'wb') as out_file:
    decrypt(in_file, out_file, password)

If you see a chance to improve on this or extend it to be more flexible (e.g. make it work without salt, or provide Python 3 compatibility), please feel free to do so.

Notice

This answer used to also concern encryption in Python using the same scheme. I have since removed that part to discourage anyone from using it. Do NOT encrypt any more data in this way, because it is NOT secure by today's standards. You should ONLY use decryption, for no other reasons than BACKWARD COMPATIBILITY, i.e. when you have no other choice. Want to encrypt? Use NaCl/libsodium if you possibly can.

Solution 2

I am re-posting your code with a couple of corrections (I didn't want to obscure your version). While your code works, it does not detect some errors around padding. In particular, if the decryption key provided is incorrect, your padding logic may do something odd. If you agree with my change, you may update your solution.

from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = ''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

# This encryption mode is no longer secure by today's standards.
# See note in original question above.
def obsolete_encrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = Random.new().read(bs - len('Salted__'))
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    out_file.write('Salted__' + salt)
    finished = False
    while not finished:
        chunk = in_file.read(1024 * bs)
        if len(chunk) == 0 or len(chunk) % bs != 0:
            padding_length = bs - (len(chunk) % bs)
            chunk += padding_length * chr(padding_length)
            finished = True
        out_file.write(cipher.encrypt(chunk))

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = ord(chunk[-1])
            if padding_length < 1 or padding_length > bs:
               raise ValueError("bad decrypt pad (%d)" % padding_length)
            # all the pad-bytes must be the same
            if chunk[-padding_length:] != (padding_length * chr(padding_length)):
               # this is similar to the bad decrypt:evp_enc.c from openssl program
               raise ValueError("bad decrypt")
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)

Solution 3

The code below should be Python 3 compatible with the small changes documented in the code. Also wanted to use os.urandom instead of Crypto.Random. 'Salted__' is replaced with salt_header that can be tailored or left empty if needed.

from os import urandom
from hashlib import md5

from Crypto.Cipher import AES

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = b''  # changed '' to b''
    while len(d) < key_length + iv_length:
        # changed password to str.encode(password)
        d_i = md5(d_i + str.encode(password) + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def encrypt(in_file, out_file, password, salt_header='', key_length=32):
    # added salt_header=''
    bs = AES.block_size
    # replaced Crypt.Random with os.urandom
    salt = urandom(bs - len(salt_header))
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    # changed 'Salted__' to str.encode(salt_header)
    out_file.write(str.encode(salt_header) + salt)
    finished = False
    while not finished:
        chunk = in_file.read(1024 * bs) 
        if len(chunk) == 0 or len(chunk) % bs != 0:
            padding_length = (bs - len(chunk) % bs) or bs
            # changed right side to str.encode(...)
            chunk += str.encode(
                padding_length * chr(padding_length))
            finished = True
        out_file.write(cipher.encrypt(chunk))

def decrypt(in_file, out_file, password, salt_header='', key_length=32):
    # added salt_header=''
    bs = AES.block_size
    # changed 'Salted__' to salt_header
    salt = in_file.read(bs)[len(salt_header):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(
            in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = chunk[-1]  # removed ord(...) as unnecessary
            chunk = chunk[:-padding_length]
            finished = True 
        out_file.write(bytes(x for x in chunk))  # changed chunk to bytes(...)

Solution 4

This answer is based on openssl v1.1.1, which supports a stronger key derivation process for AES encryption, than that of previous versions of openssl.

This answer is based on the following command:

echo -n 'Hello World!' | openssl aes-256-cbc -e -a -salt -pbkdf2 -iter 10000 

This command encrypts the plaintext 'Hello World!' using aes-256-cbc. The key is derived using pbkdf2 from the password and a random salt, with 10,000 iterations of sha256 hashing. When prompted for the password, I entered the password, 'p4$$w0rd'. The ciphertext output produced by the command was:

U2FsdGVkX1/Kf8Yo6JjBh+qELWhirAXr78+bbPQjlxE=

The process for decrypting of the ciphertext above produced by openssl is as follows:

  1. base64-decode the output from openssl, and utf-8 decode the password, so that we have the underlying bytes for both of these.
  2. The salt is bytes 8-15 of the base64-decoded openssl output.
  3. Derive a 48-byte key using pbkdf2 given the password bytes and salt with 10,000 iterations of sha256 hashing.
  4. The key is bytes 0-31 of the derived key, the iv is bytes 32-47 of the derived key.
  5. The ciphertext is bytes 16 through the end of the base64-decoded openssl output.
  6. Decrypt the ciphertext using aes-256-cbc, given the key, iv, and ciphertext.
  7. Remove PKCS#7 padding from plaintext. The last byte of plaintext indicates the number of padding bytes appended to the end of the plaintext. This is the number of bytes to be removed.

Below is a python3 implementation of the above process:

import binascii
import base64
import hashlib
from Crypto.Cipher import AES       #requires pycrypto

#inputs
openssloutputb64='U2FsdGVkX1/Kf8Yo6JjBh+qELWhirAXr78+bbPQjlxE='
password='p4$$w0rd'
pbkdf2iterations=10000

#convert inputs to bytes
openssloutputbytes=base64.b64decode(openssloutputb64)
passwordbytes=password.encode('utf-8')

#salt is bytes 8 through 15 of openssloutputbytes
salt=openssloutputbytes[8:16]

#derive a 48-byte key using pbkdf2 given the password and salt with 10,000 iterations of sha256 hashing
derivedkey=hashlib.pbkdf2_hmac('sha256', passwordbytes, salt, pbkdf2iterations, 48)

#key is bytes 0-31 of derivedkey, iv is bytes 32-47 of derivedkey 
key=derivedkey[0:32]
iv=derivedkey[32:48]

#ciphertext is bytes 16-end of openssloutputbytes
ciphertext=openssloutputbytes[16:]

#decrypt ciphertext using aes-cbc, given key, iv, and ciphertext
decryptor=AES.new(key, AES.MODE_CBC, iv)
plaintext=decryptor.decrypt(ciphertext)

#remove PKCS#7 padding. 
#Last byte of plaintext indicates the number of padding bytes appended to end of plaintext.  This is the number of bytes to be removed.
plaintext = plaintext[:-plaintext[-1]]

#output results
print('openssloutputb64:', openssloutputb64)
print('password:', password)
print('salt:', salt.hex())
print('key: ', key.hex())
print('iv: ', iv.hex())
print('ciphertext: ', ciphertext.hex())
print('plaintext: ', plaintext.decode('utf-8'))

As expected, the above python3 script produces the following:

openssloutputb64: U2FsdGVkX1/Kf8Yo6JjBh+qELWhirAXr78+bbPQjlxE=
password: p4$$w0rd
salt: ca7fc628e898c187
key:  444ab886d5721fc87e58f86f3e7734659007bea7fbe790541d9e73c481d9d983
iv:  7f4597a18096715d7f9830f0125be8fd
ciphertext:  ea842d6862ac05ebefcf9b6cf4239711
plaintext:  Hello World!

Note: An equivalent/compatible implementation in javascript (using the web crypto api) can be found at https://github.com/meixler/web-browser-based-file-encryption-decryption.

Share:
101,271
Thijs van Dien
Author by

Thijs van Dien

Updated on July 09, 2022

Comments

  • Thijs van Dien
    Thijs van Dien almost 2 years

    OpenSSL provides a popular (but insecure – see below!) command line interface for AES encryption:

    openssl aes-256-cbc -salt -in filename -out filename.enc
    

    Python has support for AES in the shape of the PyCrypto package, but it only provides the tools. How to use Python/PyCrypto to decrypt files that have been encrypted using OpenSSL?

    Notice

    This question used to also concern encryption in Python using the same scheme. I have since removed that part to discourage anyone from using it. Do NOT encrypt any more data in this way, because it is NOT secure by today's standards. You should ONLY use decryption, for no other reasons than BACKWARD COMPATIBILITY, i.e. when you have no other choice. Want to encrypt? Use NaCl/libsodium if you possibly can.