How AES in CTR works for Python with PyCrypto?

问题

I am using python 2.7.1 I want to encrypt sth using AES in CTR mode. I installed PyCrypto library for python. I wrote the following code:

secret = os.urandom(16)
crypto = AES.new(os.urandom(32), AES.MODE_CTR, counter=lambda: secret)
encrypted = crypto.encrypt("asdk")
print crypto.decrypt(encrypted)

i have to run crypto.decrypt as many times as the byte size of my plaintext in order to get correctly the decrypted data. I.e:

encrypted = crypto.encrypt("test")
print crypto.decrypt(encrypted)
print crypto.decrypt(encrypted)
print crypto.decrypt(encrypted)
print crypto.decrypt(encrypted)

The last call to decrypt will give me the plaintext back. The other outputs from decrypt are some gibberish strings . I am wondering if this is normal or not? Do i have to include into a loop with size equal of my plaintext every time or i have gotten sth wrong?

回答1:

According to @gertvdijk, AES_CTR is a stream cipher which does not need padding. So I've deleted the related codes.

Here's something I know.

You have to use a same key(the first parameter in AES.new(...)) in encryption and decryption, and keep the key private.
The encryption/decryption methods are stateful, that means crypto.en(de)crypt("abcd")==crypto.en(de)crypt("abcd") is not always true. In your CTR, your counter callback always returns a same thing, so it becomes stateless when encrypt (I am not 100% sure it is the reason), but we still find that it is somewhat stateful in decryption. As a conclusion, we should always use a new object to do them.
The counter callback function in both encryption and decryption should behave the same. In your case, it is to make both of them return the same secret. Yet I don't think the secret is a "secret". You can use a random generated "secret" and pass it across the communicating peers without any encryption so that the other side can directly use it, as long as the secret is not predictable.

So I would write my cipher like this, hope it will offer some help.

import os
import hashlib
import Crypto.Cipher.AES as AES

class Cipher:

        @staticmethod
        def md5sum( raw ):
                m = hashlib.md5()
                m.update(raw)
                return m.hexdigest()

        BS = AES.block_size

        @staticmethod 
        def pad( s ):
                """note that the padding is no necessary"""
                """return s + (Cipher.BS - len(s) % Cipher.BS) * chr(Cipher.BS - len(s) % Cipher.BS)"""
                return s

        @staticmethod
        def unpad( s ):
                """return s[0:-ord(s[-1])]"""
                return s

        def __init__(self, key):
                self.key = Cipher.md5sum(key)
                #the state of the counter callback 
                self.cnter_cb_called = 0 
                self.secret = None

        def _reset_counter_callback_state( self, secret ):
                self.cnter_cb_called = 0
                self.secret = secret

        def _counter_callback( self ):
                """
                this function should be stateful
                """
                self.cnter_cb_called += 1
                return self.secret[self.cnter_cb_called % Cipher.BS] * Cipher.BS


        def encrypt(self, raw):
                secret = os.urandom( Cipher.BS ) #random choose a "secret" which is not secret
                self._reset_counter_callback_state( secret )
                cipher = AES.new( self.key, AES.MODE_CTR, counter = self._counter_callback )
                raw_padded = Cipher.pad( raw )
                enc_padded = cipher.encrypt( raw_padded )
                return secret+enc_padded #yes, it is not secret

        def decrypt(self, enc):
                secret = enc[:Cipher.BS]
                self._reset_counter_callback_state( secret )
                cipher = AES.new( self.key, AES.MODE_CTR, counter = self._counter_callback )
                enc_padded = enc[Cipher.BS:] #we didn't encrypt the secret, so don't decrypt it
                raw_padded = cipher.decrypt( enc_padded )
                return Cipher.unpad( raw_padded )

Some test:

>>> from Cipher import Cipher
>>> x = Cipher("this is key")
>>> "a"==x.decrypt(x.encrypt("a"))
True
>>> "b"==x.decrypt(x.encrypt("b"))
True
>>> "c"==x.decrypt(x.encrypt("c"))
True
>>> x.encrypt("a")==x.encrypt("a")
False #though the input is same, the outputs are different

Reference: http://packages.python.org/pycrypto/Crypto.Cipher.blockalgo-module.html#MODE_CTR

回答2:

I'm going to elaborate on @gertvdijk's explanation of why the cipher behaved the way it did in the original question (my edit was rejected), but also point out that setting up the counter to return a static value is a major flaw and show how to set it up correctly.

Reset the counter for new operations

The reason why this behaves as you described in the question is because your plain text (4 bytes / 32 bits) is four times as small as the size of the key stream blocks that the CTR cipher outputs for encryption (16 bytes/128 bits).

Because you're using the same fixed value over and over instead of an actual counter, the cipher keeps spitting out the same 16 byte blocks of keystream. You can observe this by encrypting 16 null bytes repeatedly:

 >>> crypto.encrypt('\x00'*16)
'?\\-\xdc\x16`\x05p\x0f\xa7\xca\x82\xdbE\x7f/'
>>> crypto.encrypt('\x00'*16)
'?\\-\xdc\x16`\x05p\x0f\xa7\xca\x82\xdbE\x7f/'

You also don't reset the cipher's state before performing decryption, so the 4 bytes of ciphertext are decrypted against the next 4 bytes of XOR key from the first output stream block. This can also be observed by encrypting and decrypting null bytes:

 >>> crypto.encrypt('\x00' * 4)
'?\\-\xdc'
>>> crypto.decrypt('\x00' * 4)
'\x16`\x05p'

If this were to work the way you wanted, the result of both of those operations should be the same. Instead, you can see the first four bytes of the 16 byte block in the first result, and the second four bytes in the second result.

After you've used up the 16 byte block of XOR key by performing four operations on four-byte values (for a 16 byte total), a new block of XOR key is generated. The first four bytes (as well as all the others) of each XOR key block are the same, so when you call decrypt this time, it gives you back the plaintext.

This is really bad! You should not use AES-CTR this way - it's equivalent to simple XOR encryption with a 16 byte repeating key, which can be broken pretty easily.

Solution

You have to reset the state of the cipher before performing an operation on a new stream of data (or another operation on it), as the original instance will no longer be in the correct initial state. Your issue will be solved by instantiating a new crypto object for the decryption, as well as resetting the counter and keystream position.

You also need to use a proper counter function that combines a nonce with a counter value that increases each time a new block of keystream is generated. PyCrypto has a Counter class that can do this for you.

from Crypto.Cipher import AES
from Crypto.Util import Counter
from Crypto import Random

# Set up the counter with a nonce.
# 64 bit nonce + 64 bit counter = 128 bit output
nonce = Random.get_random_bytes(8)
countf = Counter.new(64, nonce) 

key = Random.get_random_bytes(32)  # 256 bits key

# Instantiate a crypto object first for encryption
encrypto = AES.new(key, AES.MODE_CTR, counter=countf)
encrypted = encrypto.encrypt("asdk")

# Reset counter and instantiate a new crypto object for decryption
countf = Counter.new(64, nonce)
decrypto = AES.new(key, AES.MODE_CTR, counter=countf)
print decrypto.decrypt(encrypted) # prints "asdk"

回答3:

Start with a new crypto object for new operations

The reason why this behaves as you described in the question is because your plain text (4 bytes / 32 bits) is four times as small as the size the cryptographic engine works on for your chosen AES mode (128 bits) and also reusing the same instance of the crypto object. Simply don't reuse the same object if you're performing an operation on a new stream of data (or another operation on it). Your issue will be solved by instantiating a new crypto object for the decryption, like this:

# *NEVER* USE A FIXED LIKE COUNTER BELOW IN PRODUCTION CODE. READ THE DOCS.
counter = os.urandom(16)
key = os.urandom(32)  # 256 bits key

# Instantiate a crypto object first for encryption
encrypto = AES.new(key, AES.MODE_CTR, counter=lambda: counter)
encrypted = encrypto.encrypt("asdk")

# Instantiate a new crypto object for decryption
decrypto = AES.new(key, AES.MODE_CTR, counter=lambda: counter)
print decrypto.decrypt(encrypted) # prints "asdk"

Why it is not about padding with AES-CTR

This answer started out as a response on the answer by Marcus, in which he initially indicated the use of padding would solve it. While I understand it looks like symptoms of a padding issue, it certainly is not.

The whole point of AES-CTR is that you do not need padding, as it's a stream cipher (unlike ECB/CBC and so on)! Stream ciphers work on streams of data, rather chunking data in blocks and chaining them in the actual cryptographic computation.

回答4:

In addition to what Marcus says, the Crypto.Util.Counter class can be used to build your counter block function.

来源：https://stackoverflow.com/questions/12691168/how-aes-in-ctr-works-for-python-with-pycrypto

标签

python

encryption

cryptography

aes

pycrypto