Trouble with encode + encrypt + pad using same code for python2 and python3

渐次进展 2021-01-19 04:07

Disclaimer: I understand that the following is not suited to give "security" in a production environment. It is simply meant as "a little bit

4 answers
  •  醉梦人生
    2021-01-19 04:37

    Generally, code that handles binary data properly in both Python 2 and Python 3 can get a little messy. As you discovered, when you iterate over a bytes string in Python 3 you get integers, not characters.

    Thus in Python 2, this code

    print([i for i in b'ABCDE'])
    print([ord(c) for c in 'ABCDE'])
    

    outputs

    ['A', 'B', 'C', 'D', 'E']
    [65, 66, 67, 68, 69]
    

    whereas in Python 3 it outputs

    [65, 66, 67, 68, 69]
    [65, 66, 67, 68, 69]
    

    The clean way to handle this is to simply write separate code for the two versions. But it is possible to write code that works on both versions.
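    One way to get identical behaviour on both versions, as a minimal sketch: wrap the data in a bytearray, which yields integers when iterated on Python 2 and Python 3 alike, and use a one-element slice when you need the value of a single byte. The helper names below are illustrative only.

```python
from __future__ import print_function


def byte_values(data):
    """Return the bytes of `data` as a list of ints on Python 2 and 3."""
    # A bytearray iterates as integers on both versions.
    return list(bytearray(data))


def last_byte_value(data):
    """Return the final byte of `data` as an int on Python 2 and 3."""
    # data[-1] is a str on Python 2 but an int on Python 3; a slice gives
    # a length-1 bytes object, which ord() accepts on both versions.
    return ord(data[-1:])


print(byte_values(b'ABCDE'))      # [65, 66, 67, 68, 69]
print(last_byte_value(b'ABCDE'))  # 69
```

    The slice trick is what makes reading the final padding byte portable without a version check.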

    Here's a modified version of the code you posted in the question. A CBC-mode cipher object is stateful (each call chains from the previous ciphertext block), so this version creates a new AES cipher object every time you encrypt or decrypt.
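    To see why a reused CBC object gives surprising results, here is a toy illustration. ToyCBC is a hypothetical stand-in that uses XOR in place of the AES block function, so it is not secure and is not the Crypto.Cipher API; it only mimics the chaining state a CBC cipher object carries between calls.

```python
from __future__ import print_function


class ToyCBC(object):
    """Toy CBC chaining with an XOR 'block cipher'. Illustration only, NOT secure."""

    def __init__(self, key, iv):
        self.key = bytearray(key)
        self.prev = bytearray(iv)  # chaining state, updated by every encrypt() call

    def encrypt(self, block):
        # CBC step: mix the plaintext with the previous ciphertext block (or the IV).
        mixed = bytearray(b ^ p for b, p in zip(bytearray(block), self.prev))
        # Stand-in for the real block cipher: XOR with the key.
        out = bytearray(m ^ k for m, k in zip(mixed, self.key))
        self.prev = out
        return bytes(out)


key = b'This is a key123'
iv = b'This is an IV456'
block = b'sixteen byte msg'  # exactly one 16-byte block

shared = ToyCBC(key, iv)
first = shared.encrypt(block)
second = shared.encrypt(block)           # chains from `first`, not from the IV
fresh = ToyCBC(key, iv).encrypt(block)   # new object, so it starts from the IV again

print(first == second)  # False
print(first == fresh)   # True
```

    A message encrypted second on a shared object can only be decrypted by something that has seen the first ciphertext, which is why a fresh object per message keeps scramble and unscramble independent of call order.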

    from __future__ import print_function
    from Crypto.Cipher import AES
    import base64
    
    BS = 16
    
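    def pad(s):
        # Encode first so the padding is computed from the byte length,
        # which differs from the character count for non-ASCII text.
        data = s.encode('utf-8')
        padsize = BS - len(data) % BS
        return data + padsize * chr(padsize).encode('ascii')
    
    def unpad(s):
        # Slice so ord() gets a length-1 bytes object on Python 2 and 3,
        # then decode only the unpadded payload.
        offset = ord(s[-1:])
        return s[:-offset].decode('utf-8')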
    
    def scramble(data, key, iv):
        crypto = AES.new(key, AES.MODE_CBC, iv)
        raw = crypto.encrypt(pad(data))
        return base64.b64encode(raw)
    
    def unscramble(data, key, iv):
        crypto = AES.new(key, AES.MODE_CBC, iv)
        raw = crypto.decrypt(base64.b64decode(data))
        return unpad(raw)
    
    key = b'This is a key123'
    iv = b'This is an IV456'
    
    incoming = "abc def ghi jkl mno"
    print("in: {0!r}".format(incoming))
    
    scrambled1 = scramble(incoming, key, iv)
    print("scrambled: {0!r}".format(scrambled1))
    
    incoming = "pqr stu vwx yz0 123"
    print("in: {0!r}".format(incoming))
    
    scrambled2 = scramble(incoming, key, iv)
    print("scrambled: {0!r}".format(scrambled2))
    
    andback = unscramble(scrambled2, key, iv)
    print("reversed : {0!r}".format(andback))
    
    andback = unscramble(scrambled1, key, iv)
    print("reversed : {0!r}".format(andback))
    

    Python 3 output

    in: 'abc def ghi jkl mno'
    scrambled: b'C2jA5/WngDo55J7TG3uiArEO7hhyTPld/A3v52t+ANc='
    in: 'pqr stu vwx yz0 123'
    scrambled: b'FsFAKA2SbhCTimURy0W8+tM4iqLhNlK3OZrRuuYpMpY='
    reversed : 'pqr stu vwx yz0 123'
    reversed : 'abc def ghi jkl mno'
    

    In Python 2, the reversed output looks like

    reversed : u'pqr stu vwx yz0 123'
    reversed : u'abc def ghi jkl mno'
    

    because unpad decodes the bytes to a unicode object, and Python 2 shows those with a u prefix in their repr.
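    Padding is ultimately a byte-level operation, so it can also be written to take and return bytes only, leaving the encode and decode steps to the caller. A minimal sketch, assuming PKCS#7-style padding and a 16-byte block; the names pad_bytes and unpad_bytes are illustrative and not part of the code above. Unlike the simple version, it checks the padding before stripping it.

```python
from __future__ import print_function

BS = 16  # AES block size in bytes


def pad_bytes(data):
    """Append PKCS#7-style padding to a bytes object."""
    padsize = BS - len(data) % BS
    # bytearray([n]) is a single byte with value n on Python 2 and 3.
    return data + bytes(bytearray([padsize]) * padsize)


def unpad_bytes(data):
    """Strip padding added by pad_bytes, checking that it is well formed."""
    if not data or len(data) % BS:
        raise ValueError("length is not a positive multiple of the block size")
    padsize = ord(data[-1:])
    if not 1 <= padsize <= BS:
        raise ValueError("bad padding length")
    if bytearray(data[-padsize:]) != bytearray([padsize]) * padsize:
        raise ValueError("bad padding bytes")
    return data[:-padsize]


text = u'na\u00efve caf\u00e9'              # 10 characters, 12 bytes in UTF-8
padded = pad_bytes(text.encode('utf-8'))
print(len(padded))                              # 16
print(unpad_bytes(padded).decode('utf-8') == text)  # True
```

    Working on bytes means the padded length is always a multiple of the block size, even when the text contains non-ASCII characters.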


    I turned the pad and unpad functions into proper def functions. That makes them a little easier to read. Also, it's generally considered bad style to use lambda for named functions: lambda is supposed to be used for anonymous functions.
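    One practical difference, shown with placeholder functions (pad_lambda and pad_def are made-up names with dummy logic): a def function carries its own name, which shows up in tracebacks and repr, while a lambda bound to a name still reports itself as <lambda>.

```python
from __future__ import print_function

# A lambda assigned to a name: works, but the function object stays anonymous.
pad_lambda = lambda s: s + b'\x00'


def pad_def(s):
    """Append a single zero byte (placeholder logic for illustration)."""
    return s + b'\x00'


print(pad_lambda.__name__)  # <lambda>
print(pad_def.__name__)     # pad_def
```

    A def also gives you room for a docstring and comments, which a one-line lambda does not.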
