Convert bytes to a string

野性不改 2020-11-21 04:45

I'm using this code to get standard output from an external program:

>>> from subprocess import *
>>> command_stdout = Popen(['ls', '-l

19 answers
  • 2020-11-21 04:52

    I think this way is easy, as long as the data is a list of integer byte values in the ASCII range:

    >>> bytes_data = [112, 52, 52]
    >>> "".join(map(chr, bytes_data))
    'p44'
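Note that `chr` maps each integer to the code point with the same number, which matches a Latin-1 decode rather than a UTF-8 one. A minimal sketch of the difference follows; `bytes_data` is the list from above, while `accented` is a made-up sample added only for illustration.

```python
# `bytes_data` comes from the answer above; `accented` is a hypothetical sample.
bytes_data = [112, 52, 52]
accented = [99, 97, 102, 195, 169]  # the UTF-8 bytes of "caf" + e-acute

# chr() per value treats every byte as its own code point (same as Latin-1).
via_chr = "".join(map(chr, accented))

# bytes(...) builds a real bytes object, which can then be decoded as UTF-8.
via_decode = bytes(accented).decode("utf-8")
ascii_text = bytes(bytes_data).decode("utf-8")

print(repr(via_chr))     # two garbled characters where the accent should be
print(repr(via_decode))  # the accented word, decoded correctly
print(ascii_text)        # p44
```

For pure ASCII input both approaches agree; they diverge as soon as a multi-byte character appears.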
    
  • 2020-11-21 04:54

    You need to decode the bytes object to produce a string:

    >>> b"abcde"
    b'abcde'
    
    # utf-8 is used here because it is a very common encoding, but you
    # need to use the encoding your data is actually in.
    >>> b"abcde".decode("utf-8") 
    'abcde'
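To illustrate why the encoding has to match the data, here is a small sketch that decodes the same bytes three ways. The sample word and the variable names are assumptions chosen for the example, not part of the original answer.

```python
# Hypothetical sample: "naive" with a diaeresis on the i, encoded as UTF-8.
data = "na\u00efve".encode("utf-8")

correct = data.decode("utf-8")    # the right codec recovers the text
wrong = data.decode("latin-1")    # no error is raised, but the text is garbled

# A codec that cannot represent the bytes raises UnicodeDecodeError.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# errors="replace" substitutes U+FFFD for each undecodable byte instead.
lossy = data.decode("ascii", errors="replace")

print(repr(correct), repr(wrong), ascii_failed, repr(lossy))
```

A wrong single-byte codec such as Latin-1 fails silently, which is usually harder to notice than an exception.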
    
  • 2020-11-21 04:54

    I think you actually want this:

    >>> from subprocess import Popen, PIPE
    >>> command_stdout = Popen(['ls', '-l'], stdout=PIPE).communicate()[0]
    >>> command_text = command_stdout.decode(encoding='windows-1252')
    

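    Aaron's answer was correct, except that you need to know which encoding to use. I believe Windows often uses 'windows-1252' (the ANSI code page on Western European systems), though the code page actually in effect depends on the machine's locale and on the program producing the output. It only matters if the content contains non-ASCII characters, but then it makes a real difference.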

    By the way, the fact that it does matter is the reason that Python moved to using two different types for binary and text data: it can't convert magically between them, because it doesn't know the encoding unless you tell it! The only way YOU would know is to read the Windows documentation (or read it here).
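As a rough sketch of the same idea on Python 3.6 or later, `subprocess.run` can either return raw bytes for manual decoding or decode for you when given `encoding=`. The child command here is a stand-in that prints known UTF-8 bytes (so the example runs on any platform), not the `ls -l` call from the question.

```python
import locale
import subprocess
import sys

# Stand-in child process: writes fixed UTF-8 bytes to stdout.
cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'caf\\xc3\\xa9\\n')",
]

# Option 1: capture bytes and decode them explicitly.
raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
text_manual = raw.decode("utf-8")

# Option 2: let subprocess decode the output with the given encoding.
text_auto = subprocess.run(
    cmd, stdout=subprocess.PIPE, check=True, encoding="utf-8"
).stdout

# The locale's preferred encoding is a common guess for real commands,
# but it is only a guess: the program may emit something else.
print(locale.getpreferredencoding(False))
print(repr(text_manual), repr(text_auto))
```

For a real command, replace `"utf-8"` with whatever encoding that program actually writes.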

  • 2020-11-21 04:55

    You need to decode the byte string and turn it into a character (Unicode) string.

    On Python 2

    encoding = 'utf-8'
    'hello'.decode(encoding)
    

    or

    unicode('hello', encoding)
    

    On Python 3

    encoding = 'utf-8'
    b'hello'.decode(encoding)
    

    or

    str(b'hello', encoding)
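One pitfall worth showing on Python 3: calling `str()` on bytes without an encoding does not decode anything, it returns the printable representation. A short sketch, with variable names chosen only for illustration:

```python
data = b"hello"

# With an encoding argument, str() decodes, just like bytes.decode().
decoded = str(data, "utf-8")
same = data.decode("utf-8")

# Without an encoding, str() returns the repr of the bytes object.
not_decoded = str(data)

print(repr(decoded))      # 'hello'
print(repr(not_decoded))  # "b'hello'"
```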
    
  • 2020-11-21 04:55

    In Python 3, bytes.decode() defaults to "utf-8", so you can directly use:

    b'hello'.decode()
    

    which is equivalent to

    b'hello'.decode(encoding="utf-8")
    

    On the other hand, in Python 2, encoding defaults to the default string encoding (normally ASCII). Thus, you should pass the encoding explicitly:

    b'hello'.decode(encoding)
    

    where encoding is the encoding you want.

    Note: support for keyword arguments was added in Python 2.7.
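A quick way to check the Python 3 behavior described above is sketched below; the sample string is an assumption chosen so that it contains a non-ASCII character.

```python
import sys

# Hypothetical sample with one non-ASCII character, encoded as UTF-8.
data = "h\u00e9llo".encode("utf-8")

default_result = data.decode()                   # no argument given
explicit_result = data.decode(encoding="utf-8")  # same call, spelled out

print(default_result == explicit_result)  # True
print(sys.getdefaultencoding())           # utf-8 on Python 3
```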

  • 2020-11-21 05:00

    When working with data from Windows systems (with \r\n line endings), my answer is

    text = data.decode("utf-8").replace("\r\n", "\n")
    

    Why? Try this on Windows with a multiline Input.txt:

    with open("Input.txt", "rb") as f:
        data = f.read()  # raw bytes, \r\n left untouched
    text = data.decode("utf-8")
    with open("Output.txt", "w", encoding="utf-8") as f:
        f.write(text)  # text mode writes each \n as \r\n on Windows
    

    All your line endings will be doubled (to \r\r\n), leading to extra empty lines. When a file is opened in text mode, Python normalizes line endings on read so that strings contain only \n, and on Windows it converts \n back to \r\n on write. Data read in binary mode skips the read-side normalization, so the \r survives decoding and the write step turns \r\n into \r\r\n. Thus,

    with open("Input.txt", "rb") as f:
        data = f.read()
    text = data.decode("utf-8").replace("\r\n", "\n")  # normalize by hand
    with open("Output.txt", "w", encoding="utf-8") as f:
        f.write(text)
    

    will replicate your original file.
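Two alternatives to the manual replace can be sketched as follows: wrapping the bytes in `io.TextIOWrapper` so Python applies its usual newline translation while decoding, or keeping the \r\n in the string and writing with `newline=""` so nothing is translated on output. The sample bytes and the temporary output path are assumptions made for the example.

```python
import io
import os
import tempfile

# Hypothetical sample: two lines with Windows line endings.
data = b"first\r\nsecond\r\n"

# Option 1: decode, then normalize manually (the approach above).
manual = data.decode("utf-8").replace("\r\n", "\n")

# Option 2: TextIOWrapper decodes and translates \r\n to \n on read.
wrapped = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8").read()

# Option 3: keep \r\n in the string and disable translation on write.
path = os.path.join(tempfile.mkdtemp(), "Output.txt")
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write(data.decode("utf-8"))
with open(path, "rb") as f:
    round_trip = f.read()

print(repr(manual), repr(wrapped), round_trip == data)
```

Option 3 reproduces the input byte for byte on any platform, since `newline=""` turns off the \n to \r\n conversion.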
