There appear to be two different ways to convert a string to bytes, as seen in the answers to TypeError: 'str' does not support the buffer interface
Which of these methods is considered better or more Pythonic?
The absolute best way is neither of the two, but the third: the first parameter to encode has defaulted to 'utf-8' ever since Python 3.0. Thus the best way is

b = mystring.encode()

This will also be faster, because the default argument results not in the string "utf-8" in the C code, but in NULL, which is much faster to check!
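For completeness, here is a quick sanity check (assuming the two methods in question are mystring.encode('utf-8') and bytes(mystring, 'utf-8')) showing that all three spellings produce the same bytes object:

# All three spellings produce identical bytes; only the no-argument call
# avoids handling an explicit codec name.
mystring = 'abc'
assert mystring.encode() == mystring.encode('utf-8') == bytes(mystring, 'utf-8')
print(mystring.encode())   # b'abc'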
Here are some timings:
In [1]: %timeit -r 10 'abc'.encode('utf-8')
The slowest run took 38.07 times longer than the fastest.
This could mean that an intermediate result is being cached.
10000000 loops, best of 10: 183 ns per loop
In [2]: %timeit -r 10 'abc'.encode()
The slowest run took 27.34 times longer than the fastest.
This could mean that an intermediate result is being cached.
10000000 loops, best of 10: 137 ns per loop
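If you want to reproduce this outside IPython, a rough equivalent using the standard timeit module looks like the following (absolute numbers will of course vary by machine and Python version):

import timeit

# Same comparison as the IPython cells above, expressed with timeit directly.
print(timeit.timeit("'abc'.encode('utf-8')", number=10_000_000))
print(timeit.timeit("'abc'.encode()", number=10_000_000))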
Despite the warning, the times were very stable across repeated runs; the deviation was only about 2 per cent.
Using encode() without an argument is not Python 2 compatible, as in Python 2 the default character encoding is ASCII:
>>> 'äöä'.encode()
Traceback (most recent call last):
File "", line 1, in
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
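If the code also has to run on Python 2, a minimal sketch of the portable spelling is a unicode literal plus an explicit encoding (assuming a UTF-8 source-coding declaration on Python 2):

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

# Unicode literal + explicit 'utf-8' behaves the same on Python 2 and 3,
# sidestepping Python 2's ASCII default shown in the traceback above.
b = 'äöä'.encode('utf-8')
print(repr(b))   # b'\xc3\xa4\xc3\xb6\xc3\xa4' on Python 3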