Python 2.6 introduced the str.format() method with a slightly different syntax from the existing % operator. Which is better, and for what situations?
% gives better performance than format in my tests.
Test code:
Python 2.7.2:
import timeit
print 'format:', timeit.timeit("'{}{}{}'.format(1, 1.23, 'hello')")
print '%:', timeit.timeit("'%s%s%s' % (1, 1.23, 'hello')")
Result:
> format: 0.470329046249
> %: 0.357107877731
Python 3.5.2:
import timeit
print('format:', timeit.timeit("'{}{}{}'.format(1, 1.23, 'hello')"))
print('%:', timeit.timeit("'%s%s%s' % (1, 1.23, 'hello')"))
Result:
> format: 0.5864730989560485
> %: 0.013593495357781649
It looks like the difference is small in Python 2, whereas in Python 3, % is much faster than format.
Thanks @Chris Cogdon for the sample code.
Edit 1:
Tested again in Python 3.7.2 in July 2019.
Result:
> format: 0.86600608
> %: 0.630180146
There is not much difference. I guess Python is improving gradually.
Edit 2:
After someone mentioned Python 3's f-strings in a comment, I ran the following test under Python 3.7.2:
import timeit
print('format:', timeit.timeit("'{}{}{}'.format(1, 1.23, 'hello')"))
print('%:', timeit.timeit("'%s%s%s' % (1, 1.23, 'hello')"))
print('f-string:', timeit.timeit("f'{1}{1.23}{\"hello\"}'"))
Result:
> format: 0.8331376779999999
> %: 0.6314778750000001
> f-string: 0.766649943
It seems f-strings are still slower than % but faster than format.
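A caveat on the numbers above, offered as a likely explanation rather than something verified on those exact interpreters: every operand in the timed statements is a literal, and some CPython versions can evaluate such a % expression once at compile time, which may be why the 3.5.2 figure for % is so small. A minimal sketch of a fairer comparison binds the values to names in the timeit setup first. The names a, b and c and the repeat/number settings are arbitrary choices for illustration, and the f-string line needs Python 3.6 or later:

```python
import timeit

# Bind the operands to names so none of the timed statements
# consists purely of literals.
setup = "a, b, c = 1, 1.23, 'hello'"

statements = {
    "format": "'{}{}{}'.format(a, b, c)",
    "%": "'%s%s%s' % (a, b, c)",
    "f-string": "f'{a}{b}{c}'",
}

results = {}
for label, stmt in statements.items():
    # Take the best of a few repeats to reduce noise from other processes.
    timings = timeit.repeat(stmt, setup=setup, repeat=3, number=100000)
    results[label] = min(timings)
    print(label + ":", results[label])
```

Absolute timings depend on the machine and interpreter, so only the relative order within one run is meaningful.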
One situation where % may help is when you are formatting regular expressions. For example,
'{type_names} [a-z]{2}'.format(type_names='triangle|square')
raises IndexError, because {2} is parsed as a replacement field for positional argument 2. In this situation, you can use:
'%(type_names)s [a-z]{2}' % {'type_names': 'triangle|square'}
This avoids writing the regex as '{type_names} [a-z]{{2}}'. This can be useful when you have two regexes, where one is used alone without format, but the concatenation of both is formatted.
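As a small sketch of the point above, the following compares the three spellings side by side. The variable names and the sample text passed to re.search are made up for illustration:

```python
import re

type_names = 'triangle|square'

# With str.format(), a bare {2} is read as "positional argument 2".
try:
    '{type_names} [a-z]{2}'.format(type_names=type_names)
    format_error = None
except IndexError as exc:
    format_error = exc

# Doubling the braces escapes them for str.format() ...
pattern_format = '{type_names} [a-z]{{2}}'.format(type_names=type_names)

# ... while % leaves braces alone, so the quantifier stays readable.
pattern_percent = '%(type_names)s [a-z]{2}' % {'type_names': type_names}

print(pattern_format == pattern_percent)
print(re.search(pattern_percent, 'a square ab').group(0))
```

Both working spellings produce the same pattern; the difference is only in how the template has to be written.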
PEP 3101 proposes replacing the % operator with the new, advanced string formatting in Python 3, where it would be the default.
Assuming you're using Python's logging module, you can pass the string formatting arguments as arguments to the .debug() method rather than doing the formatting yourself:
log.debug("some debug info: %s", some_info)
which avoids doing the formatting unless the logger actually logs something.
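To make the deferred formatting visible, here is a minimal sketch. The Expensive class and the logger name "format_demo" are hypothetical helpers invented for this illustration; the class simply counts how often it is converted to a string:

```python
import io
import logging


class Expensive:
    """Hypothetical helper that counts how often it is rendered as a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive value"


stream = io.StringIO()
log = logging.getLogger("format_demo")
log.addHandler(logging.StreamHandler(stream))
log.setLevel(logging.INFO)  # DEBUG messages are disabled
log.propagate = False

lazy = Expensive()
log.debug("some debug info: %s", lazy)    # arguments passed separately
print(lazy.str_calls)                     # never formatted

eager = Expensive()
log.debug("some debug info: %s" % eager)  # formatted before the call
print(eager.str_calls)                    # formatted once, then discarded

shown = Expensive()
log.info("some info: %s", shown)          # enabled level, so it is formatted
print(shown.str_calls)
```

The message is only built when a handler actually emits the record, which matters when converting the arguments to strings is costly.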
I would add that since version 3.6, we can use f-strings like the following:
foo = "john"
bar = "smith"
print(f"My name is {foo} {bar}")
which gives:
My name is john smith
Everything is converted to strings:
mylist = ["foo", "bar"]
print(f"mylist = {mylist}")
Result:
mylist = ['foo', 'bar']
You can call functions inside the braces, as with the other formatting methods:
import time
print(f'Hello, here is the date : {time.strftime("%d/%m/%Y")}')
giving, for example:
Hello, here is the date : 16/04/2018
To answer your first question... .format just seems more sophisticated in many ways. An annoying thing about % is also how it can either take a variable or a tuple. You'd think the following would always work:
"hi there %s" % name
yet, if name happens to be (1, 2, 3), it will throw a TypeError. To guarantee that it always prints, you'd need to do
"hi there %s" % (name,) # supply the single argument as a single-item tuple
which is just ugly. .format doesn't have those issues. Also, in the second example you gave, the .format example is much cleaner looking.
Why would you not use it?
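The tuple pitfall described above can be reproduced with a short sketch. The variable names are illustrative only:

```python
name = (1, 2, 3)

# % unpacks the tuple as three arguments for a single %s placeholder.
try:
    "hi there %s" % name
    error = None
except TypeError as exc:
    error = exc

# Wrapping the value in a one-item tuple makes % treat it as one argument.
wrapped = "hi there %s" % (name,)

# .format() always treats each argument as a single value.
formatted = "hi there {}".format(name)

print(type(error).__name__)
print(wrapped)
print(formatted)
```

With .format the call looks the same whether name is a string, a number, or a tuple.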
To answer your second question, string formatting happens at the same time as any other operation - when the string formatting expression is evaluated. And Python, not being a lazy language, evaluates expressions before calling functions, so in your log.debug example, the expression "some debug info: %s"%some_info will first evaluate to, e.g. "some debug info: roflcopters are active", then that string will be passed to log.debug().