'{0}'.format() is faster than str() and '{}'.format() using IPython %timeit and otherwise using pure Python

后悔当初 2021-02-07 12:18

So it's a CPython thing; I'm not sure other implementations behave the same way.

But '{0}'.format() is faster than str() and '{}'.format() using IPython %timeit, and otherwise using pure Python.

1 answer
  •  灰色年华
    2021-02-07 12:43

    The IPython timing is just off for some reason (though when I tested with a longer format string in separate cells, it behaved slightly better). Maybe running everything in the same cell skews the results; I don't really know.

    Either way, "{}" is a bit faster than "{pos}" which is faster than "{name}" while they're all slower than str.
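
    One way to check this ordering outside IPython is the standard timeit module. The following is only a rough sketch: the sample value, loop count, and repeat count are arbitrary assumptions, and the absolute numbers will differ between machines and Python versions.

```python
import timeit

val = 12345  # arbitrary sample value (an assumption; any object works)

statements = [
    "str(val)",
    "'{}'.format(val)",
    "'{0}'.format(val)",
    "'{name}'.format(name=val)",
]

def best_time(stmt, number=100_000, repeat=3):
    """Return the fastest of `repeat` runs, each executing `stmt` `number` times."""
    return min(timeit.repeat(stmt, globals={"val": val}, number=number, repeat=repeat))

results = {stmt: best_time(stmt) for stmt in statements}

# Print the statements from fastest to slowest on this machine.
for stmt, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{stmt:<28} {seconds:.4f} s")
```

    Taking the minimum of several repeats is the usual way to reduce noise from other processes; the differences between the format variants are small, so single runs can easily swap places.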

    str(val) is the fastest way to convert an object to str: it directly calls the object's __str__, if one exists, and returns the resulting string. The others, such as format (or str.format), carry additional overhead: an extra function call (to format itself), argument handling, parsing of the format string, and only then the invocation of their arguments' __str__.
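
    The extra step can be made visible with a small tracing class. This is a sketch only; Traced is a hypothetical helper written for this illustration, and it relies on the default object.__format__ falling back to str() for an empty format spec.

```python
class Traced:
    """Hypothetical helper that records which special methods get called."""

    def __init__(self):
        self.calls = []

    def __str__(self):
        self.calls.append("__str__")
        return "traced"

    def __format__(self, spec):
        self.calls.append("__format__")
        # Delegate to object.__format__, which calls str(self) for an empty spec.
        return super().__format__(spec)

a = Traced()
str(a)
calls_for_str = list(a.calls)

b = Traced()
"{}".format(b)
calls_for_format = list(b.calls)

print(calls_for_str)     # ['__str__']
print(calls_for_format)  # ['__format__', '__str__']
```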

    For the str.format methods "{}" uses automatic numbering; from a small section in the docs on the format syntax:

    Changed in version 3.1: The positional argument specifiers can be omitted, so '{} {}' is equivalent to '{0} {1}'.

    that is, if you supply a string of the form:

    "{}{}{}".format(1, 2, 3)
    

    CPython immediately knows that this is equivalent to:

    "{0}{1}{2}".format(1, 2, 3)
    

    With a format string that contains numbers indicating positions; CPython can't assume a strictly increasing number (that starts from 0) and must parse every single bracket in order to get it right, slowing things down a bit in the process:

    "{1}{2}{0}".format(1, 2, 3)
    

    Because of these two numbering modes, mixing them in one format string is not allowed:

    "{1}{}{2}".format(1, 2, 3)
    

    you'll get a nice ValueError back when you attempt to do so:

    ValueError: cannot switch from manual field specification to automatic field numbering
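
    The numbering rules above can be tried directly. In this sketch, format_error is a hypothetical helper that returns the error message instead of raising; the wording of the message depends on which numbering style appears first in the string.

```python
# Automatic and explicit numbering give the same result when the order matches.
auto = "{}{}{}".format(1, 2, 3)
manual = "{0}{1}{2}".format(1, 2, 3)

# Explicit indices may appear in any order.
reordered = "{1}{2}{0}".format(1, 2, 3)  # "231"

def format_error(template, *args):
    """Return the ValueError message raised by template.format(*args), or None."""
    try:
        template.format(*args)
    except ValueError as exc:
        return str(exc)
    return None

manual_then_auto = format_error("{1}{}{2}", 1, 2, 3)
auto_then_manual = format_error("{}{1}", 1, 2)

print(auto, manual, reordered)
print(manual_then_auto)
print(auto_then_manual)
```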
    

    CPython also fetches these positional arguments with PySequence_GetItem, which I'm pretty sure is fast, at least in comparison to PyObject_GetItem [see next].

    For "{name}" values, CPython always has extra work to do due to the fact that we're dealing with keyword arguments rather than positional ones; this includes things like building the dictionary for the calls and generating way more LOAD byte-code instructions for loading keys and values. The keyword form of function calling always introduces some overhead. In addition, it seems that the grabbing actually uses PyObject_GetItem which incurs some extra overhead due to its generic nature.
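
    The byte-code difference can be inspected with the standard dis module. This is a rough sketch: the exact opcode names and counts vary between CPython versions, so treat the output as illustrative rather than definitive. The helper name opnames is an assumption made for this example.

```python
import dis

def opnames(source):
    """Return the opcode names CPython emits for a single expression."""
    code = compile(source, "<example>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]

# Compiling does not execute the expression, so `val` need not be defined.
positional = opnames("'{0}'.format(val)")
keyword = opnames("'{name}'.format(name=val)")

print(len(positional), positional)
print(len(keyword), keyword)
```

    On the CPython versions I am aware of, the keyword call needs at least one extra instruction to carry the argument names, and the gap grows with the number of keyword arguments.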
