(This is in response to code that has now been edited out of the initial question) You forgot to call the functions in the second case. Making the appropriate modifications, the results are as expected:
test1 = """
def foo1():
my_set1 = set((1, 2, 3))
foo1()
"""
timeit(test1)
# 0.48808742000255734
test2 = """
def foo2():
my_set2 = {1,2,3}
foo2()
"""
timeit(test2)
# 0.3064506609807722
Now, the reason for the difference in timings is because set()
is a function call requiring a lookup into the symbol table, whereas the {...}
set construction is an artefact of the syntax, and is much faster.
The difference is obvious when observing the disassembled byte code.
import dis
dis.dis("set((1, 2, 3))")
1 0 LOAD_NAME 0 (set)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE
dis.dis("{1, 2, 3}")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_SET 3
8 RETURN_VALUE
In the first case, a function call is made by the instruction CALL_FUNCTION
on the tuple (1, 2, 3)
(which also comes with its own overhead, although minor—it is loaded as a constant via LOAD_CONST
), whereas in the second instruction is just a BUILD_SET
call, which is more efficient.
Re: your question regarding the time taken for tuple construction, we see this is actually negligible:
timeit("""(1, 2, 3)""")
# 0.01858693000394851
timeit("""{1, 2, 3}""")
# 0.11971827200613916
Tuples are immutable, so the compiler optimises this operation by loading it as a constant—this is called constant folding (you can see this clearly from the LOAD_CONST
instruction above), so the time taken is negligible. This is not seen with sets are they are mutable (Thanks to @user2357112 for pointing this out).
For larger sequences, we see similar behaviour. {..}
syntax is faster at constructing sets using set comprehensions as opposed to set()
which has to build the set from a generator.
timeit("""set(i for i in range(10000))""", number=1000)
# 0.9775058150407858
timeit("""{i for i in range(10000)}""", number=1000)
# 0.5508635920123197
For reference, you can also use iterable unpacking on more recent versions:
timeit("""{*range(10000)}""", number=1000)
# 0.7462548640323803
Interestingly, however, set()
is faster when called directly on range
:
timeit("""set(range(10000))""", number=1000)
# 0.3746800610097125
This happens to be faster than the set construction. You will see similar behaviour for other sequences (such as list
s).
My recommendation would be to use the {...}
set comprehension when constructing set literals, and as an alternative to passing a generator comprehension to set()
; and instead use set()
to convert an existing sequence/iterable to a set.