Why does the `is` operator behave differently in a script vs the REPL?

忘了有多久 2020-11-27 22:55

In Python, the same code gives different results when it is run as a script and when it is typed line by line into the REPL:

a = 300
b = 300
print(a == b)
print(a is b)      ## prints True
print("id(a) = %d, id(b) = %d" % (id(a), id(b)))


        
2 Answers
  • 2020-11-27 23:47

    There are two things to know about CPython's behavior here. First, CPython caches small integers in the range [-5, 256] and reuses the cached objects. Any value in that range therefore has the same id wherever it appears, even at the REPL:

    >>> a = 100
    >>> b = 100
    >>> a is b
    True
    

    Since 300 > 256, it is not cached, so each statement at the REPL creates a new object:

    >>> a = 300
    >>> b = 300
    >>> a is b
    False
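
As a rough check of the caching described above, the sketch below builds the integers at run time with int(), so the compiler has no equal literals to merge. The helper name same_object is made up for this illustration, and the results are specific to CPython: 100 falls inside the cached range, so the first call should report True, while on builds where the cache ends at 256 the second call is expected to report False.

```python
def same_object(text):
    """Parse the same text twice and report whether both results are one object."""
    first = int(text)   # created at run time, not taken from a constants table
    second = int(text)
    return first is second


print(same_object("100"))  # inside CPython's small-integer cache
print(same_object("300"))  # outside the [-5, 256] range described above
```

Because the values come from int() rather than from literals, this isolates the small-integer cache from the constant sharing that happens at compile time.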
    

    Second, in a script, literals are stored in the constants table of the compiled code object. Because both a and b are assigned the literal 300, and 300 is an immutable object, the compiler can store it once and have both assignments reference the same constant. If you tweak your script a bit and write it as:

    def foo():
        a = 300
        b = 300
        print(a==b)
        print(a is b)
        print("id(a) = %d, id(b) = %d" % (id(a), id(b)))
    
    
    import dis
    dis.disassemble(foo.__code__)
    

    The beginning part of the output looks like this:

    2           0 LOAD_CONST               1 (300)
                2 STORE_FAST               0 (a)
    
    3           4 LOAD_CONST               1 (300)
                6 STORE_FAST               1 (b)
    
    ...
    

    As you can see, CPython loads both a and b from the same constant slot (index 1). This means that a and b refer to the same object, which is why a is b is True in the script but not at the REPL.
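
The shared slot can also be checked without reading bytecode, by looking at the function's constants tuple. This is a minimal sketch that assumes CPython; co_consts is the attribute of a code object that holds its constants, and its exact contents vary between versions, so the code only counts how often 300 appears.

```python
def foo():
    a = 300
    b = 300
    return a is b


consts = foo.__code__.co_consts
print(consts)             # exact contents differ between CPython versions
print(consts.count(300))  # 300 is stored once, although it is written twice
print(foo())              # both names were loaded from that single constant
```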

    You can see this behavior in the REPL too, if you wrap your statements in a function:

    >>> import dis
    >>> def foo():
    ...   a = 300
    ...   b = 300
    ...   print(a==b)
    ...   print(a is b)
    ...   print("id(a) = %d, id(b) = %d" % (id(a), id(b)))
    ...
    >>> foo()
    True
    True
    id(a) = 4369383056, id(b) = 4369383056
    >>> dis.disassemble(foo.__code__)
      2           0 LOAD_CONST               1 (300)
                  2 STORE_FAST               0 (a)
    
      3           4 LOAD_CONST               1 (300)
                  6 STORE_FAST               1 (b)
    # snipped...
    

    Bottom line: CPython makes these optimizations at times, but you shouldn't count on them. They are an implementation detail, and one that has changed over time (CPython used to do this only for integers up to 100, for example). If you're comparing numbers, use ==. :-)

  • 2020-11-27 23:48

    When you run code in a .py script, the entire file is compiled into a single code object before it is executed. This lets CPython make certain optimizations, such as reusing the same instance for every occurrence of the integer 300.

    You could also reproduce that in the REPL, by executing code in a context more closely resembling the execution of a script:

    >>> source = """\
    ... a = 300
    ... b = 300
    ... print (a==b)
    ... print (a is b)## print True
    ... print ("id(a) = %d, id(b) = %d"%(id(a), id(b))) ## They have same address
    ... """
    >>> code_obj = compile(source, filename="myscript.py", mode="exec")
    >>> exec(code_obj)
    True
    True
    id(a) = 140736953597776, id(b) = 140736953597776
    

    Some of these optimizations are pretty aggressive. You could change the script line b = 300 to b = 150 + 150, and CPython would still "fold" b into the same constant. If you're interested in such implementation details, look in peephole.c and Ctrl+F for PyCode_Optimize and any info about the "consts table". (This applies to older CPython sources; since CPython 3.7 constant folding has been done by a separate AST optimizer.)
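
To see the folding for yourself, you can compile a two-line script and inspect the result. This is only a sketch: the filename "<folding-demo>" and the variable names are placeholders, and what ends up in the constants table, as well as whether the two names share one object, depends on the CPython version, so no particular output is claimed here.

```python
source = "a = 300\nb = 150 + 150\n"

# Compile the whole source at once, the way a script file is compiled.
code_obj = compile(source, "<folding-demo>", "exec")

namespace = {}
exec(code_obj, namespace)

print(code_obj.co_consts)                  # constants kept after optimization
same = namespace["a"] is namespace["b"]
print(same)                                # True when the folded constant is shared
```

If 150 no longer appears in the printed constants, the addition was evaluated at compile time rather than at run time.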

    In contrast, when you run code line by line directly in the REPL, it executes in a different context. Each line is compiled separately in "single" mode into its own code object, so constants cannot be shared between lines and this optimization is not available.

    >>> scope = {}
    >>> lines = source.splitlines()
    >>> for line in lines:
    ...     code_obj = compile(line, filename="<I'm in the REPL, yo!>", mode="single")
    ...     exec(code_obj, scope)
    ...
    True
    False
    id(a) = 140737087176016, id(b) = 140737087176080
    >>> scope['a'], scope['b']
    (300, 300)
    >>> id(scope['a']), id(scope['b'])
    (140737087176016, 140737087176080)
    