Why does the `is` operator behave differently in a script vs the REPL?

后端未结

关注

 2  905

In python, two codes have different results:

a = 300
b = 300
print (a==b)
print (a is b)      ## print True
print (\"id(a) = %d, id(b) = %d\"%(id(a), id(b)))


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  被撕碎了的回忆        
                
              
                            
                2020-11-27 23:47
              
            
            
                                                                       
There are actually two things to know about CPython and its behavior here.
First, small integers in the range of [-5, 256] are interned internally.
So any value falling in that range will share the same id, even at the REPL:

>>> a = 100
>>> b = 100
>>> a is b
True


Since 300 > 256, it's not being interned:

>>> a = 300
>>> b = 300
>>> a is b
False


Second, is that in a script, literals are put into a constant section of the
compiled code.  Python is smart enough to realize that since both a and b
refer to the literal 300 and that 300 is an immutable object, it can just
go ahead and reference the same constant location.  If you tweak your script
a bit and write it as:

def foo():
    a = 300
    b = 300
    print(a==b)
    print(a is b)
    print("id(a) = %d, id(b) = %d" % (id(a), id(b)))


import dis
dis.disassemble(foo.__code__)


The beginning part of the output looks like this:

2           0 LOAD_CONST               1 (300)
            2 STORE_FAST               0 (a)

3           4 LOAD_CONST               1 (300)
            6 STORE_FAST               1 (b)

...


As you can see, CPython is loading the a and b using the same constant slot.
This means that a and b are now referring to the same object (because they
reference the same slot) and that is why a is b is True in the script but
not at the REPL.

You can see this behavior in the REPL too, if you wrap your statements in a function:

>>> import dis
>>> def foo():
...   a = 300
...   b = 300
...   print(a==b)
...   print(a is b)
...   print("id(a) = %d, id(b) = %d" % (id(a), id(b)))
...
>>> foo()
True
True
id(a) = 4369383056, id(b) = 4369383056
>>> dis.disassemble(foo.__code__)
  2           0 LOAD_CONST               1 (300)
              2 STORE_FAST               0 (a)

  3           4 LOAD_CONST               1 (300)
              6 STORE_FAST               1 (b)
# snipped...


Bottom line: while CPython makes these optimizations at times, you shouldn't really count on it--it's really an implementation detail, and one that they've changed over time (CPython used to only do this for integers up to 100, for example).  If you're comparing numbers, use ==. :-)
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  时光说笑        
                
              
                            
                2020-11-27 23:48
              
            
            
                                                                       
When you run code in a .py script, the entire file is compiled into a code object before executing it. In this case, CPython is able to make certain optimizations - like reusing the same instance for the integer 300.

You could also reproduce that in the REPL, by executing code in a context more closely resembling the execution of a script:

>>> source = """\ 
... a = 300 
... b = 300 
... print (a==b) 
... print (a is b)## print True 
... print ("id(a) = %d, id(b) = %d"%(id(a), id(b))) ## They have same address 
... """
>>> code_obj = compile(source, filename="myscript.py", mode="exec")
>>> exec(code_obj) 
True
True
id(a) = 140736953597776, id(b) = 140736953597776


Some of these optimizations are pretty aggressive. You could modify the script line b = 300 changing it to b = 150 + 150, and CPython would still "fold" b into the same constant. If you're interested in such implementation details, look in peephole.c and Ctrl+F for PyCode_Optimize and any info about the "consts table".

In contrast, when you run code line-by-line directly in the REPL it executes in a different context. Each line is compiled in "single" mode and this optimization is not available.

>>> scope = {} 
>>> lines = source.splitlines()
>>> for line in lines: 
...     code_obj = compile(line, filename="<I'm in the REPL, yo!>", mode="single")
...     exec(code_obj, scope) 
...
True
False
id(a) = 140737087176016, id(b) = 140737087176080
>>> scope['a'], scope['b']
(300, 300)
>>> id(scope['a']), id(scope['b'])
(140737087176016, 140737087176080)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复