Why does deleting a global variable named __builtins__ prevent only the REPL from accessing builtins?

Submitted by 左心房为你撑大大i on 2019-12-11 01:17:32

Question


I have a python script with the following contents:

# foo.py

__builtins__ = 3
del __builtins__

print(int)  # <- this still works

Curiously, executing this script with the -i flag prevents only the REPL from accessing builtins:

aran-fey@starlight ~> python3 -i foo.py 
<class 'int'>
>>> print(int)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'print' is not defined

How come the script can access builtins, but the REPL can't?


Answer 1:


CPython doesn't look up __builtins__ every time it needs to do a built-in variable lookup. Each frame object has an f_builtins member holding its built-in variable dict, and built-in variable lookup goes through there.

f_builtins is set on frame object creation. If a new frame has no parent frame (f_back), or a different global variable dict from its parent frame, then frame object initialization looks up __builtins__ to set f_builtins. (If the new frame shares a global dict with its parent frame, then it inherits its parent's f_builtins.) This is the only way __builtins__ is involved in built-in variable lookup. You can see the code that handles this in _PyFrame_New_NoTrack.
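A minimal sketch of that capture, assuming a hand-built namespace whose __builtins__ is a plain dict. The names custom_builtins, namespace and marker are made up for illustration; f_builtins is the real frame attribute, read here through sys._getframe(), which is CPython-specific.

```python
import sys

# Hypothetical stand-in for the real builtins: a dict with a single extra name.
custom_builtins = {"marker": "custom"}

# The code below runs with this dict as its global namespace.
namespace = {"__builtins__": custom_builtins, "sys": sys}

source = """
frame_builtins = sys._getframe().f_builtins  # builtins dict captured for this frame
found = marker                               # not a global, so it is resolved via f_builtins
"""
exec(compile(source, "<demo>", "exec"), namespace)

# The frame picked up exactly the dict that __builtins__ named when it was created.
same_dict = namespace["frame_builtins"] is custom_builtins
found = namespace["found"]
```

Here same_dict should be true and found should be the string stored under marker, since the name is absent from the globals and is only reachable through the frame's builtins.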

When you delete __builtins__ inside a script, that doesn't affect f_builtins. The rest of the code executing in the script's stack frame still sees builtins. Once the script completes and -i drops you into interactive mode, every interactive command gets a new stack frame (with no parent), and the __builtins__ lookup is repeated. This is when the deleted __builtins__ finally matters.




Answer 2:


The execution context is different. Within the REPL we are working line by line (Read, Eval, Print, Loop), which gives the global execution scope an opportunity to change between steps. When the runtime executes a module, by contrast, it loads the module's code and then execs all of it within a single scope.

In CPython, the builtins namespace associated with the execution of a code block is found by looking up the name __builtins__ in the global namespace; this should be bound to a dictionary or a module (in the latter case the module's dictionary is used). When in the __main__ module, __builtins__ is the built-in module builtins, otherwise __builtins__ is bound to the dictionary of the builtins module itself. In both contexts of your question, we are in the __main__ module.
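The two forms can be inspected directly. The sketch below is illustrative only: builtins_dict is a hypothetical helper, not a CPython API, and string is an arbitrary pure-Python module standing in for "any imported module".

```python
import builtins
import string  # arbitrary pure-Python module, used only as an example

def builtins_dict(value):
    """Hypothetical helper: return the dict a __builtins__ value refers to."""
    # A module is replaced by its dictionary; a dict is used as-is.
    return vars(value) if isinstance(value, type(builtins)) else value

# Inside an imported module, __builtins__ is bound to a dict.
imported_value = vars(string)["__builtins__"]
imported_is_dict = isinstance(imported_value, dict)

# Both forms lead to the same underlying namespace.
same_namespace = builtins_dict(imported_value) is vars(builtins)
module_form_resolves = builtins_dict(builtins) is vars(builtins)
```

Whichever form the name takes, the lookup ends up in the dictionary of the builtins module.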

What's important is that CPython only looks up the builtins once, right before it begins executing your code. In the REPL, this happens every time a new statement is executed. But when executing a Python script, the entire content of the script is one single unit. That is why deleting the builtins in the middle of the script has no effect.
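The "one single unit" behaviour can be reproduced without a script file by running a block of source through exec(). This is a sketch under assumptions: the names source, namespace and still_works are invented here, and it only shows what happens inside one unit. How later, separate units behave once the name is gone can differ between CPython versions.

```python
import builtins

# Both statements are compiled and executed together, like the body of a script.
source = """
del __builtins__          # removes the name from the globals of this unit
still_works = len("abc")  # still resolves: the builtins were looked up before execution began
"""

namespace = {"__builtins__": builtins}
exec(compile(source, "<demo>", "exec"), namespace)

result = namespace["still_works"]
name_was_deleted = "__builtins__" not in namespace
```

The deletion really happened (name_was_deleted is true), yet len was still found for the remainder of the unit.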

To more closely replicate that context inside a REPL, you would not enter the code of the module line by line, but instead use a compound statement:

>>> if 1:
...     del __builtins__
...     print(123)
... 
123
>>> print(123)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'print' is not defined

Naturally, you're probably now wondering how to remove builtins from within a script. The answer should be obvious: you can't do it by rebinding a name, but you can do it by mutation:

# foo2.py
__builtins__.__dict__.clear()
print(int)  # <- NameError: name 'print' is not defined

As a final note, the fact that the name __builtins__ is bound at all is an implementation detail of CPython, and that is explicitly documented:

Users should not touch __builtins__; it is strictly an implementation detail.

Don't rely on __builtins__ for anything serious. If you need access to that scope, the correct way is to import builtins and go from there.
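A short sketch of that approach, with illustrative variable names:

```python
import builtins

# Reach a builtin explicitly through the module, not through the __builtins__ global.
int_type = builtins.int

# Enumerate the public names that live in the builtins scope.
public_names = sorted(name for name in dir(builtins) if not name.startswith("_"))
has_print = "print" in public_names
```

Because builtins is an ordinary importable module, this works the same way in __main__ and in imported modules.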



Source: https://stackoverflow.com/questions/52221983/why-does-deleting-a-global-variable-named-builtins-prevent-only-the-repl-fro
