I hope it's not a duplicate (and at the same time it's difficult to tell, given the amount of questions with such errors, but which are basic mistakes), but I don't under
It's not about the compiler doing a static analysis based on unrelated branches when compiling to bytecode; it's much simpler.
Python has a rule for distinguishing global, closure, and local variables. All variables that are assigned to in the function (including parameters, which are assigned to implicitly) are local variables (unless they are named in a `global` or `nonlocal` statement). This is explained in Naming and binding and subsequent sections in the reference documentation.
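To make the rule concrete, here is a minimal sketch. The names `counter`, `bump_local`, and `bump_global` are made up for illustration and are not from the question. The first function assigns to `counter`, so the name is local throughout that function; the second opts out with a `global` statement.

```python
counter = 0  # module-level name (hypothetical example)

def bump_local():
    # 'counter' is assigned in this function, so it is local everywhere in it.
    # The read on the right-hand side happens before any local binding exists.
    counter = counter + 1
    return counter

def bump_global():
    global counter  # opt out: 'counter' now means the module-level name
    counter = counter + 1
    return counter

try:
    bump_local()
except UnboundLocalError as exc:
    local_error = type(exc).__name__
else:
    local_error = None

global_result = bump_global()
print(local_error, global_result)
```

The only difference between the two functions is the `global` statement, which is exactly the exception the rule carves out.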
This isn't about keeping the interpreter simple; it's about keeping the rule simple enough that it's usually intuitive to human readers, and can easily be worked out by humans when it isn't intuitive. (That's especially important for cases like this: the behavior can't be intuitive everywhere, so Python keeps the rule simple enough that, once you learn it, cases like this are still obvious. But you definitely do have to learn the rule before that's true. And, of course, most people learn the rule by being surprised by it the first time…)
Even with an optimizer smart enough to completely remove any bytecode related to `if False: ord=None`, `ord` must still be a local variable by the rules of the language semantics.
So: there's an `ord =` in your function, therefore all references to `ord` are references to a local variable, not any global, nonlocal, or builtin that happens to have the same name, and therefore your code raises an `UnboundLocalError`.
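A small reproduction of that conclusion, as a sketch: `f` mirrors the shape of the code discussed here, while `g` is a hypothetical twin without the assignment, added only for contrast.

```python
def f():
    if False:
        ord = None  # never executed, but still makes 'ord' local to f
    return ord('a')

def g():
    return ord('a')  # no assignment to 'ord' in g, so the builtin is found

try:
    f()
except UnboundLocalError as exc:
    outcome = type(exc).__name__
else:
    outcome = "no error"

print(outcome)  # UnboundLocalError
print(g())      # 97
```

The branch never runs, yet its mere presence in the function body changes which `ord` the call refers to.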
Many people get by without knowing the actual rule, and instead use an even simpler rule: a variable is
While this works for most cases, it can be a bit misleading in some cases, like this one. A language with LEGB scoping done Lisp-style would see that `ord` isn't in the local namespace, and therefore return the global, but Python doesn't do that. You could say that `ord` is in the local namespace, but bound to a special "undefined" value, and that's actually close to what happens under the covers, but that's not what the rules of Python say, and, while it may be more intuitive for simple cases, it's harder to reason through.
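One way to see the "local but not bound" state from Python itself is sketched below. The function `probe` and its variable names are hypothetical. The name `value` is a local of the function the whole time, but it only shows up in `locals()` while it is bound, and `del` puts it back into the unbound state.

```python
def probe():
    before = set(locals())   # nothing is bound yet, so 'value' is not listed
    value = 1
    during = set(locals())   # 'value' is bound now
    del value                # unbind it again; it is still a local name
    try:
        value
    except UnboundLocalError:
        raised = True
    else:
        raised = False
    return before, during, raised

before, during, raised = probe()
print('value' in before, 'value' in during, raised)
```

Note that after the `del`, Python does not fall back to a global or builtin called `value`; the name stays local and simply has no binding.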
If you're curious how this works under the covers:
In CPython, the compiler scans your function to find all assignments with an identifier as a target, and stores them in an array. It removes global and nonlocal variables. This array ends up as your code object's `co_varnames`, so let's say your `ord` is `co_varnames[1]`. Every use of that variable then gets compiled to a `LOAD_FAST 1` or `STORE_FAST 1`, instead of a `LOAD_NAME` or `STORE_GLOBAL` or other operation. That `LOAD_FAST 1` just loads slot 1 of the frame's fast-locals array (`f_localsplus` in the C frame struct, not the `f_locals` mapping visible from Python) onto the stack when interpreted. That array starts off as an array of NULL pointers instead of pointers to Python objects, and if a `LOAD_FAST` loads a NULL pointer, it raises `UnboundLocalError`.
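You can look at the compiler's classification directly on the code object, without running the function. This is a sketch: `g` is a hypothetical variant without the assignment, and the variable names on the left-hand side are made up.

```python
def f():
    if False:
        ord = None
    c = ord('a')

def g():
    c = ord('a')

# Names the compiler decided are fast locals of each function.
f_varnames = f.__code__.co_varnames
g_varnames = g.__code__.co_varnames

# Names that are looked up as globals/builtins at run time.
f_names = f.__code__.co_names
g_names = g.__code__.co_names

print(f_varnames, f_names)
print(g_varnames, g_names)
```

In `f`, `ord` sits in `co_varnames` next to `c`; in `g`, it appears in `co_names` instead, which is the table used for global and builtin lookups.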
Just to demonstrate what's going on with the compiler:

    def f():
        if False:
            ord = None
        c = ord('a')

      4           0 LOAD_FAST                0 (ord)
                  3 LOAD_CONST               1 ('a')
                  6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                  9 STORE_FAST               1 (c)
                 12 LOAD_CONST               0 (None)
                 15 RETURN_VALUE

Access to `ord` is using `LOAD_FAST`, which is used for local variables.
If you move the `ord = None` assignment outside your function, `LOAD_GLOBAL` is used instead:

    if False:
        ord = None
    def f():
        c = ord('a')

      4           0 LOAD_GLOBAL              0 (ord)
                  3 LOAD_CONST               1 ('a')
                  6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                  9 STORE_FAST               0 (c)
                 12 LOAD_CONST               0 (None)
                 15 RETURN_VALUE

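The listings above come from an older CPython, and the exact output differs between versions: offsets and call opcodes have changed, and newer releases may show a variant such as `LOAD_FAST_CHECK` for a local that might be unbound. A version-tolerant way to make the same comparison is to filter the instructions programmatically with the `dis` module. This is a sketch; `local_case`, `global_case`, and `opcodes_for` are names invented for the example.

```python
import dis

def local_case():
    if False:
        ord = None  # dead code, but it still makes 'ord' a local name
    c = ord('a')

def global_case():
    c = ord('a')  # no assignment to 'ord' here

def opcodes_for(func, name):
    """Return the opcode names of the instructions whose argument is `name`."""
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.argval == name]

local_ops = opcodes_for(local_case, 'ord')
global_ops = opcodes_for(global_case, 'ord')
print(local_ops)
print(global_ops)
```

Whatever the version prints, the point is the same: the first list should contain only fast-local loads and the second a `LOAD_GLOBAL`.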