I wrote the following code to check if integers are passed by value or reference.
foo = 1
def f(bar):
    print id(foo) == id(bar)
    bar += 1
    print foo, bar
f(foo)
There seems to be a lot of confusion around this "issue". Variable names in Python are all references to objects. Assignments to variable names don't change the objects themselves; they set the reference to a new object. So in your case:
foo = 1
def test(bar):
    # At this point, "bar" points to the same object as foo.
    bar = 2  # We're updating the name "bar" to point to an object "int(2)".
    # 'foo' still points to its original object, "int(1)".
    print foo, bar  # Therefore we're showing two different things.
test(foo)
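To make the difference between rebinding a name and changing an object concrete, here is a minimal sketch. The helper names rebind and mutate are hypothetical, chosen only for this illustration, and the list is used purely as a contrast because it is mutable.

```python
def rebind(bar):
    # Assignment only points the local name "bar" at a different object.
    bar = 2
    return bar

def mutate(items):
    # append() changes the list object itself; no name is rebound.
    items.append(2)
    return items

foo = 1
rebound = rebind(foo)
assert foo == 1       # the caller's name still refers to the int 1
assert rebound == 2

numbers = [1]
same_list = mutate(numbers)
assert same_list is numbers   # the function received the very same object
assert numbers == [1, 2]      # so the caller sees the change
```

In both calls the function receives a reference to the caller's object; only the second one does something to that object instead of to the local name.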
Python's syntax resembles C, and many things are syntactic sugar, which can be confusing. Given that integer objects are immutable, it seems weird that foo += 1 could be a valid statement. For integers, foo += 1 is equivalent to foo = foo + 1, and both translate to foo = foo.__add__(1), which returns a new object, as shown here:
>>> a = 1
>>> id(a)
18613048
>>> a += 1
>>> id(a)
18613024
>>>
The following happens:
print id(foo) == id(bar)
The identity is the same. print foo is bar would have yielded the same result, BTW.
bar += 1
This is translated to:
bar = bar.__iadd__(1)
Only if __iadd__ does not exist or returns NotImplemented does Python fall back to:
bar = bar.__add__(1)
(I omit the case where bar = (1).__radd__(bar) could be called as well.)
As bar refers to a number, which is immutable, a different object is returned instead, so that bar now refers to 2, leaving foo untouched.
If you do any of
print id(foo) == id(bar)
print foo is bar
now, you will see that they point to different objects.
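The fallback from __iadd__ to __add__ described above can be sketched with two small classes. OnlyAdd and InPlace are hypothetical names invented for this example; they only stand in for an immutable-style and a mutable-style type.

```python
class OnlyAdd(object):
    """Hypothetical immutable-style class: defines __add__ but no __iadd__."""
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Return a brand-new object and leave self untouched.
        return OnlyAdd(self.value + other)

class InPlace(OnlyAdd):
    """Hypothetical mutable-style class: additionally defines __iadd__."""
    def __iadd__(self, other):
        # Modify self and return it, so the name keeps its object.
        self.value += other
        return self

a = OnlyAdd(1)
a_before = a
a += 1                      # no __iadd__, so this uses __add__
assert a is not a_before    # the name was rebound to a new object
assert a.value == 2 and a_before.value == 1

b = InPlace(1)
b_before = b
b += 1                      # __iadd__ is found and used
assert b is b_before        # same object, changed in place
assert b.value == 2

# int behaves like OnlyAdd: it has no __iadd__ at all.
assert not hasattr(int, "__iadd__")
```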
In Python 2, the current implementation keeps an array of integer objects for all integers between -5 and 256. So if you assign a variable as var1 = 1 and some other variable as var2 = 1, they are both pointing to the same object.
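A rough way to observe this cache is sketched below, assuming CPython. The variable names are made up for the example, and the integers are computed at run time so that the compiler cannot merge equal literals. The identity results are an implementation detail and may differ on other interpreters.

```python
base = 255

# Compute the values at run time instead of writing them as literals.
small_a, small_b = base + 1, base + 1   # 256, inside the cached range
large_a, large_b = base + 2, base + 2   # 257, outside the cached range

# Equality compares values and holds in both cases.
assert small_a == small_b and large_a == large_b

# Identity depends on the implementation: CPython hands out one shared
# object per small integer, while larger results are usually separate.
small_shared = small_a is small_b
large_shared = large_a is large_b
print("small shared: %s, large shared: %s" % (small_shared, large_shared))
```

This is why comparing integers with is happens to work for small values but should not be relied on; == is the comparison that is guaranteed.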
Python "variables" are labels that point to objects, rather than containers that can be filled with data, and thus on reassignment, your label is pointing to a new object (rather than the original object containing the new data). See Stack Overflow question Python identity: Multiple personality disorder, need code shrink.
Coming to your code, I have added a few more print statements, which show that the function behaves as if the integer were passed by value:
foo = 1
def f(bar):
    print id(foo) == id(bar)
    print id(1), id(foo), id(bar)  # All three are the same
    print foo, bar
    bar += 1
    print id(bar), id(2)
    print foo, bar
f(foo)
In Python, as in many modern OO languages, foo = 1 actually creates an object with the value 1 and assigns a reference to it to the alias foo. The internal type of foo is PyIntObject. This means Python isn't using the CPU / hardware int type; it always uses objects to handle numbers internally. The correct term is "plain integer", btw.
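A small sketch of what "integers are objects" means in practice, assuming CPython. The reported size varies by version and platform, so no specific number is claimed here.

```python
import sys

foo = 1

# An int is a full object: it has a type and methods like any other object.
assert isinstance(foo, object)
assert type(foo) is int
assert foo.__add__(1) == 2      # the method behind foo + 1
assert foo.bit_length() == 1    # ints carry ordinary methods too

# The object carries bookkeeping beyond the raw number, so it takes
# more memory than a bare hardware integer would.
size_in_bytes = sys.getsizeof(foo)
print("an int object occupies %d bytes here" % size_in_bytes)
```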
But creating objects is expensive, which is why Python keeps an internal cache for a few numbers. This means:
foo = 1
bar = 1
assert id(foo) == id(bar)
This isn't guaranteed; it's just a side effect of the implementation.
Number types in Python are also immutable. So even though bar in your example is an alias for the cached int number, changing bar doesn't modify the internal value. Instead, bar is pointed to another instance, which is why the id changes.
Because of the aforementioned optimization, this works as well:
foo = 1
bar = 1
assert id(foo) == id(bar)
bar += 1
assert id(foo) != id(bar)
bar -= 1
assert id(foo) == id(bar)