This is my second day of learning Python (I know the basics of C++ and some OOP), and I have some slight confusion regarding variables in Python.
Here is how I unde
The way I see it there are different views of a language.
From the language-lawyer perspective, Python variables always "point at" an object. However, unlike Java and C++, the behaviour of ==, <=, >=, etc. depends on the runtime type of the objects that the variables point at. Furthermore, in Python memory management is handled by the language.
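To make the point about == concrete, here is a small sketch. The Celsius class and the same() helper are hypothetical names invented for this illustration; the only claim is that the same expression x == y behaves differently depending on what x and y refer to at runtime.

```python
class Celsius:
    """Hypothetical class, used only to show a user-defined ==."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __eq__(self, other):
        # Decline to compare against unrelated types.
        if not isinstance(other, Celsius):
            return NotImplemented
        return self.degrees == other.degrees


def same(x, y):
    # One expression; the comparison that runs depends on the runtime types.
    return x == y


int_vs_float = same(1, 1.0)               # numeric comparison
list_vs_tuple = same([1], (1,))           # different container types
custom = same(Celsius(20), Celsius(20))   # uses Celsius.__eq__
custom_vs_int = same(Celsius(20), 20)     # both sides decline, so identity is used
print(int_vs_float, list_vs_tuple, custom, custom_vs_int)
```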
From a practical programmer's perspective, we can treat the fact that integers, strings, tuples, etc. are immutable* objects rather than straight values as an irrelevant detail. The exception is when storing large amounts of numeric data: there we may want to use types that store the values directly (e.g. numpy arrays) rather than types that end up as an array full of references to tiny objects.
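A rough way to see the storage difference, assuming CPython (sys.getsizeof is implementation-specific). To keep the sketch dependency-free it uses the standard-library array module as a stand-in for numpy; the idea of packing raw values instead of references is the same, but the exact byte counts will differ on your machine.

```python
import sys
from array import array

# 1000 integers outside CPython's small-integer cache.
values = list(range(1000, 2000))

# 'q' packs each value as a signed 64-bit integer, with no per-item object.
packed = array("q", values)

# A list holds one reference per item, and each item is a separate int object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
packed_bytes = sys.getsizeof(packed)

print("list + int objects:", list_bytes, "bytes")
print("packed array:      ", packed_bytes, "bytes")
```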
From an implementer's perspective, most languages have some sort of as-if rule: if the specified behaviours are correct, the implementation is correct regardless of how things are actually done under the hood.
So yes, your explanation is correct from a language-lawyer perspective. Your book is correct from a practical programmer's perspective. What an implementation actually does depends on the implementation. In CPython, integers are real objects, though small integers are taken from a cache rather than created anew. I'm not sure what the other implementations (e.g. PyPy and Jython) do.
* note the distinction between mutable and immutable objects here. With a mutable object we have to be careful about treating it "like a value" because some other code might mutate it. With an immutable object we have no such concerns.
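A minimal sketch of that footnote (the variable names are just for illustration): two names for one list see each other's mutations, while a tuple cannot be changed in place, so "changing" it only rebinds one name.

```python
# Mutable: both names refer to the same list, so a mutation is visible via both.
original = [1, 2, 3]
alias = original
alias.append(4)
print(original)  # [1, 2, 3, 4]

# Immutable: + builds a new tuple and rebinds only `other`.
point = (1, 2)
other = point
other = other + (3,)
print(point)  # (1, 2)
print(other)  # (1, 2, 3)
```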
It is correct: you can more or less think of variables as pointers. However, example code would help greatly in explaining how this actually works.
First, we will heavily utilize the id function:
Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
It's likely this will return different absolute values on your machine.
Consider this example:
>>> foo = 'a string'
>>> id(foo)
4565302640
>>> bar = 'a different string'
>>> id(bar)
4565321816
>>> bar = foo
>>> id(bar) == id(foo)
True
>>> id(bar)
4565302640
You can see that after bar = foo, both names have the same id. When we rebind foo to a new value, it refers to an object with a different id:
>>> foo = 42
>>> id(foo)
4561661488
>>> foo = 'oh no'
>>> id(foo)
4565257832
Another interesting observation is that CPython caches small integers (-5 through 256), so names bound to equal small values share a single object:
>>> a = 100
>>> b = 100
>>> c = 100
>>> id(a) == id(b) == id(c)
True
However, beyond 256 this is no longer guaranteed (the results below are from the interactive prompt):
>>> a = 256
>>> b = 256
>>> id(a) == id(b)
True
>>> a = 257
>>> b = 257
>>> id(a) == id(b)
False
However, assigning b to a will indeed keep the id the same, as shown before:
>>> a = b
>>> id(a) == id(b)
True
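The same identity check is usually written with the is operator instead of comparing id() values. A small sketch using lists, where equal values are never silently shared, so the result does not depend on any caching:

```python
a = [1, 2]
b = [1, 2]   # equal contents, but a separate object
c = a        # a second name for the object a refers to

equal_but_distinct = (a == b, a is b)
same_object = c is a
ids_match = id(c) == id(a)

print(equal_but_distinct)  # (True, False)
print(same_object)         # True
print(ids_match)           # True
```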
In Python, a variable holds a reference to an object. An object is a chunk of allocated memory that holds a value and a header. In CPython, the object's header contains its type and a reference counter that records how many references to the object currently exist at runtime, so that the memory manager can tell when the object can be reclaimed.
Now when you assign values to a variable, Python actually assigns references which are pointers to memory locations allocated to objects:
# x holds a reference to the memory location allocated for
# the object(type=string, value="Hello World", refCounter=1)
x = "Hello World"
When you assign an object of a different type to the same variable, you change the reference so that it points to a different object (i.e. a different memory location). Once you assign a different reference (and thus object) to a variable, CPython's reference counting reclaims the space allocated to the previous object immediately, assuming that it is not referenced from anywhere else:
# x holds a reference to the memory location allocated for
# the object(type=string, value="Hello World", refCounter=1)
x = "Hello World"
# Now x holds the reference to a different object(type=int, value=10, refCounter=1)
# and object(type=string, value="Hello World", refCounter=0) - which is not referenced elsewhere
# will now be garbage-collected.
x = 10
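If you want to watch this happen, here is a sketch that assumes CPython (other implementations may reclaim objects later). Payload is a hypothetical class used because plain strings and ints cannot be weakly referenced; the weak reference lets us observe the object without keeping it alive.

```python
import sys
import weakref


class Payload:
    """Hypothetical placeholder object for this demonstration."""


x = Payload()
watcher = weakref.ref(x)  # observes the object without adding a strong reference

count_before = sys.getrefcount(x)
y = x                      # a second name for the same object
count_after = sys.getrefcount(x)
del y                      # drop the second name again

alive_before = watcher() is not None
x = 10                     # rebind x: the Payload loses its last reference
alive_after = watcher() is not None

print(count_after - count_before)  # extra references created by `y = x`
print(alive_before, alive_after)
```

Note that sys.getrefcount itself reports a slightly higher number than you might expect, because passing the object to the function creates a temporary reference.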
Coming to your example now, spam holds the reference to object(type=int, value=42, refCounter=1):
>>> spam = 42
Now cheese will also hold the reference to object(type=int, value=42, refCounter=2):
>>> cheese = spam
Now spam holds a reference to a different object(type=int, value=100, refCounter=1)
>>> spam = 100
>>> spam
100
But cheese will keep pointing to object(type=int, value=42, refCounter=1)
>>> cheese
42
Python is neither pass-by-reference nor pass-by-value. Python variables are not pointers, they are not references, they are not values. Python variables are names.
Think of it as "pass-by-alias" if you need the same phrase type, or possibly "pass-by-object", because you can mutate the same object from any variable that indicates it, if it's mutable, but reassignment of a variable (alias) only changes that one variable.
If it helps: C variables are boxes that you write values into. Python names are tags that you put on values.
A Python variable's name is a key in the global (or local) namespace, which is effectively a dictionary. The underlying value is some object in memory. Assignment gives a name to that object. Assignment of one variable to another variable means both variables are names for the same object. Re-assignment of one variable changes what object is named by that variable without changing the other variable. You've moved the tag but not changed the previous object or any other tags on it.
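One way to look at the "namespace is effectively a dictionary" idea is to run the assignments against an explicit dict. This is only a sketch: exec with a hand-made namespace is used here for illustration, and the dunder filter is there because exec adds a __builtins__ entry of its own.

```python
namespace = {}

# Run the three assignments with `namespace` acting as the global namespace.
exec("spam = 42\ncheese = spam\nspam = 100", namespace)

# Hide the __builtins__ entry that exec inserts.
names = {key: value for key, value in namespace.items()
         if not key.startswith("__")}

print(names)  # {'spam': 100, 'cheese': 42}
```

Each name is a key, each object is a value, and re-assigning spam only replaced the value stored under that one key.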
In the underlying C code of the CPython implementation, every Python object is a PyObject*, so you can think of it as working like C if you only ever had pointers to data (no pointers-to-pointers, no directly-passed values).
You could say that Python is pass-by-value, where the values are pointers… or you could say Python is pass-by-reference, where the references are copies.
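A short sketch of what this means for function calls (mutate and rebind are hypothetical helper names): the parameter is a new name for the caller's object, so mutation is visible to the caller while reassignment inside the function is not.

```python
def mutate(items):
    # Changes the object the caller passed in.
    items.append("added")


def rebind(items):
    # Only makes the local name `items` refer to a new list.
    items = ["brand new list"]
    return items


basket = ["egg"]
mutate(basket)
result = rebind(basket)

print(basket)  # ['egg', 'added']
print(result)  # ['brand new list']
```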
When you store spam = 42, it creates an object in memory. Then, when you assign cheese = spam, it makes cheese refer to the same object that spam refers to. Finally, when you assign spam = 100, it rebinds only spam to a new object, so cheese is still 42.
What is happening in the spam = 100 line is the replacement of the previous value (a pointer to an object of type int with value 42) with a pointer to another object (type int, value 100).