This is for Python 2.6.
I could not figure out why a and b are identical:
>>> a = "some_string"
>>> b = "some_string"
>>>
This should really be a comment on Glenn's answer, but I cannot post comments yet. I ran some tests directly in the Python interpreter and saw some interesting behavior. According to Glenn, the interpreter treats each entry as a separate "file", and the entries don't share a string table when strings are stored for future reference. Here is what I ran:
>>> a="some_string"
>>> b="some_string"
>>> id(a)
2146597048
>>> id(b)
2146597048
>>> a="some string"
>>> b="some string"
>>> id(a)
2146597128
>>> id(b)
2146597088
>>> c="some string" <-----(1)
>>> d="some string"
>>> id(c)
2146597208 <-----(1)
>>> a="some_string"
>>> b="some_string"
>>> id(a)
2146597248 <---- waited a few minutes
>>> c="some_string"
>>> d="some_string"
>>> id(d)
2146597248 <---- still same id after a few min
>>> b="some string"
>>> id(b)
2146597288
>>> b="some_string" <---(2)
>>> id(b)
2146597248 <---(2)
>>> a="some"
>>> b="some"
>>> c="some"
>>> d="some" <---(2) lost all references
>>> id(a)
2146601728
>>> a="some_string" <---(2)
>>> id(a)
2146597248 <---(2) returns same old one after mere seconds
>>> a="some"
>>> id(a)
2146601728 <---(2) Waited a few minutes
>>> a="some_string" <---- (1)
>>> id(a)
2146597208 <---- (1) Reused a "different" id after a few minutes
It seems that an id may be reused once all references to the original object are lost and it is no longer "in use" (1). It might also depend on how long the id has gone unused: in the lines I have marked (2), I got different ids back depending on how long that id had not been used. I just found it curious and thought I would post it.
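For what it's worth, the Python documentation only promises that an id is unique and constant for the lifetime of its object, and says that two objects with non-overlapping lifetimes may have the same id. Here is a minimal sketch of that, assuming CPython; the helper name `build` is just something I made up for illustration, and the last comparison can come out either way.

```python
def build(text):
    # Join the characters at runtime so CPython creates a fresh object
    # instead of reusing a literal stored with the compiled code.
    return "".join(list(text))

first = build("some string")
second = build("some string")

# Two distinct objects that are alive at the same time never share an id.
ids_differ_while_alive = id(first) != id(second)
print(ids_differ_while_alive)

old_id = id(first)
del first  # drop the only reference so the object can be freed

third = build("some string")
# The new object may or may not land where the freed one was,
# so this can print either True or False.
print(id(third) == old_id)
```

So an id showing up again tells you a slot was reused, not that the original string object survived.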
TIM PETERS SAID: Sorry, the only bug I see here is in the code you posted using "is" to try to determine whether two strings are equal. "is" tests for object identity, not for equality, and whether two immutable objects are really the same object isn't in general defined by Python. You should use "==" to check two strings for equality. The only time it's reliable to use "is" for that purpose is when you've explicitly interned all strings being compared (via using the intern() builtin function).
from here: http://mail.python.org/pipermail/python-bugs-list/2004-December/026772.html
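To illustrate the point in that quote, here is a small sketch and not anything taken from the linked thread. It assumes CPython; the names `intern_text`, `a_interned` and `b_interned` are made up for the example. On Python 2.6 intern() is a builtin, and on Python 3 it lives at sys.intern, so the snippet picks whichever one exists.

```python
import sys

# Python 3 exposes interning as sys.intern; Python 2.6 has a builtin intern().
try:
    intern_text = sys.intern
except AttributeError:
    intern_text = intern  # only reached on Python 2

# Build both strings at runtime so they are not shared compile-time literals.
parts = ["some", "string"]
a = " ".join(parts)
b = " ".join(parts)

print(a == b)  # equality compares contents, so this is True
print(a is b)  # identity; often False in CPython, but not something to rely on

# After explicit interning, equal strings are the same object.
a_interned = intern_text(a)
b_interned = intern_text(b)
print(a_interned is b_interned)  # True
```

Unless every string involved has been interned explicitly like this, "==" is the comparison to use.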
Python may or may not automatically intern strings, which determines whether future instances of the string will share a reference.
If it decides to intern a string, then both will refer to the same string instance. If it doesn't, it'll create two separate strings that happen to have the same contents.
In general, you don't need to worry about whether this is happening or not; you usually want to check equality, a == b, not whether they're the same object, a is b.