Why is the id of a Python class not unique when called quickly?

前端 未结 6 962
花落未央
花落未央 2020-11-22 15:29

I\'m doing some things in Python (3.3.3), and I came across something that is confusing me since to my understanding classes get a new id each time they are called.

相关标签:
6条回答
  • 2020-11-22 15:59

    I sense a deeper problem here. You should not be relying on id to track unique instances over the lifetime of your program. You should simply see it as a non-guaranteed memory location indicator for the duration of each object instance. If you immediately create and release instances then you may very well create consecutive instances in the same memory location.

    Perhaps what you need to do is track a class static counter that assigns each new instance with a unique id, and increments the class static counter for the next instance.

    0 讨论(0)
  • 2020-11-22 16:02

    If you read the documentation for id, it says:

    Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

    And that's exactly what's happening: you have two objects with non-overlapping lifetimes, because the first one is already out of scope before the second one is ever created.


    But don't trust that this will always happen, either. Especially if you need to deal with other Python implementations, or with more complicated classes. All that the language says is that these two objects may have the same id() value, not that they will. And the fact that they do depends on two implementation details:

    • The garbage collector has to clean up the first object before your code even starts to allocate the second object—which is guaranteed to happen with CPython or any other ref-counting implementation (when there are no circular references), but pretty unlikely with a generational garbage collector as in Jython or IronPython.

    • The allocator under the covers have to have a very strong preference for reusing recently-freed objects of the same type. This is true in CPython, which has multiple layers of fancy allocators on top of basic C malloc, but most of the other implementations leave a lot more to the underlying virtual machine.


    One last thing: The fact that the object.__repr__ happens to contain a substring that happens to be the same as the id as a hexadecimal number is just an implementation artifact of CPython that isn't guaranteed anywhere. According to the docs:

    If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description…> should be returned.

    The fact that CPython's object happens to put hex(id(self)) (actually, I believe it's doing the equivalent of sprintf-ing its pointer through %p, but since CPython's id just returns the same pointer cast to a long that ends up being the same) isn't guaranteed anywhere. Even if it has been true since… before object even existed in the early 2.x days. You're safe to rely on it for this kind of simple "what's going on here" debugging at the interactive prompt, but don't try to use it beyond that.

    0 讨论(0)
  • 2020-11-22 16:06

    Try this, try calling the following:

    a = someClass()
    for i in range(0,44):
        print(someClass())
    print(a)
    

    You'll see something different. Why? Cause the memory that was released by the first object in the "foo" loop was reused. On the other hand a is not reused since it's retained.

    0 讨论(0)
  • 2020-11-22 16:12

    A example where the memory location (and id) is not released is:

    print([someClass() for i in range(10)])
    

    Now the ids are all unique.

    0 讨论(0)
  • 2020-11-22 16:20

    The id of an object is only guaranteed to be unique during that object's lifetime, not over the entire lifetime of a program. The two someClass objects you create only exist for the duration of the call to print - after that, they are available for garbage collection (and, in CPython, deallocated immediately). Since their lifetimes don't overlap, it is valid for them to share an id.

    It is also unsuprising in this case, because of a combination of two CPython implementation details: first, it does garbage collection by reference counting (with some extra magic to avoid problems with circular references), and second, the id of an object is related to the value of the underlying pointer for the variable (ie, its memory location). So, the first object, which was the most recent object allocated, is immediately freed - it isn't too surprising that the next object allocated will end up in the same spot (although this potentially also depends on details of how the interpreter was compiled).

    If you are relying on several objects having distinct ids, you might keep them around - say, in a list - so that their lifetimes overlap. Otherwise, you might implement a class-specific id that has different guarantees - eg:

    class SomeClass:
        next_id = 0
    
        def __init__(self):
             self.id = SomeClass.nextid
             SomeClass.nextid += 1
    
    0 讨论(0)
  • 2020-11-22 16:21

    It's releasing the first instance since it wasn't retained, then since nothing has happened to the memory in the meantime, it instantiates a second time to the same location.

    0 讨论(0)
提交回复
热议问题