In Python, determine at runtime whether an object is an instance of a class (old- or new-style)

南旧 2021-01-13 10:01

I am trying to write a deeply nested set of classes, attributes, bound methods, etc. to an HDF5 file using the h5py module for long-term storage. I am really close. The onl

3 Answers
  • 2021-01-13 10:31

    While the poster most likely needs to rethink the design, in some cases there is a legitimate need to distinguish between instances of built-in/extension types, created in C, and instances of classes created in Python with the class statement. While both are types, the latter are a category of types that CPython internally calls "heap types" because their type structures are allocated at run-time. That Python 2 still distinguishes them can be seen in the __repr__ output:

    >>> int       # "type"
    <type 'int'>
    >>> class X(object): pass
    ... 
    >>> X         # "class"
    <class '__main__.X'>
    

    The __repr__ distinction is implemented exactly by checking whether the type is a heap type.
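
    Under Python 3 the repr of a type always starts with "class", so the distinction is no longer visible there, but on CPython the underlying flag can still be read. The following is a minimal sketch, assuming CPython 3; is_heap_type is a hypothetical helper name, not part of the original answer:

```python
# Value of Py_TPFLAGS_HEAPTYPE, copied from CPython's Include/object.h.
Py_TPFLAGS_HEAPTYPE = 1 << 9


class X(object):
    pass


def is_heap_type(tp):
    """Hypothetical helper: True if CPython marks `tp` as a heap type."""
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


print(repr(int))          # <class 'int'> under Python 3
print(repr(X))            # also starts with "<class"
print(is_heap_type(int))  # False
print(is_heap_type(X))    # True
```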

    Depending on the exact needs of the application, an is_class_instance function can be implemented in one of the following ways:

    # Built-in types such as int or object do not have __dict__ by
    # default. __dict__ is normally obtained by inheriting from a
    # dictless type using the class statement.  Checking for the
    # existence of __dict__ is an indication of a class instance.
    #
    # Caveat: a built-in or extension type can still request instance
    # dicts using tp_dictoffset, and a class can suppress it with
    # __slots__.
    def is_class_instance(o):
        return hasattr(o, '__dict__')
    
    # A reliable approach, but one that is also more dependent
    # on the CPython implementation.
    Py_TPFLAGS_HEAPTYPE = (1<<9)       # Include/object.h
    def is_class_instance(o):
        return bool(type(o).__flags__ & Py_TPFLAGS_HEAPTYPE)
    

    EDIT

    Here is an explanation of the second version of the function. It tests whether the type is a "heap type" using the same test that CPython uses internally for its own purposes. That ensures that it will always return True for instances of heap types ("classes") and False for instances of non-heap types ("types", but also old-style classes, which is easy to fix). It does that by checking whether the tp_flags member of the C-level PyTypeObject structure has the Py_TPFLAGS_HEAPTYPE bit set. The weak part of the implementation is that it hardcodes the value of the Py_TPFLAGS_HEAPTYPE constant to the currently observed value. (This is necessary because the constant is not exposed to Python by a symbolic name.) While in theory this constant could change, that is highly unlikely in practice because such a change would gratuitously break the ABI of existing extension modules. Looking at the definitions of the Py_TPFLAGS constants in Include/object.h, it is apparent that new ones are being carefully added without disturbing the old ones. Another weakness is that this code has zero chance of running on a non-CPython implementation, such as Jython or IronPython.
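
    Because the constant is hard-coded, a cheap start-up check can confirm that it still behaves as expected on the running interpreter. This is only a sketch: heaptype_flag_looks_sane and the _Probe class are hypothetical names introduced for illustration, and the check merely compares one class-statement class against one built-in type.

```python
import platform

# Value copied from CPython's Include/object.h; hard-coded because Python
# exposes no symbolic name for it.
Py_TPFLAGS_HEAPTYPE = 1 << 9


def heaptype_flag_looks_sane():
    """Hypothetical helper: return True if the hard-coded bit still separates
    a class-statement class from a built-in type on this interpreter."""
    if platform.python_implementation() != "CPython":
        return False

    class _Probe(object):
        pass

    # getattr() guards against interpreters that lack type.__flags__.
    probe_flags = getattr(_Probe, "__flags__", None)
    builtin_flags = getattr(int, "__flags__", None)
    if probe_flags is None or builtin_flags is None:
        return False
    return bool(probe_flags & Py_TPFLAGS_HEAPTYPE) and not (
        builtin_flags & Py_TPFLAGS_HEAPTYPE
    )


print(heaptype_flag_looks_sane())
```

    If the check returns False, the attribute-based version is the safer choice.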

  • 2021-01-13 10:37

    Thanks to @user4815162342, I have been able to get this to work. Here is a slightly modified version that will return True for instances of old- and new-style classes:

    # Added the check for old-style classes (Python 2 only: the L suffix
    # and types.InstanceType do not exist in Python 3).
    import types

    Py_TPFLAGS_HEAPTYPE = (1L<<9)       # Include/object.h
    def is_class_instance(o):
        return (bool(type(o).__flags__ & Py_TPFLAGS_HEAPTYPE)
                or type(o) is types.InstanceType)
    
  • 2021-01-13 10:46

    tl;dr Just call the is_object_pure_python() function defined far, far below.

    Like ibell, I was duly impressed by user4815162342's authoritative Python 2.x-specific solution. All is not well in Pythonic paradise, however.

    Problems. Problems Everywhere.

    That solution (though insightful) has suffered a bit of bit rot not trivially resolvable by simple edits, including:

    • The L type suffix is unsupported under Python 3.x. Admittedly, trivially resolvable.
    • The cross-interpreter is_class_instance() implementation fails to account for pure-Python classes optimized with __slots__.
    • The CPython-specific is_class_instance() implementation fails under non-CPython interpreters (e.g., PyPy).
    • No comparable implementations for detecting whether classes (rather than class instances) are pure-Python or C-based.
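
    The __slots__ problem in the second bullet can be reproduced in a few lines. This sketch reuses the attribute-based is_class_instance() from the earlier answer; WithDict and WithSlots are hypothetical class names:

```python
def is_class_instance(o):
    # The attribute-based check quoted from the earlier answer.
    return hasattr(o, '__dict__')


class WithDict(object):
    pass


class WithSlots(object):
    __slots__ = ('value',)


print(is_class_instance(WithDict()))   # True
print(is_class_instance(WithSlots()))  # False, although WithSlots is pure Python
```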

    Solutions! Solutions Everywhere!

    To solve these issues, the following Python 3.x-specific solution drops the L suffix and detects __slots__. It has been refactored to prefer the more reliable CPython-specific is_class_instance() implementation under CPython and to fall back to the less reliable cross-interpreter is_class_instance() implementation under all other interpreters, and it has been generalized to detect both classes and class instances.

    For sanity, let's detect class instances first:

    import platform
    
    # If the active Python interpreter is the official CPython implementation,
    # prefer a more reliable CPython-specific solution guaranteed to succeed.
    if platform.python_implementation() == 'CPython':
        # Magic number defined by the Python codebase at "Include/object.h".
        Py_TPFLAGS_HEAPTYPE = (1<<9)
    
        def is_instance_pure_python(obj: object) -> bool:
            '''
            `True` if the passed object is an instance of a pure-Python class _or_
            `False` if this object is an instance of a C-based class (either builtin
            or defined by a C extension).
            '''
    
            return bool(type(obj).__flags__ & Py_TPFLAGS_HEAPTYPE)
    
    # Else, fall back to a CPython-agnostic solution typically but *NOT*
    # necessarily succeeding. For all real-world objects of interest, this is
    # effectively successful. Edge cases exist but are suitably rare.
    else:
        def is_instance_pure_python(obj: object) -> bool:
            '''
            `True` if the passed object is an instance of a pure-Python class _or_
            `False` if this object is an instance of a C-based class (either builtin
            or defined by a C extension).
            '''
    
            return hasattr(obj, '__dict__') or hasattr(obj, '__slots__')
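
    The edge cases mentioned in the comment on the cross-interpreter branch are easy to hit. As a small sketch, the same expression is repeated here under a hypothetical name, fallback_is_instance_pure_python, so that it can be exercised on any interpreter. Functions and types.SimpleNamespace instances both carry a __dict__ even though, at least in CPython, their types are implemented in C:

```python
import types


def fallback_is_instance_pure_python(obj):
    # Same expression as the cross-interpreter branch, under a
    # hypothetical name so that it can be called on any interpreter.
    return hasattr(obj, '__dict__') or hasattr(obj, '__slots__')


def plain_function():
    pass


# Both objects below have a __dict__, so the fallback reports True for them
# even though their types are not created by a class statement.
print(fallback_is_instance_pure_python(plain_function))           # True
print(fallback_is_instance_pure_python(types.SimpleNamespace()))  # True
print(fallback_is_instance_pure_python(3))                        # False
```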
    

    The Proof is in Guido's Pudding

    Unit tests demonstrate the uncomfortable truth:

    >>> class PurePythonWithDict(object): pass
    >>> class PurePythonWithSlots(object): __slots__ = ()
    >>> unslotted = PurePythonWithDict()
    >>> slotted = PurePythonWithSlots()
    >>> is_instance_pure_python(unslotted)
    True
    >>> is_instance_pure_python(slotted)
    True
    >>> is_instance_pure_python(3)
    False
    >>> is_instance_pure_python([3, 1, 4, 1, 5])
    False
    >>> import numpy
    >>> is_instance_pure_python(numpy.array((3, 1, 4, 1, 5)))
    False
    

    Does This Generalize to Classes without Instances?

    Yes, but doing so is non-trivial. Detecting whether a class (rather than class instance) is pure-Python or C-based is oddly difficult. Why? Because even C-based classes provide the __dict__ attribute. Hence, hasattr(int, '__dict__') == True.

    Nonetheless, where there is a hacky will there is a hacky way. For unknown (probably banal) reasons, the dir() builtin strips the __dict__ attribute name from its returned list only for C-based classes. Hence, detecting whether a class is pure-Python or C-based in a cross-interpreter manner reduces to iteratively searching the list returned by dir() for __dict__. For the win:

    import platform
    
    # If the active Python interpreter is the official CPython interpreter,
    # prefer a more reliable CPython-specific solution guaranteed to succeed.
    if platform.python_implementation() == 'CPython':
        # Magic number defined by the Python codebase at "Include/object.h".
        Py_TPFLAGS_HEAPTYPE = (1<<9)
    
        def is_class_pure_python(cls: type) -> bool:
            '''
            `True` if the passed class is pure-Python _or_ `False` if this class
            is C-based (either builtin or defined by a C extension).
            '''
    
            return bool(cls.__flags__ & Py_TPFLAGS_HEAPTYPE)
    
    # Else, fall back to a CPython-agnostic solution typically but *NOT*
    # necessarily succeeding. For all real-world objects of interest, this is
    # effectively successful. Edge cases exist but are suitably rare.
    else:
        def is_class_pure_python(cls: type) -> bool:
            '''
            `True` if the passed class is pure-Python _or_ `False` if this class
            is C-based (either builtin or defined by a C extension).
            '''
    
            return '__dict__' in dir(cls) or hasattr(cls, '__slots__')
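
    The behavior that the dir() trick relies on can be observed directly. At least under CPython, it appears to follow from what the class statement puts in the class namespace: a class without __slots__ gets a __dict__ descriptor entry of its own, whereas int does not, and dir() on a class lists the names found in the class and its bases. A short sketch, with hypothetical class names:

```python
class WithDict(object):
    pass


class WithSlots(object):
    __slots__ = ()


print('__dict__' in dir(WithDict))     # True
print('__dict__' in vars(WithDict))    # True: the class namespace holds the entry
print('__dict__' in dir(int))          # False
print('__dict__' in vars(int))         # False
print('__dict__' in dir(WithSlots))    # False
print('__slots__' in vars(WithSlots))  # True, hence the extra __slots__ test
```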
    

    More Proof. More Pudding.

    More test-driven truthiness:

    >>> class PurePythonWithDict(object): pass
    >>> class PurePythonWithSlots(object): __slots__ = ()
    >>> is_class_pure_python(PurePythonWithDict)
    True
    >>> is_class_pure_python(PurePythonWithSlots)
    True
    >>> is_class_pure_python(int)
    False
    >>> is_class_pure_python(list)
    False
    >>> import numpy
    >>> is_class_pure_python(numpy.ndarray)
    False
    

    That's All She Wrote

    For generality, let's unify the low-level functions defined above into two high-level functions supporting all possible types under all possible Python interpreters:

    def is_object_pure_python(obj: object) -> bool:
        '''
        `True` if the passed object is either a pure-Python class or instance of
        such a class _or_ `False` if this object is either a C-based class
        (builtin or defined by a C extension) or instance of such a class.
        '''
    
        if isinstance(obj, type):
            return is_class_pure_python(obj)
        else:
            return is_instance_pure_python(obj)
    
    
    def is_object_c_based(obj: object) -> bool:
        '''
        `True` if the passed object is either a C-based class (builtin or
        defined by a C extension) or instance of such a class _or_ `False` if this
        object is either a pure-Python class or instance of such a class.
        '''
    
        return not is_object_pure_python(obj)
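
    For a quick end-to-end check, the pieces can be condensed into one self-contained script. This is a compressed restatement for illustration rather than a replacement for the definitions above; IS_CPYTHON, Point and samples are hypothetical names, and the expected results in the comment hold under CPython only:

```python
import platform

Py_TPFLAGS_HEAPTYPE = 1 << 9  # copied from CPython's Include/object.h
IS_CPYTHON = platform.python_implementation() == 'CPython'


def is_class_pure_python(cls):
    if IS_CPYTHON:
        return bool(cls.__flags__ & Py_TPFLAGS_HEAPTYPE)
    return '__dict__' in dir(cls) or hasattr(cls, '__slots__')


def is_instance_pure_python(obj):
    if IS_CPYTHON:
        return bool(type(obj).__flags__ & Py_TPFLAGS_HEAPTYPE)
    return hasattr(obj, '__dict__') or hasattr(obj, '__slots__')


def is_object_pure_python(obj):
    # Classes and instances are dispatched to the matching low-level check.
    if isinstance(obj, type):
        return is_class_pure_python(obj)
    return is_instance_pure_python(obj)


class Point(object):
    __slots__ = ('x', 'y')


samples = [Point, Point(), int, 3, [3, 1, 4]]
results = [is_object_pure_python(s) for s in samples]
print(results)  # under CPython: [True, True, False, False, False]
```

    Note that the CPython branch reports any heap type as pure Python, which includes types that C extensions create dynamically through PyType_FromSpec.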
    

    Behold! Pure Python.
