I am trying to write a deeply nested set of classes, attributes, bound methods, etc. to a HDF5 file using the h5py module for long-term storage. I am really close. The onl
tl;dr Just call the is_object_pure_python() function defined far, far below.
Like ibell, I was dutifully impressed by user4815162342's authoritative Python 2.x-specific solution. All is not well in Pythonic paradise, however.
That solution (though insightful) has suffered a bit of bit rot not trivially resolvable by simple edits, including:
- The L type suffix is unsupported under Python 3.x. Admittedly, trivially resolvable.
- The cross-interpreter is_class_instance() implementation fails to account for pure-Python classes optimized with __slots__.
- The CPython-specific is_class_instance() implementation fails under non-CPython interpreters (e.g., PyPy).

To solve these issues, the following Python 3.x-specific solution drops L, detects __slots__, has been refactored so as to prefer the more reliable CPython-specific is_class_instance() implementation under CPython and fall back to the less reliable cross-interpreter is_class_instance() implementation under all other interpreters, and has been generalized to detect both classes and class instances.
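To see why testing for __dict__ alone is not enough, here is a minimal sketch. The class names are illustrative and not part of the original solution. Instances of a class declaring __slots__ carry no per-instance __dict__, so a check relying only on hasattr(obj, '__dict__') would misreport them as C-based.

```python
class WithDict:
    """Ordinary pure-Python class; instances get a per-instance __dict__."""


class WithSlots:
    """Pure-Python class optimized with __slots__; instances lack __dict__."""
    __slots__ = ('x',)


unslotted = WithDict()
slotted = WithSlots()

has_dict_unslotted = hasattr(unslotted, '__dict__')  # True
has_dict_slotted = hasattr(slotted, '__dict__')      # False
has_slots_slotted = hasattr(slotted, '__slots__')    # True

print(has_dict_unslotted, has_dict_slotted, has_slots_slotted)
```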
For sanity, let's detect class instances first:
import platform

# If the active Python interpreter is the official CPython implementation,
# prefer a more reliable CPython-specific solution guaranteed to succeed.
if platform.python_implementation() == 'CPython':
    # Magic number defined by the Python codebase at "Include/object.h".
    Py_TPFLAGS_HEAPTYPE = (1<<9)

    def is_instance_pure_python(obj: object) -> bool:
        '''
        `True` if the passed object is an instance of a pure-Python class _or_
        `False` if this object is an instance of a C-based class (either builtin
        or defined by a C extension).
        '''
        return bool(type(obj).__flags__ & Py_TPFLAGS_HEAPTYPE)

# Else, fallback to a CPython-agnostic solution typically but *NOT*
# necessarily succeeding. For all real-world objects of interest, this is
# effectively successful. Edge cases exist but are suitably rare.
else:
    def is_instance_pure_python(obj: object) -> bool:
        '''
        `True` if the passed object is an instance of a pure-Python class _or_
        `False` if this object is an instance of a C-based class (either builtin
        or defined by a C extension).
        '''
        return hasattr(obj, '__dict__') or hasattr(obj, '__slots__')
Unit tests demonstrate the uncomfortable truth:
>>> class PurePythonWithDict(object): pass
>>> class PurePythonWithSlots(object): __slots__ = ()
>>> unslotted = PurePythonWithDict()
>>> slotted = PurePythonWithSlots()
>>> is_instance_pure_python(unslotted)
True
>>> is_instance_pure_python(slotted)
True
>>> is_instance_pure_python(3)
False
>>> is_instance_pure_python([3, 1, 4, 1, 5])
False
>>> import numpy
>>> is_instance_pure_python(numpy.array((3, 1, 4, 1, 5)))
False
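For the curious, the CPython branch boils down to a single bit test on the type's flags. The sketch below shows that test in isolation; it is CPython-only, and the helper name has_heaptype_flag() is hypothetical. One caveat worth knowing: C extension types that are themselves created on the heap (for example, via PyType_FromSpec()) also set this flag, so on recent CPython versions some C-based types may be reported as pure-Python.

```python
# Bit mask mirroring Py_TPFLAGS_HEAPTYPE in CPython's "Include/object.h".
Py_TPFLAGS_HEAPTYPE = 1 << 9


def has_heaptype_flag(cls: type) -> bool:
    """Hypothetical helper: True if CPython created this type on the heap."""
    return bool(cls.__flags__ & Py_TPFLAGS_HEAPTYPE)


class Plain:
    pass


# A class statement always creates a heap type; int is a static builtin type.
print(has_heaptype_flag(Plain))
print(has_heaptype_flag(int))
```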
Can classes be detected, too? Yes, but doing so is non-trivial. Detecting whether a class (rather than class instance) is pure-Python or C-based is oddly difficult. Why? Because even C-based classes provide the __dict__ attribute. Hence, hasattr(int, '__dict__') == True.
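A short sketch of that pitfall, using only builtins plus one illustrative class: every class exposes a __dict__ (a read-only view of its namespace), even though instances of C-based classes such as int do not.

```python
class PurePython:
    pass


# Classes always expose __dict__, whether pure-Python or C-based.
class_results = [hasattr(cls, '__dict__') for cls in (PurePython, int, list)]

# Instances differ: only the pure-Python instance carries a __dict__.
instance_results = [
    hasattr(obj, '__dict__') for obj in (PurePython(), 3, [3, 1, 4])
]

print(class_results)     # [True, True, True]
print(instance_results)  # [True, False, False]
```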
Nonetheless, where there is a hacky way there is a hacky will. For unknown (probably banal) reasons, the dir() builtin omits the __dict__ attribute name from its returned list only for C-based classes. Hence, detecting whether a class is pure-Python or C-based in a cross-interpreter manner reduces to searching the list returned by dir() for __dict__. For the win:
import platform

# If the active Python interpreter is the official CPython interpreter,
# prefer a more reliable CPython-specific solution guaranteed to succeed.
if platform.python_implementation() == 'CPython':
    # Magic number defined by the Python codebase at "Include/object.h".
    Py_TPFLAGS_HEAPTYPE = (1<<9)

    def is_class_pure_python(cls: type) -> bool:
        '''
        `True` if the passed class is pure-Python _or_ `False` if this class
        is C-based (either builtin or defined by a C extension).
        '''
        return bool(cls.__flags__ & Py_TPFLAGS_HEAPTYPE)

# Else, fallback to a CPython-agnostic solution typically but *NOT*
# necessarily succeeding. For all real-world objects of interest, this is
# effectively successful. Edge cases exist but are suitably rare.
else:
    def is_class_pure_python(cls: type) -> bool:
        '''
        `True` if the passed class is pure-Python _or_ `False` if this class
        is C-based (either builtin or defined by a C extension).
        '''
        return '__dict__' in dir(cls) or hasattr(cls, '__slots__')
More test-driven truthiness:
>>> class PurePythonWithDict(object): pass
>>> class PurePythonWithSlots(object): __slots__ = ()
>>> is_class_pure_python(PurePythonWithDict)
True
>>> is_class_pure_python(PurePythonWithSlots)
True
>>> is_class_pure_python(int)
False
>>> is_class_pure_python(list)
False
>>> import numpy
>>> is_class_pure_python(numpy.ndarray)
False
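Why does searching dir() for __dict__ work at all? A plausible explanation on CPython, offered here as an assumption rather than something the solution above states: dir() of a class lists the names found in the namespaces of that class and its bases, and a pure-Python class defined without __slots__ stores a __dict__ descriptor in its own namespace, while int and friends have none. The sketch below (class names are illustrative) also shows why the separate __slots__ test is needed, plus one edge case where the heuristic misfires.

```python
import types


class WithDict:
    pass


class WithSlots:
    __slots__ = ()


# An unslotted pure-Python class stores a '__dict__' descriptor in its namespace.
print('__dict__' in vars(WithDict))  # True
print('__dict__' in dir(WithDict))   # True

# A slotted class has no such descriptor, hence the separate __slots__ test.
print('__dict__' in dir(WithSlots))     # False
print(hasattr(WithSlots, '__slots__'))  # True

# The C-based int class has neither.
print('__dict__' in dir(int), hasattr(int, '__slots__'))  # False False

# Edge case on CPython: the C-based function type exposes a '__dict__'
# descriptor too, so the dir()-based heuristic would call it pure-Python.
print('__dict__' in dir(types.FunctionType))  # True
```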
For generality, let's unify the low-level functions defined above into two high-level functions supporting all possible types under all possible Python interpreters:
def is_object_pure_python(obj: object) -> bool:
    '''
    `True` if the passed object is either a pure-Python class or instance of
    such a class _or_ `False` if this object is either a C-based class
    (builtin or defined by a C extension) or instance of such a class.
    '''
    if isinstance(obj, type):
        return is_class_pure_python(obj)
    else:
        return is_instance_pure_python(obj)

def is_object_c_based(obj: object) -> bool:
    '''
    `True` if the passed object is either a C-based class (builtin or
    defined by a C extension) or instance of such a class _or_ `False` if this
    object is either a pure-Python class or instance of such a class.
    '''
    return not is_object_pure_python(obj)
Behold! Pure Python.
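As a usage illustration, here is a hedged sketch of how such a predicate might help when walking a nested object before storing it. To stay self-contained it uses a condensed stand-in for is_object_pure_python(); the Sensor and Calibration classes and the split_attributes() helper are hypothetical and exist only for this example.

```python
import platform

_HEAPTYPE = 1 << 9  # Py_TPFLAGS_HEAPTYPE from CPython's "Include/object.h".
_IS_CPYTHON = platform.python_implementation() == 'CPython'


def is_object_pure_python(obj: object) -> bool:
    """Condensed stand-in combining the class and instance checks."""
    if isinstance(obj, type):
        if _IS_CPYTHON:
            return bool(obj.__flags__ & _HEAPTYPE)
        return '__dict__' in dir(obj) or hasattr(obj, '__slots__')
    if _IS_CPYTHON:
        return bool(type(obj).__flags__ & _HEAPTYPE)
    return hasattr(obj, '__dict__') or hasattr(obj, '__slots__')


class Calibration:
    """Hypothetical slotted pure-Python class."""
    __slots__ = ('offset',)

    def __init__(self):
        self.offset = 0.1


class Sensor:
    """Hypothetical container mixing builtin and pure-Python attributes."""

    def __init__(self):
        self.name = 'probe'
        self.readings = [1.0, 2.5]
        self.calibration = Calibration()


def split_attributes(obj):
    """Partition attribute names by whether their values are pure-Python."""
    pure, c_based = [], []
    for name, value in vars(obj).items():
        (pure if is_object_pure_python(value) else c_based).append(name)
    return sorted(pure), sorted(c_based)


print(split_attributes(Sensor()))  # (['calibration'], ['name', 'readings'])
```

Attributes landing in the first list are the ones a serializer would need to recurse into; those in the second list are builtin values.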