Python iterators – how to dynamically assign self.next within a new style class?

感动是毒 2021-02-05 20:44

As part of some WSGI middleware I want to write a python class that wraps an iterator to implement a close method on the iterator.

This works fine when I try it with an

4 Answers
  •  走了就别回头了
    2021-02-05 21:08

    There are a bunch of places where CPython takes surprising shortcuts based on class properties instead of instance properties. This is one of those places.

    Here is a simple example that demonstrates the issue:

    class DynamicNext(object):
        def __init__(self):
            self.next = lambda: 42
    

    And here's what happens:

    >>> instance = DynamicNext()
    >>> next(instance)
    …
    TypeError: DynamicNext object is not an iterator
    >>>
    

    Now, digging into the CPython source code (from 2.7.2), here's the implementation of the next() builtin:

    static PyObject *
    builtin_next(PyObject *self, PyObject *args)
    {
        …
        if (!PyIter_Check(it)) {
            PyErr_Format(PyExc_TypeError,
                "%.200s object is not an iterator",
                it->ob_type->tp_name);
            return NULL;
        }
        …
    }
    

    And here's the implementation of PyIter_Check:

    #define PyIter_Check(obj) \
        (PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
         (obj)->ob_type->tp_iternext != NULL && \
         (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)
    

    The first condition, PyType_HasFeature(…), is, after expanding all the constants and macros, equivalent to instance.__class__.__flags__ & 1L<<7 != 0 (Py_TPFLAGS_HAVE_ITER is defined as 1L<<7 in the 2.7 headers):

    >>> instance.__class__.__flags__ & 1L<<7 != 0
    True
    

    So that check obviously isn't failing, which must mean that the next check, (obj)->ob_type->tp_iternext != NULL, is failing.

    In Python, this line is roughly (roughly!) equivalent to hasattr(type(instance), "next"):

    >>> type(instance)
    __main__.DynamicNext
    >>> hasattr(type(instance), "next")
    False
    

    Which obviously fails because the DynamicNext type doesn't have a next method — only instances of that type do.
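
The same type-versus-instance distinction can be reproduced on Python 3, where the method is spelled `__next__` instead of `next`. The sketch below is an assumed Python 3 port of the example above, not part of the original 2.7 session, and `try_next` is a hypothetical helper used only for illustration:

```python
class DynamicNext(object):
    def __init__(self):
        # Assigned on the instance, so the type itself gains no __next__.
        self.__next__ = lambda: 42


def try_next(obj):
    """Return next(obj), or the TypeError message if obj is not an iterator."""
    try:
        return next(obj)
    except TypeError as exc:
        return "TypeError: %s" % exc


instance = DynamicNext()
print(hasattr(instance, "__next__"))        # the instance has the attribute
print(hasattr(type(instance), "__next__"))  # the type does not
print(try_next(instance))                   # next() only consults the type
```

Calling `instance.__next__()` directly still returns 42; it is only the `next()` builtin (and `for` loops) that ignore the instance attribute.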

    Now, my CPython-fu is weak, so I'm going to have to start making some educated guesses here, but I believe they are accurate.

    When a CPython type is created (that is, when the interpreter first evaluates the class block and the metaclass's __new__ method is called), the fields of the type's PyTypeObject struct are initialized. So if no next method exists when the DynamicNext type is created, the tp_iternext field is set to NULL, causing PyIter_Check to return false.
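
One nuance that may be worth checking here: as far as CPython's behaviour goes, the slot is also refreshed when the method is later assigned on the class itself, so only instance-level assignment is ignored. A small sketch, assuming the Python 3 spelling (`__next__`) and a hypothetical class name `LateNext`:

```python
class LateNext(object):
    pass


instance = LateNext()

# Assigning on the instance does not make it an iterator...
instance.__next__ = lambda: 1
try:
    next(instance)
    instance_worked = True
except TypeError:
    instance_worked = False

# ...but assigning on the class does, even though the instance already exists.
LateNext.__next__ = lambda self: 42
class_result = next(instance)

print(instance_worked, class_result)
```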

    Now, as Glenn points out, this is almost certainly a bug in CPython, especially given that correcting it would only impact performance when the object being tested either isn't iterable or dynamically assigns a next method (very approximately):

    #define PyIter_Check(obj) \
        (((PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
           (obj)->ob_type->tp_iternext != NULL && \
           (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)) || \
          (PyObject_HasAttrString((obj), "next") && \
           PyCallable_Check(PyObject_GetAttrString((obj), "next"))))
    

    Edit: after a little bit of digging, the fix would not be this simple, because at least some portions of the code assume that, if PyIter_Check(it) returns true, then *it->ob_type->tp_iternext will exist, which isn't necessarily the case (i.e., because the next function exists on the instance, not the type).

    SO! That's why surprising things happen when you try to iterate over a new-style instance with a dynamically assigned next method.
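
For the goal stated in the question (wrapping an iterator so that it gains a close method), one way around this is to define the iterator methods on the class and delegate to the wrapped object. This is a minimal sketch under assumptions, not the asker's code: it uses the Python 3 spelling `__next__` with a `next` alias for Python 2, and the names `ClosingIterator` and `on_close` are hypothetical.

```python
class ClosingIterator(object):
    """Wrap an iterable and add a close() hook (hypothetical helper)."""

    def __init__(self, iterable, on_close=None):
        self._iterable = iterable
        self._iterator = iter(iterable)
        self._on_close = on_close

    def __iter__(self):
        return self

    def __next__(self):
        # Defined on the class, so next() and for-loops can find it;
        # the actual work is delegated to the wrapped iterator.
        return next(self._iterator)

    next = __next__  # Python 2 spelling of the same method

    def close(self):
        # Forward close() to the wrapped iterable if it has one,
        # then run the optional extra callback.
        inner_close = getattr(self._iterable, "close", None)
        if inner_close is not None:
            inner_close()
        if self._on_close is not None:
            self._on_close()


closed = []
wrapped = ClosingIterator(["a", "b"], on_close=lambda: closed.append(True))
print(list(wrapped))
wrapped.close()
print(closed)
```

Because the delegation happens inside a method that lives on the class, the wrapped object can be swapped per instance without ever assigning `next` on the instance itself.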
