Prevent TextIOWrapper from closing on GC in a Py2/Py3 compatible way

面向向阳花 2021-01-12 02:34

What I need to accomplish:

Given a binary file, decode it in a couple different ways providing a TextIOBase API. Ideally these subseque

4 Answers
  • 2021-01-12 02:55

    EDIT:

    I found a much better solution (comparatively), but I will leave this answer in the event it is useful for anyone to learn from. (It is a pretty easy way to show off gc.garbage)

    Please do not actually use what follows.

    OLD:

    I found a potential solution, though it is horrible:

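    What we can do is set up a cyclic reference in the destructor, which holds off collection of the wrapper. We can then look in gc.garbage to find these uncollectable objects, break the cycle, and drop that reference. (This relies on Python 2 behaviour, where objects with a __del__ method that are part of a reference cycle end up in gc.garbage instead of being freed.)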

    In [1]: import io
    
    In [2]: class MyTextIOWrapper(io.TextIOWrapper):
       ...:     def __del__(self):
       ...:         if not hasattr(self, '_cycle'):
       ...:             print "holding off GC"
       ...:             self._cycle = self
       ...:         else:
       ...:             print "getting GCed!"
       ...:
    
    In [3]: def mangle(x):
       ...:     MyTextIOWrapper(x)
       ...:     
    
    In [4]: f = io.open('example', mode='rb')
    
    In [5]: mangle(f)
    holding off GC
    
    In [6]: f.closed
    Out[6]: False
    
    In [7]: import gc
    
    In [8]: gc.garbage
    Out[8]: []
    
    In [9]: gc.collect()
    Out[9]: 34
    
    In [10]: gc.garbage
    Out[10]: [<_io.TextIOWrapper name='example' encoding='UTF-8'>]
    
    In [11]: gc.garbage[0]._cycle=False
    
    In [12]: del gc.garbage[0]
    getting GCed!
    
    In [13]: f.closed
    Out[13]: True
    

    Truthfully this is a pretty horrific workaround, but it could be transparent to the API I am delivering. Still I would prefer a way to override the __del__ of IOBase.

  • 2021-01-12 03:08

    Just detach your TextIOWrapper() object before letting it be garbage collected:

    import io
    
    def mangle(x):
        wrapper = io.TextIOWrapper(x)
        wrapper.detach()
    

    The TextIOWrapper() object only closes streams it is attached to. If you can't alter the code where the object goes out of scope, then simply keep a reference to the TextIOWrapper() object locally and detach at that point.
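As a minimal sketch of the detach approach applied to the original problem (decoding the same binary stream more than once), the helper below wraps the stream, reads it, and detaches before the wrapper goes out of scope. The name `peek_text` and the use of `io.BytesIO` as a stand-in for a real binary file are assumptions made for illustration; they are not part of the answer above.

```python
import gc
import io


def peek_text(binary_stream, encoding="utf-8"):
    """Decode everything left in binary_stream without closing it (hypothetical helper)."""
    wrapper = io.TextIOWrapper(binary_stream, encoding=encoding)
    try:
        return wrapper.read()
    finally:
        # detach() separates the wrapper from the underlying buffer, so the
        # wrapper no longer has anything to close when it is finalized.
        wrapper.detach()


raw = io.BytesIO(b"hello")         # stand-in for a file opened in 'rb' mode
as_utf8 = peek_text(raw)           # first decoding pass
gc.collect()                       # the first wrapper is gone by now

raw.seek(0)                        # rewind and decode again another way
as_latin1 = peek_text(raw, encoding="latin-1")
```

After both calls `raw.closed` is still `False`, so the caller remains in control of when the binary stream is closed.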

    If you must subclass TextIOWrapper(), then just call detach() in the __del__ hook:

    class DetachingTextIOWrapper(io.TextIOWrapper):
        def __del__(self):
            self.detach()
    
  • 2021-01-12 03:16

    A simple solution would be to return the variable from the function and store it in script scope, so that it does not get garbage collected until the script ends or the reference to it changes. But there may be other elegant solutions out there.
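A rough sketch of this idea, assuming a hypothetical factory function `make_reader` and an in-memory `io.BytesIO` in place of a real file: the wrapper is returned to the caller, and as long as the caller holds on to it the underlying stream is not closed by garbage collection.

```python
import io


def make_reader(binary_stream, encoding="utf-8"):
    # Returning the wrapper (instead of letting it die inside the function)
    # keeps it alive for as long as the caller keeps the reference.
    return io.TextIOWrapper(binary_stream, encoding=encoding)


raw = io.BytesIO(b"line one\nline two\n")  # stand-in for a binary file
reader = make_reader(raw)                  # reference stored at script scope
first_line = reader.readline()
still_open = not raw.closed                # True while `reader` is alive
```

Note that this only postpones the problem: once `reader` is dropped, the buffer is closed along with it, unless `detach()` is called first.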

  • 2021-01-12 03:17

    EDIT:

    Just call detach first, thanks martijn-pieters!


    It turns out there is basically nothing that can be done about the destructor calling close in Python 2.7; this is hardcoded into the C code. Instead, we can modify close so that it won't close the buffer while __del__ is running (__del__ is executed before _PyIOBase_finalize in the C code, which gives us a chance to change the behaviour of close). This lets close work as expected without letting the GC close the buffer.

    import io
    
    class SaneTextIOWrapper(io.TextIOWrapper):
        def __init__(self, *args, **kwargs):
            self._should_close_buffer = True
            super(SaneTextIOWrapper, self).__init__(*args, **kwargs)
    
        def __del__(self):
            # Accept the inevitability of the buffer being closed by the destructor
            # because of this line in Python 2.7:
            # https://github.com/python/cpython/blob/2.7/Modules/_io/iobase.c#L221
            self._should_close_buffer = False
            self.close()  # Actually close for Python 3 because it is an override.
                          # We can't call super because Python 2 doesn't actually
                          # have a `__del__` method for IOBase (hence this
                          # workaround). Close is idempotent so it won't matter
                          # that Python 2 will end up calling this twice
    
        def close(self):
            # We can't stop Python 2.7 from calling close in the destructor
            # so instead we can prevent the buffer from being closed with a flag.
    
            # Based on:
            # https://github.com/python/cpython/blob/2.7/Lib/_pyio.py#L1586
            # https://github.com/python/cpython/blob/3.4/Lib/_pyio.py#L1615
            if self.buffer is not None and not self.closed:
                try:
                    self.flush()
                finally:
                    if self._should_close_buffer:
                        self.buffer.close()
    

    My previous solution here used _pyio.TextIOWrapper which is slower than the above because it is written in Python, not C.

    It involved simply overriding __del__ with a noop which will also work in Py2/3.
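The earlier approach is only described in words above, so the following is a guess at what it might have looked like rather than the author's actual code. The class name `NoCloseOnDelTextIOWrapper` is hypothetical, and `io.BytesIO` stands in for a real binary file.

```python
import gc
import io
import _pyio  # pure-Python implementation of the io module


class NoCloseOnDelTextIOWrapper(_pyio.TextIOWrapper):
    """Hypothetical reconstruction: a wrapper whose finalizer does nothing."""

    def __del__(self):
        # Deliberately a no-op, so finalizing the wrapper never calls close()
        # and the underlying buffer is left open.
        pass


raw = io.BytesIO(b"payload")  # stand-in for a binary file
wrapper = NoCloseOnDelTextIOWrapper(raw, encoding="utf-8")
text = wrapper.read()

del wrapper                   # drop the only reference to the wrapper
gc.collect()
```

An explicit `wrapper.close()` would still close the buffer; only the implicit close at finalization is suppressed. The trade-off mentioned above remains: `_pyio` is written in Python, so it is slower than the C implementation behind `io.TextIOWrapper`.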
