Why does dill dump external classes by reference, no matter what?

故里飘歌 2021-01-14 10:10

In the example below, I have placed the class Foo inside its own module foo.

Why is the external class dumped by ref? The instance f

1 answer
  •  一整个雨季
    2021-01-14 10:59

    I'm the dill author. The foo.Foo instance (ff) pickles by reference because its class is defined in a file; this is primarily for compactness of the pickled string. The main problem I can think of with pickling a class by reference is that the class definition may not be found on the resource where you want to unpickle (i.e. no module foo exists there). I believe supporting that case is a current feature request (and if it's not, feel free to submit a ticket on the GitHub page).
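
To make the "no module foo exists there" failure concrete, here is a minimal sketch. It uses the standard-library pickle (which always stores classes by reference) instead of dill, and it builds a stand-in module in memory under the hypothetical name foo_demo so that no foo.py file is needed. Treat it as an illustration of by-reference pickling in general, not of dill internals.

```python
import pickle
import sys
import types

# Stand-in for the external module. "foo_demo" is a hypothetical name used
# only for this sketch; it mirrors the Foo class from the question.
SOURCE = """
class Foo:
    y = 1
    def bar(self, x):
        return x + Foo.y
"""
module = types.ModuleType("foo_demo")
exec(SOURCE, module.__dict__)
sys.modules["foo_demo"] = module

payload = pickle.dumps(module.Foo())

# The payload names the module and the class, but carries no method code.
print(b"foo_demo" in payload, b"Foo" in payload, b"bar" in payload)

# Simulate a machine where the module is not installed.
del sys.modules["foo_demo"]
try:
    pickle.loads(payload)
    load_error = None
except ImportError as exc:
    load_error = exc
print(type(load_error).__name__)
```

The pickled bytes only record where to find the class, so unpickling has to import it again, and that import fails once the module is gone.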

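    Note, however, that if you modify a class dynamically, dill does pull the dynamically added code into the pickled string. As the examples below show, this holds for a class defined in __main__ (with byref=False) and for attributes added to an instance.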

    >>> import dill
    >>> import foo
    >>> 
    >>> class Foo:
    ...     y = 1
    ...     def bar(self, x):
    ...         return x + Foo.y
    ... 
    >>> f = Foo()
    >>> ff = foo.Foo()
    

    So when Foo is defined in __main__, byref is respected.

    >>> dill.dumps(f, byref=False)              
    b'\x80\x03cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\x03\x00\x00\x00Fooq\x05h\x01X\x06\x00\x00\x00objectq\x06\x85q\x07Rq\x08\x85q\t}q\n(X\r\x00\x00\x00__slotnames__q\x0b]q\x0cX\x03\x00\x00\x00barq\rcdill.dill\n_create_function\nq\x0e(cdill.dill\n_unmarshal\nq\x0fC]\xe3\x02\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0b\x00\x00\x00|\x01\x00t\x00\x00j\x01\x00\x17S)\x01N)\x02\xda\x03Foo\xda\x01y)\x02\xda\x04self\xda\x01x\xa9\x00r\x05\x00\x00\x00\xfa\x07\xda\x03bar\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x10\x85q\x11Rq\x12c__builtin__\n__main__\nh\rNN}q\x13tq\x14Rq\x15X\x07\x00\x00\x00__doc__q\x16NX\n\x00\x00\x00__module__q\x17X\x08\x00\x00\x00__main__q\x18X\x01\x00\x00\x00yq\x19K\x01utq\x1aRq\x1b)\x81q\x1c.'
    >>> dill.dumps(f, byref=True)
    b'\x80\x03c__main__\nFoo\nq\x00)\x81q\x01.'
    >>>
    

    However, when the class is defined in a module, byref is not respected.

    >>> dill.dumps(ff, byref=False)
    b'\x80\x03cfoo\nFoo\nq\x00)\x81q\x01.'
    >>> dill.dumps(ff, byref=True)
    b'\x80\x03cfoo\nFoo\nq\x00)\x81q\x01.'
    

    Note that I wouldn't use the recurse option in this case, as Foo.y will likely recurse infinitely. I believe there's a current ticket for that too, but if there isn't, there should be.

    Let's dig a little deeper. What if we modify the instance?

    >>> ff.zap = lambda x: x + ff.y
    >>> _ff = dill.loads(dill.dumps(ff))
    >>> _ff.zap(2)
    3
    >>> dill.dumps(ff, byref=True)
    b'\x80\x03cfoo\nFoo\nq\x00)\x81q\x01}q\x02X\x03\x00\x00\x00zapq\x03cdill.dill\n_create_function\nq\x04(cdill.dill\n_unmarshal\nq\x05CY\xe3\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0b\x00\x00\x00|\x00\x00t\x00\x00j\x01\x00\x17S)\x01N)\x02\xda\x02ff\xda\x01y)\x01\xda\x01x\xa9\x00r\x04\x00\x00\x00\xfa\x07\xda\x08\x01\x00\x00\x00s\x00\x00\x00\x00q\x06\x85q\x07Rq\x08c__builtin__\n__main__\nX\x08\x00\x00\x00q\tNN}q\ntq\x0bRq\x0csb.'
    >>> dill.dumps(ff, byref=False)
    b'\x80\x03cfoo\nFoo\nq\x00)\x81q\x01}q\x02X\x03\x00\x00\x00zapq\x03cdill.dill\n_create_function\nq\x04(cdill.dill\n_unmarshal\nq\x05CY\xe3\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0b\x00\x00\x00|\x00\x00t\x00\x00j\x01\x00\x17S)\x01N)\x02\xda\x02ff\xda\x01y)\x01\xda\x01x\xa9\x00r\x04\x00\x00\x00\xfa\x07\xda\x08\x01\x00\x00\x00s\x00\x00\x00\x00q\x06\x85q\x07Rq\x08c__builtin__\n__main__\nX\x08\x00\x00\x00q\tNN}q\ntq\x0bRq\x0csb.'
    >>> 
    

    No biggie, it pulls in the dynamically added code. However, we'd probably like to modify Foo and not the instance.

    >>> Foo.zap = lambda self,x: x + Foo.y
    >>> dill.dumps(f, byref=True)
    b'\x80\x03c__main__\nFoo\nq\x00)\x81q\x01.'
    >>> dill.dumps(f, byref=False)
    b'\x80\x03cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\x03\x00\x00\x00Fooq\x05h\x01X\x06\x00\x00\x00objectq\x06\x85q\x07Rq\x08\x85q\t}q\n(X\x03\x00\x00\x00barq\x0bcdill.dill\n_create_function\nq\x0c(cdill.dill\n_unmarshal\nq\rC]\xe3\x02\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0b\x00\x00\x00|\x01\x00t\x00\x00j\x01\x00\x17S)\x01N)\x02\xda\x03Foo\xda\x01y)\x02\xda\x04self\xda\x01x\xa9\x00r\x05\x00\x00\x00\xfa\x07\xda\x03bar\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x0e\x85q\x0fRq\x10c__builtin__\n__main__\nh\x0bNN}q\x11tq\x12Rq\x13X\x07\x00\x00\x00__doc__q\x14NX\r\x00\x00\x00__slotnames__q\x15]q\x16X\n\x00\x00\x00__module__q\x17X\x08\x00\x00\x00__main__q\x18X\x01\x00\x00\x00yq\x19K\x01X\x03\x00\x00\x00zapq\x1ah\x0c(h\rC`\xe3\x02\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x0b\x00\x00\x00|\x01\x00t\x00\x00j\x01\x00\x17S)\x01N)\x02\xda\x03Foo\xda\x01y)\x02\xda\x04self\xda\x01x\xa9\x00r\x05\x00\x00\x00\xfa\x07\xda\x08\x01\x00\x00\x00s\x00\x00\x00\x00q\x1b\x85q\x1cRq\x1dc__builtin__\n__main__\nX\x08\x00\x00\x00q\x1eNN}q\x1ftq Rq!utq"Rq#)\x81q$.'
    

    Ok, that's fine, but what about the Foo in our external module?

    >>> ff = foo.Foo()
    >>> 
    >>> foo.Foo.zap = lambda self,x: x + foo.Foo.y
    >>> dill.dumps(ff, byref=False)
    b'\x80\x03cfoo\nFoo\nq\x00)\x81q\x01.'
    >>> dill.dumps(ff, byref=True)
    b'\x80\x03cfoo\nFoo\nq\x00)\x81q\x01.'
    >>> 
    

    Hmmm… not good. So the above is probably a pretty compelling use case to change the behavior dill exhibits for classes defined in modules -- or at least enable one of the settings to provide better behavior.
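
The practical consequence can be sketched as follows. This again uses the standard-library pickle rather than dill, with an in-memory stand-in module under the hypothetical name foo_patch_demo; the helper fresh_module is also an illustrative assumption. Re-registering the module from its original source stands in for a second process importing an unmodified foo.py.

```python
import pickle
import sys
import types

# Source of the stand-in module; mirrors the Foo class from the question.
SOURCE = """
class Foo:
    y = 1
    def bar(self, x):
        return x + Foo.y
"""

def fresh_module(name):
    """Create and register a new copy of the stand-in module from SOURCE."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module

# Sender: patch the class at runtime, then pickle an instance.
sender = fresh_module("foo_patch_demo")
sender.Foo.zap = lambda self, x: x + sender.Foo.y
payload = pickle.dumps(sender.Foo())

# Receiver: the module is loaded from its unmodified source.
receiver = fresh_module("foo_patch_demo")
restored = pickle.loads(payload)

had_zap_before = hasattr(sender.Foo(), "zap")
has_zap_after = hasattr(restored, "zap")
print(had_zap_before, has_zap_after)
```

Because the payload holds only a reference to the class, the restored instance gets whatever Foo the receiving side defines, and the dynamically added zap is silently dropped.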

    In sum, the answer is: we didn't have a use case for it, so now that we do… this should be a feature request if it is not already.
