How to tidy/fix PyCXX's creation of new-style Python extension-class?

情深已故 2021-01-24 00:44

I've nearly finished rewriting a C++ Python wrapper (PyCXX).

The original allows old and new style extension classes, but also allows one to derive from the new-style c

2 Answers
  •  失恋的感觉
    2021-01-24 01:12

    PyCXX is not convoluted. It does have two bugs, but they can be easily fixed without requiring significant changes to the code.

    When creating a C++ wrapper for the Python API, one encounters a problem: the C++ object model and the Python new-style object model are very different. One fundamental difference is that C++ has a single constructor that both creates and initializes the object, while Python has two stages: tp_new creates the object and performs minimal initialization (or just returns an existing object), and tp_init performs the rest of the initialization.

    PEP 253, which you should probably read in its entirety, says:

    The difference in responsibilities between the tp_new() slot and the tp_init() slot lies in the invariants they ensure. The tp_new() slot should ensure only the most essential invariants, without which the C code that implements the objects would break. The tp_init() slot should be used for overridable user-specific initializations. Take for example the dictionary type. The implementation has an internal pointer to a hash table which should never be NULL. This invariant is taken care of by the tp_new() slot for dictionaries. The dictionary tp_init() slot, on the other hand, could be used to give the dictionary an initial set of keys and values based on the arguments passed in.

    ...

    You may wonder why the tp_new() slot shouldn't call the tp_init() slot itself. The reason is that in certain circumstances (like support for persistent objects), it is important to be able to create an object of a particular type without initializing it any further than necessary. This may conveniently be done by calling the tp_new() slot without calling tp_init(). It is also possible that tp_init() is not called, or called more than once -- its operation should be robust even in these anomalous cases.
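    The split that the PEP describes for the dictionary type can be sketched in plain C++ without the Python headers. Everything here is hypothetical and only mirrors the idea: DictObject, new_dict and init_dict are made-up names, and clearing the table on each init call is just one possible way to stay robust when initialization runs more than once.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dictionary-like extension object.
struct DictObject {
    // Essential invariant: table is never null once new_dict() has returned.
    std::unique_ptr<std::map<std::string, int>> table;
};

// Plays the role of tp_new: establish only the essential invariant.
inline std::unique_ptr<DictObject> new_dict() {
    auto self = std::make_unique<DictObject>();
    self->table = std::make_unique<std::map<std::string, int>>();
    return self;
}

// Plays the role of tp_init: user-level initialization from arguments.
// Clearing first (an assumed policy) keeps the result well defined
// whether this runs zero, one, or several times.
inline void init_dict(DictObject& self,
                      const std::vector<std::pair<std::string, int>>& items) {
    self.table->clear();
    for (const auto& kv : items) {
        (*self.table)[kv.first] = kv.second;
    }
}
```

    An object returned by new_dict() is already safe to use (its table is empty but non-null), even if init_dict() is never called on it.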

    The entire point of a C++ wrapper is to enable you to write nice C++ code. Say, for example, that you want your object to have a data member that can only be initialized during its construction. If you create the object during tp_new, then you cannot reinitialize that data member during tp_init. This will probably force you to hold that data member via some kind of a smart pointer and create it during tp_new. This makes the code ugly.
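    To make the trade-off concrete, here is a small sketch in plain C++ (no Python headers). It is one reading of the paragraph above, not PyCXX code: Account and TwoPhaseAccount are hypothetical names, and the default constructor and init() merely stand in for tp_new and tp_init.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Natural C++: the member is fixed at construction and cannot be reset later.
class Account {
public:
    explicit Account(std::string owner) : owner_(std::move(owner)) {}
    const std::string& owner() const { return owner_; }
private:
    const std::string owner_;  // can only be set in the constructor
};

// Two-phase variant: the object already exists after the tp_new-like step,
// so the member has to live behind a pointer that is filled in later.
class TwoPhaseAccount {
public:
    TwoPhaseAccount() = default;    // tp_new analogue: no owner yet
    void init(std::string owner) {  // tp_init analogue
        owner_ = std::make_unique<const std::string>(std::move(owner));
    }
    bool initialized() const { return owner_ != nullptr; }
    const std::string& owner() const {
        if (!owner_) {
            throw std::logic_error("TwoPhaseAccount used before init");
        }
        return *owner_;
    }
private:
    std::unique_ptr<const std::string> owner_;  // null until init() runs
};
```

    The second class needs a null check on every access and an extra allocation, which is the kind of ugliness the paragraph refers to.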

    The approach PyCXX takes is to separate object construction into two:

    • tp_new creates a dummy object that holds just a pointer to the C++ object, which is created in tp_init. This pointer is initially null.

    • tp_init allocates and constructs the actual C++ object, then updates the pointer in the dummy object created in tp_new to point to it. If tp_init is called more than once it raises a Python exception.
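    The two steps above can be sketched roughly as follows. This is not PyCXX's actual code: Shell, Impl, shell_new and shell_init are hypothetical names, std::unique_ptr is an assumed choice of pointer, and a C++ exception stands in for the Python exception raised on a second tp_init call.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// The real C++ object; it is free to keep members set only at construction.
class Impl {
public:
    explicit Impl(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    const std::string name_;
};

// Dummy object produced by the tp_new-like step.
struct Shell {
    std::unique_ptr<Impl> impl;  // null until shell_init() runs
};

// tp_new analogue: create the shell only; the C++ object does not exist yet.
inline std::unique_ptr<Shell> shell_new() {
    return std::make_unique<Shell>();
}

// tp_init analogue: construct the C++ object and attach it to the shell.
inline void shell_init(Shell& self, std::string name) {
    if (self.impl) {
        // Stands in for raising a Python exception on repeated tp_init.
        throw std::runtime_error("object already initialized");
    }
    self.impl = std::make_unique<Impl>(std::move(name));
}
```

    Every access to the wrapped object then goes through one extra pointer, and a shell whose init step never ran has to be treated as an error.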

    I personally think that the overhead of this approach is too high for my own applications, but it's a legitimate approach. I have my own C++ wrapper around the Python/C API that does all the initialization in tp_new, which is also flawed. There doesn't appear to be a good solution to this problem.
