Returning NumPy arrays via pybind11

情话喂你 2020-12-08 07:57

I have a C++ function computing a large tensor which I would like to return to Python as a NumPy array via pybind11.

From the documentation of pybind11, it seems li

1 Answer
  • 2020-12-08 08:25

    A few comments (then a working implementation).

    • pybind11's C++ object wrappers around Python types (like pybind11::object, pybind11::list, and, in this case, pybind11::array_t<T>) are really just wrappers around an underlying Python object pointer. In this respect they are already taking on the role of a shared pointer wrapper, so there's no point in wrapping one in a unique_ptr: returning the py::array_t<T> object directly is already essentially just returning a glorified pointer.
    • pybind11::array_t can be constructed directly from a data pointer, so you can skip the py::buffer_info intermediate step and just give the shape and strides directly to the pybind11::array_t constructor. A numpy array constructed this way won't own its data; it just references it (that is, the numpy OWNDATA flag will be False).
    • Memory ownership can be tied to the life of a Python object, but you're still on the hook for doing the deallocation properly. pybind11 provides a py::capsule class to help you do exactly this. What you want to do is make the numpy array depend on this capsule as its base (parent) object by passing it as the base argument to array_t. The numpy array then holds a reference to the capsule, keeping it alive as long as the array itself is alive, and the capsule's cleanup function is invoked once it is no longer referenced.
    • The c_style flag in the older (pre-2.2) releases only had an effect on new arrays, i.e. when not passing a value pointer. That was fixed in the 2.2 release to also affect the automatic strides if you specify only shapes but not strides. It has no effect at all if you specify the strides directly yourself (as I do in the example below).
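
The strides mentioned in the points above are byte offsets per axis. As a rough sketch of how C-style (row-major) strides follow from a shape, here is a small standard-library-only helper; `c_contiguous_strides` is a hypothetical name used for illustration and is not part of pybind11:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a pybind11 API): byte strides of a
// C-contiguous (row-major) array with the given shape and element size.
std::vector<std::ptrdiff_t> c_contiguous_strides(
        const std::vector<std::ptrdiff_t> &shape, std::ptrdiff_t itemsize) {
    assert(itemsize > 0);
    std::vector<std::ptrdiff_t> strides(shape.size());
    std::ptrdiff_t step = itemsize;
    // Walk from the last (fastest-varying) axis back to the first:
    // each axis steps over everything contained in the axes after it.
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = step;
        step *= shape[i];
    }
    return strides;
}
```

For a shape of {100, 1000, 1000} and 8-byte doubles this yields {1000*1000*8, 1000*8, 8}, the same values as the hand-written strides in the module code in this answer.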

    So, putting the pieces together, this code is a complete pybind11 module that demonstrates how you can accomplish what you're looking for (and includes some C++ output to demonstrate that it is indeed working correctly):

    #include <iostream>
    #include <pybind11/pybind11.h>
    #include <pybind11/numpy.h>
    
    namespace py = pybind11;
    
    // PYBIND11_MODULE requires pybind11 >= 2.2; it supersedes the
    // deprecated PYBIND11_PLUGIN macro.
    PYBIND11_MODULE(numpywrap, m) {
        m.def("f", []() {
            // Allocate and initialize some data; make this big so
            // we can see the impact on the process memory use:
            constexpr size_t size = 100*1000*1000;
            double *foo = new double[size];
            for (size_t i = 0; i < size; i++) {
                foo[i] = static_cast<double>(i);
            }
    
            // Create a Python object that will free the allocated
            // memory when destroyed:
            py::capsule free_when_done(foo, [](void *f) {
                double *foo = reinterpret_cast<double *>(f);
                std::cerr << "Element [0] = " << foo[0] << "\n";
                std::cerr << "freeing memory @ " << f << "\n";
                delete[] foo;
            });
    
            return py::array_t<double>(
                {100, 1000, 1000}, // shape
                {1000*1000*8, 1000*8, 8}, // C-style contiguous strides for double
                foo, // the data pointer
                free_when_done); // numpy array references this parent
        });
    }
    

    Compiling that and invoking it from Python shows it working:

    >>> import numpywrap
    >>> z = numpywrap.f()
    >>> # the python process is now taking up a bit more than 800MB memory
    >>> z[1,1,1]
    1001001.0
    >>> z[0,0,100]
    100.0
    >>> z[99,999,999]
    99999999.0
    >>> z[0,0,0] = 3.141592
    >>> del z
    Element [0] = 3.14159
    freeing memory @ 0x7fd769f12010
    >>> # python process memory size has dropped back down
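
The values read back in the session above follow directly from the strides: the buffer was filled with foo[i] = i, so each element's value equals its flat position in the buffer. A minimal sketch of that arithmetic, using a hypothetical helper `flat_index` (an illustrative name, not a pybind11 or numpy function):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: convert a multi-dimensional index into a flat
// element position, given per-axis byte strides and the element size.
std::ptrdiff_t flat_index(const std::vector<std::ptrdiff_t> &index,
                          const std::vector<std::ptrdiff_t> &byte_strides,
                          std::ptrdiff_t itemsize) {
    assert(index.size() == byte_strides.size());
    std::ptrdiff_t bytes = 0;
    // Sum each axis index times that axis's stride to get a byte offset.
    for (std::size_t i = 0; i < index.size(); ++i) {
        bytes += index[i] * byte_strides[i];
    }
    // Convert the byte offset into an element count.
    return bytes / itemsize;
}
```

With strides {1000*1000*8, 1000*8, 8} and 8-byte elements, index (1, 1, 1) maps to flat position 1001001, (0, 0, 100) to 100, and (99, 999, 999) to 99999999, matching the values printed by z[1,1,1], z[0,0,100], and z[99,999,999].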
    