How to convert a PyArrow table to an Arrow table when interfacing between PyArrow in Python and Arrow in C++

Submitted by 旧时模样 on 2021-02-08 04:37:27

Question


I have a C++ library that is built against the Apache Arrow C++ libraries, with a Python binding written using pybind11. I'd like to write a C++ function that takes a table constructed with PyArrow, like:

void test(arrow::Table test);

Passing in a PyArrow table like:

tab = pa.Table.from_pandas(df)
mybinding.test(tab)

If I write a naive function like the one above, I get:

TypeError: arrow_test(): incompatible function arguments. The following argument types are supported:
    1. (arg0: arrow::Table) -> None

Invoked with: pyarrow.Table

I've also tried to write a function that takes a py::object and .cast<arrow::Table>() but I can't do the casting:

RuntimeError: Unable to cast Python instance to C++ type (compile in debug mode for details)

Does anyone have any idea how to get this to work?


Answer 1:


You have to use the functionality provided in the arrow/python/pyarrow.h header. This header is auto-generated to support unwrapping the Cython pyarrow.Table objects into C++ arrow::Table instances. You need to build and link against libarrow.so and, because the functions declared in this header are exported from it, libarrow_python.so as well. The pyarrow Python package must also be loaded, but that is solely a runtime dependency, not a compile-time one.

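// Header that declares the wrap/unwrap functions for pyarrow objects
#include <arrow/python/pyarrow.h>

// Ensure that the pyarrow Python module is loaded (returns 0 on success)
arrow::py::import_pyarrow();

PyObject* pyarrow_table = …
// With pybind11 you can also use
// pybind11::object pyarrow_table = …
// and pass pyarrow_table.ptr() to the call below

// Convert PyObject* to a native C++ object.
// Recent Arrow releases return an arrow::Result here; older releases
// used a Status return value with an out-parameter instead.
arrow::Result<std::shared_ptr<arrow::Table>> result = arrow::py::unwrap_table(pyarrow_table);
std::shared_ptr<arrow::Table> table = result.ValueOrDie();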


Source: https://stackoverflow.com/questions/57863751/how-to-convert-pyarrow-table-to-arrow-table-when-interfacing-between-pyarrow-in
