Question
I have a C++ library that is built against the Apache Arrow C++ libraries, with a Python binding written with pybind11. I'd like to write a C++ function that takes a table constructed with PyArrow, like:
void test(arrow::Table test);
Passing in a PyArrow table like:
tab = pa.Table.from_pandas(df)
mybinding.test(tab)
If I write a naive function as above, I get:
TypeError: arrow_test(): incompatible function arguments. The following argument types are supported:
1. (arg0: arrow::Table) -> None
Invoked with: pyarrow.Table
I've also tried to write a function that takes a py::object and calls .cast<arrow::Table>() on it, but the cast fails:
RuntimeError: Unable to cast Python instance to C++ type (compile in debug mode for details)
Does anyone have any idea how to get this to work?
Answer 1:
You have to use the functionality provided in the arrow/python/pyarrow.h header. This header is auto-generated to support unwrapping the Cython pyarrow.Table objects to C++ arrow::Table instances. It is sufficient to build and link to libarrow.so. It also requires the pyarrow Python package to be loaded, but this is solely a runtime dependency, not a compile-time one.
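// Header that declares the wrap/unwrap helpers for pyarrow objects
#include <arrow/python/pyarrow.h>
// Ensure that the pyarrow Python module was loaded (returns 0 on success)
arrow::py::import_pyarrow();
PyObject* pyarrow_table = …
// With pybind11 you can also use
// pybind11::object pyarrow_table = …
// and pass pyarrow_table.ptr() to the unwrap call below
// Convert PyObject* to native C++ object.
// In recent Arrow releases unwrap_table returns an arrow::Result;
// older releases used a Status return with an output parameter instead.
std::shared_ptr<arrow::Table> table = arrow::py::unwrap_table(pyarrow_table).ValueOrDie();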
Source: https://stackoverflow.com/questions/57863751/how-to-convert-pyarrow-table-to-arrow-table-when-interfacing-between-pyarrow-in