Question
My Python application creates an array shared between processes using multiprocessing.RawArray. To speed up computation, I now want to modify this array from within a C++ function. What is a safe way to pass a pointer to the underlying memory to a C++ function that accepts a void * argument?
The function is declared in a pxd file as:
cdef extern from 'lib/lib.hpp':
    void fun(void *buffer)
My naive attempt so far:
buffer = multiprocessing.RawArray(ctypes.c_ubyte, 10000)
clib.fun(ctypes.cast(self.queue_obj_buffer, ctypes.c_void_p))
This fails Cython compilation with the following error: Cannot convert Python object to 'void *'
I also tried ctypes.addressof with similar results.
I do understand that I will need to query this pointer from every participating process individually, because the same region of memory may be mapped at a different address in each process. That is not an issue; so far I am just struggling to get the pointer at all. Should I use a different approach altogether and allocate the shared memory from within C++, or is it okay to do what I am doing?
Answer 1:
multiprocessing.RawArray is a ctypes.Array, so the address of the underlying buffer can be obtained via ctypes.addressof. This address can be reinterpreted as void *. Here is an example:
%%cython
# a small function for testing purposes:
cdef extern from *:
    """
    unsigned char get_first(void *ptr){
        unsigned char *ptr_as_ubytes = (unsigned char *)ptr;
        return ptr_as_ubytes[0];
    }
    """
    unsigned char get_first(void *ptr)

import ctypes
def first_element(buffer):
    cdef size_t ptr_address = ctypes.addressof(buffer)  # size_t is big enough to hold the address
    return get_first(<void*> ptr_address)
Using <void*>ctypes.addressof(buffer) won't work, because Cython has no means of automatically converting a PyObject to void *. The (less readable) one-liner would be <void*><size_t> ctypes.addressof(buffer):
- Cython can convert a Python object to a raw size_t (or any integer) C value.
- A size_t C value can be reinterpreted as void * in C.
Here is a small test of the first_element function defined above:
import multiprocessing
import ctypes
buffer = multiprocessing.RawArray(ctypes.c_ubyte, 10000)
buffer[0]=42
first_element(buffer)
# 42
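The integer-address step can also be observed in plain Python, without Cython. The following is only a minimal sketch: raw_address and fill_prefix are hypothetical helper names, and ctypes.memset is used as a stand-in for a C function that takes a void * (it is not the fun function from the question). It shows that ctypes.addressof yields a plain int, whereas ctypes.cast yields a c_void_p Python object, which is presumably why Cython refused to convert it.

```python
import ctypes
from multiprocessing import RawArray


def raw_address(buffer):
    """Return the address of the buffer's first byte as a plain Python int."""
    return ctypes.addressof(buffer)


def fill_prefix(buffer, value, count):
    """Stand-in for a C function taking void*: memset writes through the raw address."""
    ctypes.memset(raw_address(buffer), value, count)


buffer = RawArray(ctypes.c_ubyte, 16)          # zero-initialised shared array
address = raw_address(buffer)                  # plain int
casted = ctypes.cast(buffer, ctypes.c_void_p)  # ctypes object wrapping the same address

print(type(address).__name__)    # int
print(type(casted).__name__)     # c_void_p
print(casted.value == address)   # True

fill_prefix(buffer, 7, 4)
print(list(buffer[:6]))          # [7, 7, 7, 7, 0, 0]
```

The address is only meaningful inside the process that computed it, so each worker process would call ctypes.addressof on its own handle to the shared array.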
If the C function does not expect a void * but, for example, contiguous memory of type unsigned char, then the approach from @oz1 is safer: it not only protects the data from being wrongly reinterpreted but also automatically checks that the buffer is contiguous and has the right number of dimensions (done via typing as unsigned char[::1]).
Answer 2:
RawArray should support the buffer protocol, which makes it easy to get the underlying pointer, since Cython has good support for it via memory views. The following code should work:
%%cython
import ctypes
from multiprocessing.sharedctypes import RawArray

ctypedef unsigned char ubyte

cdef void func(void* buffer, int size):
    cdef ubyte *buf = <ubyte*>buffer
    cdef int i
    for i in range(size):
        buf[i] += 1

def test():
    cdef ubyte[::1] view = RawArray(ctypes.c_ubyte, [1,2,3,4,5])
    func(<void*>&view[0], len(view))
    print(list(view))

test()  # [2, 3, 4, 5, 6]
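The buffer-protocol claim can be checked in plain Python before involving Cython. This is a rough sketch under stated assumptions: describe_buffer is a hypothetical helper, and the ctypes alias created with from_buffer merely imitates what func does through the pointer. It is not the Cython memory view itself.

```python
import ctypes
from multiprocessing.sharedctypes import RawArray


def describe_buffer(obj):
    """Report the buffer-protocol properties that a ubyte[::1] view relies on."""
    with memoryview(obj) as view:
        return {
            "ndim": view.ndim,
            "nbytes": view.nbytes,
            "contiguous": view.contiguous,
            "readonly": view.readonly,
        }


shared = RawArray(ctypes.c_ubyte, [1, 2, 3, 4, 5])
info = describe_buffer(shared)
print(info)

# from_buffer needs a writable buffer and shares memory with it (no copy).
alias = (ctypes.c_ubyte * len(shared)).from_buffer(shared)
for i in range(len(alias)):
    alias[i] += 1

print(list(shared))  # [2, 3, 4, 5, 6]
```

Because the alias shares memory with the RawArray, the increments show up in shared, just as writes through the pointer do in the Cython version.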
Based on your description, you may also want to have a look at Cython's support for shared-memory parallelism.
Source: https://stackoverflow.com/questions/60611145/passing-multiprocessing-rawarray-to-a-c-function