Passing multiprocessing.RawArray to a C++ function

Posted by 旧巷老猫 on 2020-03-25 05:18:30

Question


My Python application creates an array shared between processes using multiprocessing.RawArray. Now to speed up computation I want to modify this array from within a C++ function. What is a safe way to pass a pointer to the underlying memory to a C++ function that accepts a void * argument?

The function is defined in a pxd file as:

cdef extern from 'lib/lib.hpp':
    void fun(void *buffer)

My naive attempt so far:

buffer = multiprocessing.RawArray(ctypes.c_ubyte, 10000)
clib.fun(ctypes.cast(buffer, ctypes.c_void_p))

This fails Cython compilation with the following error: Cannot convert Python object to 'void *'. I also tried ctypes.addressof, with similar results.

I do understand that I will need a way to query this pointer from every participating process individually, because the same region of memory will be mapped at different addresses in each process's address space. That is not an issue; so far I'm just struggling to get the pointer at all. Should I use a different approach altogether and allocate shared memory from within C++, or is it okay to do what I am doing?


Answer 1:


multiprocessing.RawArray is a ctypes.Array, so the address of the underlying buffer can be obtained via ctypes.addressof. This address can be reinterpreted as void *. Here is an example:

%%cython
# a small function for testing purposes:
cdef extern from *:
    """
    unsigned char get_first(void *ptr){
       unsigned char *ptr_as_ubytes = (unsigned char *)ptr;
       return ptr_as_ubytes[0];
    }
    """
    unsigned char get_first(void *ptr)


import ctypes
def first_element(buffer):
    cdef size_t ptr_address = ctypes.addressof(buffer) # size_t is big enough to hold the address
    return get_first(<void*> ptr_address)

Using <void*>ctypes.addressof(buffer) won't work, because Cython has no means of automatically converting a Python object to void *. The (less readable) one-liner is <void*><size_t>ctypes.addressof(buffer), which works because:

  • Cython can convert a Python object to a raw size_t (or any other integer) C value.
  • A size_t C value can be reinterpreted as void * in C.
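The two steps in the list can also be followed in plain Python, without Cython. The snippet below is only a sketch of the same idea using ctypes casts (ctypes.c_void_p and ctypes.cast stand in for the Cython casts; the array length 16 and the variable names are chosen for illustration):

```python
import ctypes
from multiprocessing import RawArray

buffer = RawArray(ctypes.c_ubyte, 16)
buffer[0] = 42

# Step 1: the address of the underlying memory is an ordinary Python int.
address = ctypes.addressof(buffer)

# Step 2: reinterpret that integer as a pointer. This is the ctypes
# counterpart of the Cython cast <void*><size_t>address.
as_void_p = ctypes.c_void_p(address)
as_ubytes = ctypes.cast(as_void_p, ctypes.POINTER(ctypes.c_ubyte))

# Reading through the pointer sees the value stored via the RawArray.
first = as_ubytes[0]

# Writing through the pointer is visible in the RawArray,
# so both names refer to the same memory.
as_ubytes[1] = 7
second = buffer[1]
```

The address is only valid in the process that computed it and only while buffer is alive, so it should be recomputed in each process rather than passed around.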

Here is a small test of the example above:

import multiprocessing
import ctypes
buffer = multiprocessing.RawArray(ctypes.c_ubyte, 10000)
buffer[0]=42
first_element(buffer)
# 42
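For comparison, the same address can be handed to a C routine that takes a void * without writing any Cython. The following is a minimal sketch that uses ctypes.memset (ctypes' wrapper around the C memset function) as a stand-in for the real C++ function; the array length and fill value are arbitrary:

```python
import ctypes
from multiprocessing import RawArray

# A RawArray created from an integer size starts out zeroed.
buffer = RawArray(ctypes.c_ubyte, 8)

# Integer address of the underlying shared memory in this process.
address = ctypes.addressof(buffer)

# ctypes.memset accepts an integer address as its destination,
# just as a C function taking void * would receive it.
# Only the first 4 bytes are filled, staying inside the 8-byte buffer.
ctypes.memset(address, 0xFF, 4)

contents = list(buffer)
```

The C side has no idea how large the buffer is, so the length (here available as ctypes.sizeof(buffer)) has to be passed or agreed on separately.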

If the C function expects not a void * but, for example, contiguous memory of type unsigned char, then the approach from @oz1 is safer: it not only protects the data from being wrongly reinterpreted but also automatically checks that the buffer is contiguous and has the right number of dimensions (done by typing it as unsigned char[::1]).
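The properties that such a typed memoryview checks can be inspected from plain Python, because a RawArray exposes the buffer protocol. A rough sketch (this only shows the metadata; it is not the check Cython itself performs):

```python
import ctypes
from multiprocessing import RawArray

buffer = RawArray(ctypes.c_ubyte, 10000)
view = memoryview(buffer)  # zero-copy view obtained via the buffer protocol

ndim = view.ndim                  # number of dimensions
itemsize = view.itemsize          # bytes per element
is_contiguous = view.c_contiguous
nbytes = view.nbytes              # total size in bytes

# An array with a wider element type reports a different item size,
# which is the kind of mismatch a typed memoryview would reject.
wide = memoryview(RawArray(ctypes.c_uint32, 4))
wide_itemsize = wide.itemsize
```

A bare void * carries none of this information, which is why the pointer-based approach cannot catch a wrong element type or a non-contiguous buffer.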




Answer 2:


RawArray supports the buffer protocol, so it is easy to get the underlying pointer: Cython has good support for the protocol via typed memoryviews. The following code should work:

%%cython

import ctypes
from multiprocessing.sharedctypes import RawArray

ctypedef unsigned char ubyte

cdef void func(void* buffer, int size):
    cdef ubyte *buf = <ubyte*>buffer
    cdef int i
    for i in range(size):
        buf[i] += 1


def test():
    cdef ubyte[::1] view = RawArray(ctypes.c_ubyte, [1,2,3,4,5])
    func(<void*>&view[0], len(view))
    print(list(view))

test()  # [2, 3, 4, 5, 6]

From your description, you should also have a look at Cython's support for shared-memory parallelism.



Source: https://stackoverflow.com/questions/60611145/passing-multiprocessing-rawarray-to-a-c-function
