Passing 3-dimensional numpy array to C

旧巷少年郎 2021-02-01 09:43

I'm writing a C extension to my Python program for speed purposes, and running into some very strange behaviour trying to pass in a 3-dimensional numpy array. It works with a 2

4 answers
  • 2021-02-01 10:22

    Rather than converting to a c-style array, I usually access numpy array elements directly using PyArray_GETPTR (see http://docs.scipy.org/doc/numpy/reference/c-api.array.html#data-access).

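    For instance, to access an element of a 3-dimensional numpy array of type double, use double elem = *(double *)PyArray_GETPTR3((PyArrayObject *)list3_obj, i, j, k). The cast to PyArrayObject * is needed when list3_obj is declared as a PyObject *.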

    For your application, you could detect the correct number of dimensions for each array using PyArray_NDIM, then access elements using the appropriate version of PyArray_GETPTR.
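
    As a rough illustration of what the PyArray_GETPTR macros compute, here is a minimal plain-C sketch that needs no numpy headers. ArrayView, view_getptr3 and view_get are hypothetical names made up for this example; they stand in for a numpy array's data pointer, strides and dimension count, which the real macros read from the PyArrayObject.

```c
#include <stddef.h>

/* Hypothetical stand-in for the parts of a numpy array used here. */
typedef struct {
    char *data;            /* first byte of the buffer */
    int ndim;              /* number of dimensions in use (1 to 3) */
    ptrdiff_t strides[3];  /* bytes to step per index in each dimension */
} ArrayView;

/* Address of element (i, j, k): base pointer plus index times stride, in bytes. */
static void *view_getptr3(const ArrayView *a, ptrdiff_t i, ptrdiff_t j, ptrdiff_t k)
{
    return a->data + i * a->strides[0] + j * a->strides[1] + k * a->strides[2];
}

/* Pick the addressing formula from the dimension count; unused indices are ignored. */
static double view_get(const ArrayView *a, ptrdiff_t i, ptrdiff_t j, ptrdiff_t k)
{
    switch (a->ndim) {
    case 1:  return *(double *)(a->data + i * a->strides[0]);
    case 2:  return *(double *)(a->data + i * a->strides[0] + j * a->strides[1]);
    default: return *(double *)view_getptr3(a, i, j, k);
    }
}
```

    Because the address is computed from the strides, this style of access does not depend on the array being contiguous, which is one reason to prefer it over building pointer tables.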

  • 2021-02-01 10:23

    I already mentioned this in a comment, but I hope fleshing it out a little helps make it clearer.

    When you're working with numpy arrays in C, it's good to be explicit about the dtype of your arrays. Specifically, it looks like you're declaring your pointers as double ***list3, but the way you're creating l3 in your Python code you'll get an array with an integer dtype (npy_intp, I think) rather than double. You can fix this by explicitly passing the dtype when creating your arrays.

    import cmod, numpy
    l2 = numpy.array([[1.0,2.0,3.0],
                      [4.0,5.0,6.0],
                      [7.0,8.0,9.0],
                      [3.0, 5.0, 0.0]], dtype="double")
    
    l3 = numpy.array([[[2,7, 1, 11], [6, 3, 9, 12]],
                      [[1, 10, 13, 15], [4, 2, 6, 2]]], dtype="double")
    
    cmod.func(l2, l3)
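
    To see why the dtype matters, here is a small plain-C sketch of what goes wrong when integer data is read through a double pointer. It assumes the integers are stored as 64-bit values and that double is a 64-bit IEEE 754 type, which is typical but not guaranteed; the function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 8 bytes of a 64-bit integer as a double. This mimics
   reading integer array data through a double pointer. */
static double int64_bytes_as_double(int64_t value)
{
    double out;
    memcpy(&out, &value, sizeof out);
    return out;
}

/* Proper numeric conversion, for comparison. */
static double int64_value_as_double(int64_t value)
{
    return (double)value;
}
```

    Under those assumptions, the reinterpreted bytes of a small integer such as 7 come out as a tiny value very close to zero rather than 7.0, so the C code sees garbage even though no error is raised.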
    

    Another note: because of the way Python works, it's nearly impossible for "line A" and "line B" to have any effect on the C code whatsoever. I know that this seems to conflict with your empirical experience, but I'm pretty sure on this point.

    I'm a little less sure about this, but based on my experience with C, bus errors and segfaults are not deterministic. They depend on memory allocation, alignment, and addresses. In some situations code seems to run fine 10 times and then fails on the 11th run even though nothing has changed.

    Have you considered using Cython? I know it's not an option for everyone, but if it is an option you could get nearly C-level speedups using typed memoryviews.

  • 2021-02-01 10:26

    According to http://docs.scipy.org/doc/numpy/reference/c-api.array.html?highlight=pyarray_ascarray#PyArray_AsCArray:

    Note The simulation of a C-style array is not complete for 2-d and 3-d arrays. For example, the simulated arrays of pointers cannot be passed to subroutines expecting specific, statically-defined 2-d and 3-d arrays. To pass to functions requiring those kind of inputs, you must statically define the required array and copy data.
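
    The copy that the note describes could look roughly like the plain-C sketch below. It assumes the shape is fixed at compile time (2 x 2 x 4 here, chosen to match the l3 example); D0, D1, D2, sum_static and copy_to_static are hypothetical names used only for illustration.

```c
#include <string.h>

enum { D0 = 2, D1 = 2, D2 = 4 };

/* A routine that expects a statically-defined 3-d array. A double ***
   pointer table cannot be passed to it directly. */
static double sum_static(double a[D0][D1][D2])
{
    double total = 0.0;
    for (int i = 0; i < D0; i++)
        for (int j = 0; j < D1; j++)
            for (int k = 0; k < D2; k++)
                total += a[i][j][k];
    return total;
}

/* Copy row by row from a pointer-table view into the static array. */
static void copy_to_static(double ***src, double dst[D0][D1][D2])
{
    for (int i = 0; i < D0; i++)
        for (int j = 0; j < D1; j++)
            memcpy(dst[i][j], src[i][j], D2 * sizeof(double));
}
```

    After copy_to_static, the static array can be handed to sum_static or to any other routine with a fixed-shape parameter.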

    I think that this means that PyArray_AsCArray returns a block of memory with the data in it in C order. However, to access that data, more information is needed (see http://www.phy225.dept.shef.ac.uk/mediawiki/index.php/Arrays,_dynamic_array_allocation). One way to get it is to know the dimensions ahead of time, declare an array, and then copy the data in, in the right order. However, I suspect that the more general case is more useful: you don't know the dimensions until they are returned. I think that the following code will create the necessary C pointer framework to allow the data to be addressed.

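    #include <Python.h>
    #include <numpy/arrayobject.h>
    #include <stdio.h>
    #include <stdlib.h>

    static PyObject* func(PyObject* self, PyObject* args) {
        PyObject *list2_obj;
        PyObject *list3_obj;
        if (!PyArg_ParseTuple(args, "OO", &list2_obj, &list3_obj)) return NULL;

        double **list2;
        double ***list3;

        // Pointer tables for the final version
        double **final_array2;
        double ***final_array3;

        // For loops (npy_intp matches the type of the dims arrays)
        npy_intp i, j;

        // One per array coming back ...
        npy_intp dims2[2];
        npy_intp dims3[3];

        // Create C arrays from numpy objects. A separate descriptor is requested
        // for each call in case the conversion consumes the reference.
        // On success, list2_obj/list3_obj are replaced by the converted arrays.
        if (PyArray_AsCArray(&list2_obj, (void *)&list2, dims2, 2,
                             PyArray_DescrFromType(NPY_DOUBLE)) < 0) {
            PyErr_SetString(PyExc_TypeError, "error converting to c array");
            return NULL;
        }
        if (PyArray_AsCArray(&list3_obj, (void *)&list3, dims3, 3,
                             PyArray_DescrFromType(NPY_DOUBLE)) < 0) {
            PyArray_Free(list2_obj, list2);
            PyErr_SetString(PyExc_TypeError, "error converting to c array");
            return NULL;
        }

        // The converted arrays hold their data as one C-ordered block of doubles
        double *data2 = (double *)PyArray_DATA((PyArrayObject *)list2_obj);
        double *data3 = (double *)PyArray_DATA((PyArrayObject *)list3_obj);

        // Create the pointer arrays needed to access the data
        // (calloc error checks omitted; assumes non-empty arrays)

        // 2D array: one pointer per row
        final_array2 = calloc(dims2[0], sizeof(double *));
        for (i=0; i<dims2[0]; i++) final_array2[i] = data2 + i*dims2[1];

        // 3D array: one block of row pointers, sliced per plane
        final_array3    = calloc(dims3[0], sizeof(double **));
        final_array3[0] = calloc(dims3[0]*dims3[1], sizeof(double *));
        for (i=0; i<dims3[0]; i++) {
             final_array3[i] = final_array3[0] + i*dims3[1];
             for (j=0; j<dims3[1]; j++) {
                 final_array3[i][j] = data3 + (i*dims3[1] + j)*dims3[2];
             }
        }

        // The indices below assume the example shapes (4x3 and 2x2x4)
        printf("2D: %f, 3D: %f.\n", final_array2[3][1], final_array3[1][0][2]);
        // Do stuff with the arrays

        // When ready to complete, free the array access stuff
        free(final_array2);

        free(final_array3[0]);
        free(final_array3);

        // Release what PyArray_AsCArray allocated (pointer tables and array references)
        PyArray_Free(list2_obj, list2);
        PyArray_Free(list3_obj, list3);

        Py_RETURN_NONE;
    }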
    

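    npy_intp is numpy's own signed integer type for sizes and indices. It is as wide as a pointer (usually 64 bits on 64-bit platforms), so it is generally not the same as int. Loop counters that are compared against the dims arrays are best declared as npy_intp too; if you need plain int values, convert dims2 and dims3 into int arrays before using them.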
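
    If plain int extents are really needed, a checked conversion might look like the sketch below. It uses intptr_t from <stdint.h> as a stand-in for npy_intp so that it compiles without the numpy headers (an assumption about the two types having the same width), and dims_to_int is a hypothetical helper name.

```c
#include <limits.h>
#include <stdint.h>

/* Copy nd extents into an int array.
   Returns 0 on success, -1 if any extent is negative or does not fit in int. */
static int dims_to_int(const intptr_t *dims, int nd, int *out)
{
    for (int d = 0; d < nd; d++) {
        if (dims[d] < 0 || dims[d] > INT_MAX) return -1;
        out[d] = (int)dims[d];
    }
    return 0;
}
```

    Checking the range before the cast avoids silently truncating a very large extent on platforms where the index type is wider than int.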

  • 2021-02-01 10:34

    There was a bug in the numpy C-API that should be fixed now:

    https://github.com/numpy/numpy/pull/5314
