How to use 2D Arrays in CUDA?

故里飘歌 2020-11-27 02:55

I am new to CUDA. How do I allocate a 2D array of size M x N, and how do I traverse that array in CUDA? Please give me some sample code.

2 answers
  • 2020-11-27 03:24

    The simplest approach is to store the two-dimensional array A in vector (flattened, one-dimensional) form. For example, suppose you have a matrix A of size n x m. In the pointer-to-pointer representation, its (i,j) element is

    A[i][j] (with i=0..n-1 and j=0..m-1). 
    

    In vector form, with rows stored one after another, the same element is

    A[i*m+j] (with i=0..n-1 and j=0..m-1), since each of the n rows holds m elements.
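
To make the index mapping concrete, here is a small host-side sketch. The helper name `flat_index` and the 3 x 4 size are made up for illustration; the point is only that the row stride is the number of columns, m.

```cuda
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: row-major offset of element (i, j) in a matrix with m columns.
// Marked __host__ __device__ so the same function could also be called from a kernel.
__host__ __device__ inline int flat_index(int i, int j, int m) {
    return i * m + j;
}

int main() {
    const int n = 3, m = 4;   // illustrative size: n rows, m columns
    double *A = (double*)malloc(n * m * sizeof(double));
    if (A == NULL) return 1;

    // Traverse in row-major order: j varies fastest, so memory is touched contiguously.
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            A[flat_index(i, j, m)] = 10.0 * i + j;

    // Element (1,2) lives at offset 1*4 + 2 = 6.
    printf("A(1,2) = %g at offset %d\n", A[flat_index(1, 2, m)], flat_index(1, 2, m));
    free(A);
    return 0;
}
```

The last element (n-1, m-1) lands at offset n*m-1, so the flattened buffer needs exactly n*m elements.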
    

    Using a one-dimensional array also simplifies the copy to the device, which becomes a single call:

    double *A,*dev_A; // A - host pointer, dev_A - device pointer
    A=(double*)malloc(n*m*sizeof(double));
    cudaMalloc((void**)&dev_A,n*m*sizeof(double));
    cudaMemcpy(dev_A,A,n*m*sizeof(double),cudaMemcpyHostToDevice); // sizes assume double elements; pass dev_A and A themselves, not their addresses
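
As a sanity check on the copy step, the following sketch sends a flattened n x m matrix to the device and straight back, then compares the two host buffers. The function name `roundtrip`, the status codes, and the 3 x 4 size are assumptions made for this example, not part of the answer above. Note that `cudaMemcpy` receives the pointer values (`dev_A`, `A`), while `cudaMalloc` receives the address of the device pointer.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: copy an n x m row-major host matrix to the device and back.
// Returns 0 if the data survived the round trip, nonzero on any error or mismatch.
int roundtrip(int n, int m) {
    size_t bytes = (size_t)n * m * sizeof(double);
    double *A = (double*)malloc(bytes);      // host source
    double *back = (double*)malloc(bytes);   // host destination
    double *dev_A = NULL;                    // device buffer
    if (!A || !back) { free(A); free(back); return 1; }

    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            A[i * m + j] = i * m + j;        // row-major fill

    int status = 0;
    if (cudaMalloc((void**)&dev_A, bytes) != cudaSuccess) status = 2;
    if (!status && cudaMemcpy(dev_A, A, bytes, cudaMemcpyHostToDevice) != cudaSuccess) status = 3;
    if (!status && cudaMemcpy(back, dev_A, bytes, cudaMemcpyDeviceToHost) != cudaSuccess) status = 4;

    if (!status)
        for (int k = 0; k < n * m; ++k)
            if (back[k] != A[k]) { status = 5; break; }

    cudaFree(dev_A);   // freeing a null pointer is a no-op
    free(A);
    free(back);
    return status;
}

int main() {
    int status = roundtrip(3, 4);
    if (status == 0) printf("round trip OK\n");
    else printf("round trip failed (status %d)\n", status);
    return status;
}
```

Checking the return value of every CUDA call, as done here, is usually the quickest way to notice a wrong pointer or size.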
    
  • 2020-11-27 03:46

    How to allocate a 2D array:

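    #define BLOCK_SIZE 16
    #define GRID_SIZE 1

    // YourKernel must be declared or defined before main() in the actual source file.
    int main(){
        int h_A[BLOCK_SIZE][BLOCK_SIZE];   // host arrays
        int h_B[BLOCK_SIZE][BLOCK_SIZE];
        int (*d_A)[BLOCK_SIZE];            // device pointers to rows of BLOCK_SIZE ints
        int (*d_B)[BLOCK_SIZE];

        /* h_A initialization */

        cudaMalloc((void**)&d_A, sizeof(h_A)); // a kernel can only dereference device memory
        cudaMalloc((void**)&d_B, sizeof(h_B));
        cudaMemcpy(d_A, h_A, sizeof(h_A), cudaMemcpyHostToDevice);

        dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE); // so your threads are BLOCK_SIZE*BLOCK_SIZE, 256 in this case
        dim3 dimGrid(GRID_SIZE, GRID_SIZE); // 1*1 blocks in a grid

        YourKernel<<<dimGrid, dimBlock>>>(d_A,d_B); // kernel invocation

        cudaMemcpy(h_B, d_B, sizeof(h_B), cudaMemcpyDeviceToHost); // copy the result back
        cudaFree(d_A); cudaFree(d_B);
    }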
    

    How to traverse that array:

    __global__ void YourKernel(int d_A[BLOCK_SIZE][BLOCK_SIZE], int d_B[BLOCK_SIZE][BLOCK_SIZE]){
        int row = blockIdx.y * blockDim.y + threadIdx.y;
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        // Height and width are both BLOCK_SIZE here; skip threads outside the array.
        if (row >= BLOCK_SIZE || col >= BLOCK_SIZE) return;
        /* whatever you want to do with d_A[row][col] and d_B[row][col] */
    }
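
The bounds check in the kernel matters most when the matrix size is not a multiple of the block size, because the grid then has to be rounded up and some threads fall outside the array. The sketch below shows that case for a general h x w matrix stored in flattened form. The names `scale_kernel` and `blocks_for`, the doubling operation, and the 37 x 50 size are all chosen for illustration only.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define BLOCK_SIZE 16

// Each thread handles one element (row, col) of an h x w row-major matrix.
__global__ void scale_kernel(const int *d_A, int *d_B, int h, int w) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= h || col >= w) return;   // threads past the edge do nothing
    d_B[row * w + col] = 2 * d_A[row * w + col];
}

// Number of blocks needed to cover len elements (ceiling division).
int blocks_for(int len) { return (len + BLOCK_SIZE - 1) / BLOCK_SIZE; }

int main() {
    const int h = 37, w = 50;           // deliberately not multiples of BLOCK_SIZE
    size_t bytes = (size_t)h * w * sizeof(int);
    int *A = (int*)malloc(bytes), *B = (int*)malloc(bytes);
    int *d_A = NULL, *d_B = NULL;
    if (!A || !B) return 1;
    for (int k = 0; k < h * w; ++k) A[k] = k;

    if (cudaMalloc((void**)&d_A, bytes) != cudaSuccess ||
        cudaMalloc((void**)&d_B, bytes) != cudaSuccess) return 1;
    cudaMemcpy(d_A, A, bytes, cudaMemcpyHostToDevice);

    dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE);
    dim3 dimGrid(blocks_for(w), blocks_for(h));   // x covers columns, y covers rows
    scale_kernel<<<dimGrid, dimBlock>>>(d_A, d_B, h, w);

    // This blocking copy runs only after the kernel has finished.
    cudaError_t err = cudaMemcpy(B, d_B, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) { printf("CUDA error: %s\n", cudaGetErrorString(err)); return 1; }

    int ok = 1;
    for (int k = 0; k < h * w; ++k) if (B[k] != 2 * A[k]) ok = 0;
    puts(ok ? "all elements doubled" : "mismatch");

    cudaFree(d_A); cudaFree(d_B); free(A); free(B);
    return ok ? 0 : 1;
}
```

With these sizes the grid is 4 x 3 blocks, that is 64 x 48 threads for a 50 x 37 matrix, so the early return is what keeps the extra threads from writing out of bounds.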
    

    I hope this is helpful.

    You can also refer to the Matrix Multiplication example on page 22 of the CUDA Programming Guide.
