Getting wrong results when calculating on GPU (python3.5+numba+CUDA8.0)

你离开我真会死。 posted on 2019-12-13 10:09:30

Question


I want to compute the sums of different parts of an array. When I ran my code, the printed output revealed two problems.

Problem 1:

Described in detail here. It has been solved, and it may not have been a real problem.

Problem 2:

In my code, I assigned different values to sbuf[0,2], sbuf[1,2], sbuf[2,2] than to sbuf[0,3], sbuf[1,3], sbuf[2,3].

However, after cuda.syncthreads(), the values became identical between sbuf[0,2] and sbuf[0,3], sbuf[1,2] and sbuf[1,3], and sbuf[2,2] and sbuf[2,3].

This directly makes the values of Xi_s, Xi1_s and Yi_s wrong.

These are my guesses, based on what was printed inside the kernel.

@talonmies said that relying on print statements inside kernels like this is dangerous.

So I would like to know whether there is a useful way to debug my code other than printing inside kernels.

    ...

@cuda.jit
def calcu_T(D, T):
  ...

                    if bx==1 and tx==1:
                        print('5,c_x,c_y,L,c_index,bx,tx,ty,sbuf[0,ty],sbuf[1,ty],sbuf[2,ty],',c_x,',',c_y,',',L,',',c_index,',',bx,',',tx,',',ty,',',sbuf[0,ty],',',sbuf[1,ty],',',sbuf[2,ty])

                    cuda.syncthreads()

                    if bx==1 and tx==1:
                        print('1,c_x,c_y,L,c_index,bx,tx,ty,sbuf[0,ty],sbuf[1,ty],sbuf[2,ty],',c_x,',',c_y,',',L,',',c_index,',',bx,',',tx,',',ty,',',sbuf[0,ty],',',sbuf[1,ty],',',sbuf[2,ty])

                     ...

Answer 1:


As @talonmies said, print statements inside kernels are not a good choice for debugging. If anyone has the same problem, this documentation is helpful. You should also learn pdb, especially its debugger commands such as 'p' and 'c'.



Source: https://stackoverflow.com/questions/43153221/get-wrong-result-when-caculating-on-gpu-python3-5numbacuda8-0
