cuda-gdb Error message

Posted by 坚强是说给别人听的谎言 on 2019-12-20 06:01:17

Question


I tried to debug my CUDA application with cuda-gdb but got a strange error.

I built my application with the options -g -G -O0. The program runs without cuda-gdb but does not produce the correct result, so I decided to debug it with cuda-gdb. However, I got the following error message when running the program under cuda-gdb:

Error: Failed to read the valid warps mask (dev=1, sm=0, error=16).

What does it mean? Why sm=0, and what is the meaning of error=16?

Update 1: I also tried cuda-gdb on the CUDA samples, but it fails with the same error. I just installed the CUDA 6.0 Toolkit following NVIDIA's instructions. Is this a problem with my system?

Update 2:

  • OS - CentOS 6.5
  • GPU
    • 1 Quadro 400
    • 2 Tesla C2070
    • I'm using only 1 GPU for my program, but I get the same error message whichever GPU I select
  • CUDA version - 6.0
  • GPU Driver
    • NVRM version: NVIDIA UNIX x86_64 Kernel Module 331.62 Wed Mar 19 18:20:03 PDT 2014
    • GCC version: gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)

Update 3: I tried to get more information in cuda-gdb, but got the following results:

(cuda-gdb) info cuda devices
Error: Failed to read the valid warps mask (dev=1, sm=0, error=16).
(cuda-gdb) info cuda sms
Focus not set on any active CUDA kernel.
(cuda-gdb) info cuda lanes
Focus not set on any active CUDA kernel.
(cuda-gdb) info cuda kernels
No CUDA kernels.
(cuda-gdb) info cuda contexts
No CUDA contexts.


Answer 1:


This issue is specific to some older NVIDIA GPUs (such as the "Quadro 400", "GeForce GT220", or "GeForce GT 330M").

On Liam Kim's setup, cuda-gdb should work if you set the environment variable CUDA_VISIBLE_DEVICES so that cuda-gdb runs only on the Tesla C2070 GPUs, e.g. $ export CUDA_VISIBLE_DEVICES=0 (or 2). The exact CUDA device indices can be found by running the CUDA sample "deviceQuery".
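
The workaround above could be wrapped in a small shell helper, sketched here under a few assumptions: the function name with_cuda_devices and the program name ./my_app are hypothetical, and the indices 0 and 2 are taken from this answer, so they should be confirmed with deviceQuery on your own machine. CUDA_VISIBLE_DEVICES accepts a comma-separated list of device indices.

```shell
# Hypothetical helper: run one command with only the listed CUDA devices
# visible to the CUDA runtime.
with_cuda_devices() {
    # First argument: comma-separated device indices, e.g. "0" or "0,2".
    visible="$1"
    shift
    # The assignment prefix limits the variable to this one command.
    CUDA_VISIBLE_DEVICES="$visible" "$@"
}

# Example (not executed here): debug on the first Tesla C2070 only.
# with_cuda_devices 0 cuda-gdb ./my_app
```

Scoping the variable to a single command leaves the rest of the shell session unchanged, so other programs still see every GPU.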

This issue has now been fixed; the fix will be available to CUDA developers in the next CUDA release (expected around early July 2014).




Answer 2:


This is an internal cuda-gdb bug. You should report it.

Can you try installing the CUDA toolkit from the package on the NVIDIA site?



Source: https://stackoverflow.com/questions/23997350/cuda-gdb-error-message
