Question
I realized that "cuPrintf" can be called inside a kernel, but "cudaPrintfDisplay" can only be called from host (CPU) code. This suggests to me that "cuPrintf" output can only be flushed to stdout after the kernel returns. My question is: can we get print-outs in real time while the kernel is running?
Answer 1:
As you have noticed, cuPrintf() (and printf() on devices of compute capability >= 2.0) simply adds strings to a buffer while the kernel is running, and the buffer is printed out after the kernel ends.
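To illustrate the buffering described above, here is a minimal sketch using the built-in device-side printf(). The kernel name hello_kernel, the launch configuration, and the 4 MB buffer size are arbitrary choices made for this example. Nothing appears on stdout until the host reaches a synchronization point, here cudaDeviceSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread appends one line to the device-side printf buffer.
__global__ void hello_kernel()
{
    printf("hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main()
{
    // Optional: enlarge the printf buffer before launching any kernel
    // that prints. The 4 MB value is an arbitrary choice for this sketch.
    cudaDeviceSetLimit(cudaLimitPrintfFifoSize, 4 * 1024 * 1024);

    // The launch is asynchronous and returns to the host immediately.
    hello_kernel<<<2, 4>>>();

    // The buffered output is flushed to stdout at this synchronization
    // point, not while the kernel is still running.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If the kernel prints more than the buffer can hold, older output is overwritten, so a larger buffer may be needed for verbose kernels.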
I don't think there is a way to get real-time printf output from a kernel. To reduce the delay, though, you may be able to run the kernel with fewer threads per launch. Since __device__ printf() is only a diagnostic or debugging tool, any loss in performance shouldn't matter.
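One way to apply the "fewer threads each time" suggestion is to split the work into several smaller launches and synchronize after each one, so that each chunk's output is flushed before the next chunk starts. The following is only a sketch under assumptions: work_chunk, the problem size, and the chunk size are hypothetical and stand in for your real kernel and data.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: handles elements [offset, offset + count).
__global__ void work_chunk(int offset, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= count) return;

    // Real per-element work would go here.

    // Print once per chunk to keep the output small.
    if (i == 0) {
        printf("chunk starting at element %d is running\n", offset);
    }
}

int main()
{
    const int total   = 1 << 20;  // total number of elements (assumed)
    const int chunk   = 1 << 18;  // elements per launch (assumed)
    const int threads = 256;      // threads per block

    for (int offset = 0; offset < total; offset += chunk) {
        int count  = (total - offset < chunk) ? (total - offset) : chunk;
        int blocks = (count + threads - 1) / threads;

        work_chunk<<<blocks, threads>>>(offset, count);

        // Synchronizing after every launch flushes that chunk's printf
        // output before the next chunk starts, at some cost in speed.
        cudaError_t err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            return 1;
        }
    }
    return 0;
}
```

The output is still delayed until each launch finishes, but the delay is bounded by the duration of one chunk rather than the whole computation.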
Maybe the best thing would be to run your code in a CUDA debugger and get immediate feedback that way.
Source: https://stackoverflow.com/questions/12589380/can-we-get-the-on-time-print-out-during-the-kernel-running