Question
Here is a demo.cu that aims to printf from the GPU device:
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
__global__ void hello_cuda() {
    printf("hello from GPU\n");
}
int main() {
    printf("hello from CPU\n");
    hello_cuda<<<1, 1>>>();
    cudaDeviceSynchronize();
    cudaDeviceReset();
    printf("bye bye from CPU\n");
    return 0;
}
It compiles and runs:
$ nvcc demo.cu
$ ./a.out
This is the output I get:
hello from CPU
bye bye from CPU
Q: Why is there no output from the GPU?
It seems as if I misconfigured the CUDA toolkit or something; however, I'm able to compile and run various programs from the cuda-samples, for example matrixMul or deviceQuery.
Answer 1:
If your device is of compute capability 3.0 or below, CUDA 11 has dropped support for it, and you'll need to use an earlier CUDA version.
The CUDA compiler must compile for a GPU target (i.e. a device architecture). If you don't specify a target architecture on the compile command line, historically, CUDA has chosen a very flexible default architecture specification that can run on all GPUs that the CUDA version supports.
That isn't always the case, however, and it's not the case with CUDA 11. CUDA 11 compiles for a default architecture of sm_52 (compute capability 5.2, i.e. as if you had specified -arch=sm_52 on the command line). But CUDA 11 supports architectures down to sm_35 (compute capability 3.5).
Therefore, if you don't specify the target architecture on the compile command line with CUDA 11 and attempt to run on a GPU with an architecture that predates sm_52, any CUDA code (kernels) that you have written definitely won't work.
It's good practice, any time you are having trouble with a CUDA code, to use proper CUDA error checking, and if you had done that here you would have gotten a runtime error indication that would have immediately identified the issue (at least for someone who is familiar with CUDA errors).
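As a rough illustration of what such error checking could look like for the program in the question, here is a minimal sketch. The helper `cuda_ok` is a hypothetical name introduced for this example, not part of the CUDA API; `cudaGetLastError`, `cudaDeviceSynchronize` and `cudaGetErrorString` are the real runtime calls. When the binary contains no code usable on the installed GPU, the launch check is where an error such as "no kernel image is available for execution on the device" would normally show up, though the exact message can vary by CUDA version.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): print a message for a failed
// runtime call and return whether the call succeeded.
static bool cuda_ok(cudaError_t err, const char* what) {
    if (err == cudaSuccess) return true;
    fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    return false;
}

__global__ void hello_cuda() {
    printf("hello from GPU\n");
}

int main() {
    printf("hello from CPU\n");
    hello_cuda<<<1, 1>>>();
    // A kernel launch has no return value, so ask for the launch status.
    if (!cuda_ok(cudaGetLastError(), "kernel launch")) return 1;
    // Errors raised while the kernel runs are reported at the next sync.
    if (!cuda_ok(cudaDeviceSynchronize(), "cudaDeviceSynchronize")) return 1;
    printf("bye bye from CPU\n");
    return 0;
}
```

With checks like these in place, the silent "no GPU output" symptom turns into an explicit error line on stderr and a non-zero exit code.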
The solution in these cases is to specify a compilation command that includes the GPU you intend to run on (this is usually good practice anyway). If you do that, and the architecture you specify is "deprecated", then the nvcc compiler will issue a warning letting you know that a future CUDA version may not support the GPU you are trying to run on. The warning does not mean anything you are doing is wrong or illegal or needs to be changed; it only means that a future CUDA version may drop support for that GPU.
If you want to suppress that warning, you can pass the -Wno-deprecated-gpu-targets switch on the compile command line.
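If you are unsure which architecture to pass, one option is to ask the runtime for the compute capability of each installed GPU (the deviceQuery sample reports the same information). The sketch below is only an illustration: `arch_flag` is a hypothetical helper made up for this example, while `cudaGetDeviceCount` and `cudaGetDeviceProperties` are real runtime API calls. Because it launches no kernels, it should run regardless of the architecture it was compiled for.

```cuda
#include <cstdio>
#include <string>
#include <cuda_runtime.h>

// Hypothetical helper: turn a compute capability such as 3.5 into the
// matching nvcc switch, here "-arch=sm_35".
std::string arch_flag(int major, int minor) {
    return "-arch=sm_" + std::to_string(major) + std::to_string(minor);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                    cudaGetErrorString(err));
            return 1;
        }
        // Print the device name, its compute capability and a matching flag.
        printf("device %d: %s, compute capability %d.%d, flag: %s\n",
               dev, prop.name, prop.major, prop.minor,
               arch_flag(prop.major, prop.minor).c_str());
    }
    return 0;
}
```

For a compute capability 3.5 device, for instance, the printed flag would be -arch=sm_35, and the program from the question could then be built with `nvcc -arch=sm_35 demo.cu`.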
The same problem can occur on Windows, of course. In that case, you'll need to modify the Visual Studio project setting for CUDA code generation to match the architecture of your device.
Source: https://stackoverflow.com/questions/63675040/cuda-11-kernel-doesnt-run