Profiling arbitrary CUDA applications
问题 I know of the existence of nvvp and nvprof , of course, but for various reasons nvprof does not want to work with my app that involves lots of shared libraries. nvidia-smi can hook into the driver to find out what's running, but I cannot find a nice way to get nvprof to attach to a running process. There is a flag --profile-all-processes which does actually give me a message "NVPROF is profiling process 12345", but nothing further prints out. I am using CUDA 8. How can I get a detailed