Varying results from cuBLAS

温柔的废话 2021-01-24 18:56

I have implemented the following CUDA code, but I am a little confused by its behavior.

#include 
#include 
#include 

        
1 answer
  •  粉色の甜心
    2021-01-24 19:16

    Regarding the second part of the question: following njuffa's remark, you can change the driver settings to avoid the error when increasing the size. Open Nsight Monitor and, under Options > General > Microsoft Display Driver, set the "WDDM TDR enabled" field to False.

    According to the spec, the 32-bit FPU throughput should be around 2.4 TFLOPS in single precision, so your operation on a matrix of size 16000 should take at least roughly 3.5 seconds. That is longer than the 2-second limit, hence the driver recovery.
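
    The estimate above can be reproduced with a short calculation. This is only a minimal sketch, assuming the operation is a dense product of two square n x n matrices (about 2n^3 floating-point operations) and that the GPU sustains its full 2.4 TFLOPS peak, which real code rarely does. The function names are hypothetical and do not come from the original code.

```cpp
#include <cassert>

// Approximate floating-point operation count for C = A * B with square
// n x n matrices: each of the n*n outputs needs about n multiplies and n adds.
double gemm_flops(double n) {
    return 2.0 * n * n * n;
}

// Lower bound on the run time in seconds, assuming the device runs at its
// peak throughput (given in FLOP/s) for the whole operation.
double min_seconds(double n, double peak_flops_per_s) {
    assert(peak_flops_per_s > 0.0);
    return gemm_flops(n) / peak_flops_per_s;
}

// Hypothetical helper: true if even the best-case run time is longer than
// the display driver's watchdog (TDR) limit, given in seconds.
bool exceeds_tdr(double n, double peak_flops_per_s, double tdr_seconds) {
    return min_seconds(n, peak_flops_per_s) > tdr_seconds;
}
```

    With n = 16000 and 2.4e12 FLOP/s, min_seconds gives roughly 3.4 seconds, so exceeds_tdr(16000, 2.4e12, 2.0) is true: even in the best case the work cannot finish within the 2-second limit.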
