PyCUDA: Pow within device code tries to use std::pow, fails

我在风中等你 2021-01-18 19:15

Question more or less says it all.

calling a host function("std::pow ") from a __device__/__global__ function("_calc_psd") is not allowed


        
1 Answer
  • 2021-01-18 20:00

    The error is exactly what the compiler reports. You can't use host functions in device code, and that includes the whole host C++ std library. CUDA includes its own standard library, described in the programming guide, but you should call either pow or powf (the C standard library names, with no C++ or namespaces). nvcc will resolve the call to the correct CUDA device function and inline the resulting code. Something like the following will work:

    #include <math.h>
    
    __device__ float func(float x) {
       // powf is the single-precision C function; no std:: prefix
       return x * x * powf(x, 0.123456f);
    }
    

    EDIT: What I missed the first time is the template specifier reported in the error. Are you sure that you are passing either float or double arguments to pow? If you are passing integers, there is no matching overload in the CUDA standard library, which may be why it is failing. If you need an integer pow function, you will have to roll your own (or cast the arguments, but pow is a rather expensive function, and I am certain a few cascaded integer multiplications will be faster).
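
To illustrate the "roll your own" suggestion, here is a minimal sketch of an integer power function built from cascaded multiplications (exponentiation by squaring). The names `ipow` and `HD` are hypothetical and do not come from CUDA or PyCUDA; the macro only lets the same source compile under nvcc or as plain C++. It assumes a non-negative exponent and does no overflow checking.

```cpp
// HD is a hypothetical convenience macro: under nvcc it marks the function
// as callable from both host and device code; elsewhere it expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: integer power by repeated squaring.
// Assumes exp >= 0 (hence unsigned) and that the result fits in an int.
HD inline int ipow(int base, unsigned int exp) {
    int result = 1;
    while (exp != 0) {
        if (exp & 1u) {
            result *= base;      // fold in the current bit of the exponent
        }
        exp >>= 1;
        if (exp != 0) {
            base *= base;        // square only if more bits remain
        }
    }
    return result;
}
```

Inside a kernel this would be called like any other device function, for example `int y = ipow(n, 3);`. For a small fixed exponent, writing the product out directly (`n * n * n`) is simpler still.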
