How to use C++ templates in OpenCL kernels?

猫巷女王i 2020-12-24 08:53

I'm a novice in OpenCL.

I have an algorithm which uses templates. It worked well with OpenMP parallelization but now the amount of data has grown and the only way t

6 answers
  •  醉梦人生
    2020-12-24 09:20

    There is an old way to emulate templates in plain C. It is based on including a single file several times (without an include guard). Since OpenCL has a fully functional preprocessor and allows including files, this trick can be used there too.

    Here is a good explanation: http://arnold.uthar.net/index.php?n=Work.TemplatesC

    It is still much messier than C++ templates: the code has to be split into several parts, and you have to explicitly instantiate each instance of the template. Also, it seems that you cannot do some useful things, such as implementing factorial as a recursive template.

    Code example

    Let's apply the idea to OpenCL. Suppose that we want to calculate the inverse square root by Newton-Raphson iteration (generally not a good idea), and that both the floating-point type and the number of iterations may vary.

    First of all, we need a helper header ("templates.h"):

    #ifndef TEMPLATES_H_
    #define TEMPLATES_H_
    
    #define CAT(X,Y,Z) X##_##Y##_##Z   //concatenate words
    #define TEMPLATE(X,Y,Z) CAT(X,Y,Z)
    
    #endif
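
The two-level macro matters because operands of `##` are not macro-expanded before pasting. The following host-side C sketch illustrates the difference. The `STR`/`STR2` helpers, the two string variables, and `check_names` are hypothetical names introduced only for this illustration, and it assumes the OpenCL preprocessor follows the same C99 expansion rules.

```c
#include <assert.h>
#include <string.h>

#define CAT(X,Y,Z) X##_##Y##_##Z      /* pastes its arguments as written */
#define TEMPLATE(X,Y,Z) CAT(X,Y,Z)    /* expands its arguments first, then pastes */

/* Hypothetical helpers: turn the resulting identifier into a string. */
#define STR2(x) #x
#define STR(x) STR2(x)

#define real float
#define iters 2

/* Through TEMPLATE, real and iters are replaced before concatenation. */
const char expanded_name[] = STR(TEMPLATE(NewtonRaphsonRsqrt, real, iters));

/* Through CAT directly, the parameter names themselves are pasted. */
const char unexpanded_name[] = STR(CAT(NewtonRaphsonRsqrt, real, iters));

#undef real
#undef iters

/* Sanity check of both expansions. */
void check_names(void) {
    assert(strcmp(expanded_name, "NewtonRaphsonRsqrt_float_2") == 0);
    assert(strcmp(unexpanded_name, "NewtonRaphsonRsqrt_real_iters") == 0);
}
```

Calling `CAT` directly would therefore give every instantiation the same name, which is why the template function is declared through `TEMPLATE`.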
    

    Then, we write the template function in "NewtonRaphsonRsqrt.cl":

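    #include "templates.h"
    
    real TEMPLATE(NewtonRaphsonRsqrt, real, iters) (real x, real a) {
        int i;
        for (i = 0; i < iters; i++) {
            x = x * ((real)1.5 - (real)0.5 * a * x * x);  //one Newton-Raphson step toward 1/sqrt(a)
        }
        return x;
    }
    
    #undef real    //allow the next instantiation to redefine the parameters
    #undef iters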

    In your main .cl file, instantiate this template as follows:

    #define real float
    #define iters 2
    #include "NewtonRaphsonRsqrt.cl"  //defining NewtonRaphsonRsqrt_float_2
    
    #define real double
    #define iters 3
    #include "NewtonRaphsonRsqrt.cl"  //defining NewtonRaphsonRsqrt_double_3
    
    #define real double
    #define iters 4
    #include "NewtonRaphsonRsqrt.cl"  //defining NewtonRaphsonRsqrt_double_4
    

    You can then use it like this:

    double prec = TEMPLATE(NewtonRaphsonRsqrt, double, 4) (1.5, 0.5);
    float approx = TEMPLATE(NewtonRaphsonRsqrt, float, 2) (1.5, 0.5);
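
For quick experiments on the host, a single-file variant of the same idea may be convenient. The sketch below swaps the repeated `#include` for a function-generating macro, which is a different technique from the one in this answer. `DEFINE_NEWTON_RAPHSON_RSQRT` and `check_rsqrt` are hypothetical names, and the update `x = x * (1.5 - 0.5 * a * x * x)` is assumed to be the intended Newton-Raphson step for 1/sqrt(a), with `x` as the initial guess.

```c
#include <assert.h>

#define CAT(X,Y,Z) X##_##Y##_##Z
#define TEMPLATE(X,Y,Z) CAT(X,Y,Z)

/* Hypothetical macro (not part of the answer): stamps out one instance of
   the function for a given floating-point type and iteration count. */
#define DEFINE_NEWTON_RAPHSON_RSQRT(real, iters)                      \
    real TEMPLATE(NewtonRaphsonRsqrt, real, iters)(real x, real a) {  \
        int i;                                                        \
        for (i = 0; i < (iters); i++) {                               \
            x = x * ((real)1.5 - (real)0.5 * a * x * x);              \
        }                                                             \
        return x;                                                     \
    }

DEFINE_NEWTON_RAPHSON_RSQRT(float, 2)   /* defines NewtonRaphsonRsqrt_float_2 */
DEFINE_NEWTON_RAPHSON_RSQRT(double, 4)  /* defines NewtonRaphsonRsqrt_double_4 */

/* Self-check: if r is close to 1/sqrt(a), then r * r * a is close to 1. */
void check_rsqrt(void) {
    double prec = TEMPLATE(NewtonRaphsonRsqrt, double, 4)(1.5, 0.5);
    float approx = TEMPLATE(NewtonRaphsonRsqrt, float, 2)(1.5f, 0.5f);
    double e1 = prec * prec * 0.5 - 1.0;
    double e2 = (double)approx * approx * 0.5 - 1.0;
    assert(e1 < 1e-9 && e1 > -1e-9);   /* four iterations, double precision */
    assert(e2 < 1e-3 && e2 > -1e-3);   /* two iterations, single precision */
}
```

The macro form keeps everything in one file but makes the function body harder to read and debug, since every line needs a trailing backslash. The include-based form avoids that at the cost of an extra file per template.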
    
