How to use C++ templates in OpenCL kernels?

猫巷女王i 2020-12-24 08:53

I'm a novice in OpenCL.

I have an algorithm which uses templates. It worked well with OpenMP parallelization but now the amount of data has grown and the only way t

6 answers

醉梦人生 (original poster)

2020-12-24 09:20
There is an old way to emulate templates in plain C. It is based on including a single file several times (without an include guard). Since OpenCL has a fully functional preprocessor and allows including files, this trick can be used there as well.

Here is a good explanation: http://arnold.uthar.net/index.php?n=Work.TemplatesC

It is still much messier than C++ templates: the code has to be split into several parts, and you have to explicitly instantiate each instance of the template. Also, it seems that you cannot do some useful things, such as implementing factorial as a recursive template.

Code example

Let's apply the idea to OpenCL. Suppose we want to calculate the inverse square root by Newton-Raphson iteration (generally not a good idea), where both the floating-point type and the number of iterations may vary.
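For reference, the update that such an iteration performs can be derived by applying Newton's method to f(x) = x^(-2) - a, whose positive root is 1/sqrt(a). This is the textbook derivation rather than something spelled out in the answer; x denotes the current estimate and a the input value:

```latex
\begin{aligned}
f(x) &= x^{-2} - a, \qquad f'(x) = -2x^{-3} \\
x_{n+1} &= x_n - \frac{f(x_n)}{f'(x_n)}
         = x_n + \frac{x_n - a x_n^{3}}{2}
         = x_n \left( \tfrac{3}{2} - \tfrac{1}{2}\, a\, x_n^{2} \right)
\end{aligned}
```

Each step needs only multiplications and one subtraction, which is why the number of iterations is a natural compile-time parameter.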

First of all, we need a helper header ("templates.h"):
```
#ifndef TEMPLATES_H_
#define TEMPLATES_H_

#define CAT(X,Y,Z) X##_##Y##_##Z   //concatenate words
#define TEMPLATE(X,Y,Z) CAT(X,Y,Z)

#endif
```
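The two macro levels matter: `##` pastes its operands without expanding them, so the extra `TEMPLATE` layer is what lets `real` and `iters` be replaced before the names are glued together. A minimal plain-C sketch of the difference follows; `STR`, `STR_`, `name_direct`, `name_indirect`, `check_names`, and the base name `Rsqrt` are hypothetical helpers introduced only for this illustration:

```c
#include <assert.h>
#include <string.h>

#define CAT(X,Y,Z) X##_##Y##_##Z      /* pastes its arguments exactly as written */
#define TEMPLATE(X,Y,Z) CAT(X,Y,Z)    /* expands X, Y, Z first, then pastes */

/* Hypothetical helpers: turn a generated identifier into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

#define real float
#define iters 2

/* Name produced when CAT is called directly: real and iters stay unexpanded. */
const char *name_direct(void)   { return STR(CAT(Rsqrt, real, iters)); }

/* Name produced through TEMPLATE: real and iters are replaced first. */
const char *name_indirect(void) { return STR(TEMPLATE(Rsqrt, real, iters)); }

/* Checks both spellings. */
void check_names(void) {
    assert(strcmp(name_direct(), "Rsqrt_real_iters") == 0);
    assert(strcmp(name_indirect(), "Rsqrt_float_2") == 0);
}

#undef real
#undef iters
```

Calling `CAT` directly would therefore give every instantiation the same name, `Rsqrt_real_iters`, and the second one would fail as a redefinition.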
Then we write the template function in "NewtonRaphsonRsqrt.cl":
```
#include "templates.h"

real TEMPLATE(NewtonRaphsonRsqrt, real, iters) (real x, real a) {
    int i;
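    for (i = 0; i < iters; i++)  // reconstructed from here on: the source is cut off after "i"
        x *= (real)1.5 - (real)0.5 * a * x * x;  // one Newton-Raphson step toward 1/sqrt(a)
    return x;
}
#undef real   // allow the next instantiation to redefine both macros
#undef iters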
```
In your main .cl file, instantiate this template as follows:
```
#define real float
#define iters 2
#include "NewtonRaphsonRsqrt.cl" //defining NewtonRaphsonRsqrt_float_2

#define real double
#define iters 3
#include "NewtonRaphsonRsqrt.cl" //defining NewtonRaphsonRsqrt_double_3

#define real double
#define iters 4
#include "NewtonRaphsonRsqrt.cl" //defining NewtonRaphsonRsqrt_double_4
```
You can then use it like this:
```
double prec = TEMPLATE(NewtonRaphsonRsqrt, double, 4) (1.5, 0.5);
float approx = TEMPLATE(NewtonRaphsonRsqrt, float, 2) (1.5, 0.5);
```
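To try the idea on the host without a separate template file, the same instances can be stamped out by a function-defining macro. This is a different technique from the repeated-include trick, shown here only because it fits in a single plain-C file. The macro `DEFINE_NEWTON_RAPHSON_RSQRT` and the function `check_instances` are hypothetical names, and the loop body assumes the standard Newton-Raphson step for 1/sqrt(a):

```c
#include <assert.h>

#define CAT(X,Y,Z) X##_##Y##_##Z
#define TEMPLATE(X,Y,Z) CAT(X,Y,Z)

/* Stand-in for "#define real / #define iters / #include": one call
   defines one instance, e.g. NewtonRaphsonRsqrt_float_2. */
#define DEFINE_NEWTON_RAPHSON_RSQRT(real, iters)                      \
    real TEMPLATE(NewtonRaphsonRsqrt, real, iters)(real x, real a) {  \
        for (int i = 0; i < (iters); i++)                             \
            x *= (real)1.5 - (real)0.5 * a * x * x;                   \
        return x;                                                     \
    }

DEFINE_NEWTON_RAPHSON_RSQRT(float, 2)   /* NewtonRaphsonRsqrt_float_2  */
DEFINE_NEWTON_RAPHSON_RSQRT(double, 4)  /* NewtonRaphsonRsqrt_double_4 */

/* Starts from the guess x = 1.5 and refines toward 1/sqrt(0.5). */
void check_instances(void) {
    float approx = TEMPLATE(NewtonRaphsonRsqrt, float, 2)(1.5f, 0.5f);
    double prec  = TEMPLATE(NewtonRaphsonRsqrt, double, 4)(1.5, 0.5);
    assert(approx > 1.4141f && approx < 1.4143f);      /* two steps: rough */
    assert(prec > 1.4142135623 && prec < 1.4142135624); /* four steps: tight */
}
```

The first argument is the initial guess and the second is the value whose inverse square root is wanted, matching the `(1.5, 0.5)` calls in the answer. The macro form keeps everything in one file but makes the function body harder to read and debug, which is one reason the include-based version may still be preferable for larger kernels.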