I am porting some physics simulation code from C++ to CUDA.
The fundamental algorithm can be understood as: applying an operator to each element of a vector. In pseu
Something like this perhaps...
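    template <typename Op1, typename Op2> class Composition {...}
    template <typename Op1, typename Op2> Composition<Op1, Op2> compose(Op1& op1, Op2& op2) {...}
    template <typename C, typename VecType> void apply(C& c, VecType& vec) {...}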
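
To make the idea concrete, here is a minimal sketch of how the three pieces might fit together. It is only an illustration under several assumptions: `Scale` and `Shift` are hypothetical stand-in operators (the real physics operators are not shown), the element type is fixed to `double`, the `HD` macro is a helper introduced here, and `compose` and `apply` take `const` references rather than the non-const ones above so that temporaries can be passed. The operators are stored by value, which is one way to keep the composed object trivially copyable to the device. `apply` is a plain host loop here; in a CUDA port each iteration would presumably map to one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expands to CUDA qualifiers under nvcc and to nothing under a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical example operators (assumptions, not from the original code).
struct Scale {
    double factor;
    HD double operator()(double x) const { return factor * x; }
};

struct Shift {
    double offset;
    HD double operator()(double x) const { return x + offset; }
};

// Stores both operators by value so the whole object can be copied as one unit.
template <typename Op1, typename Op2>
class Composition {
public:
    HD Composition(Op1 op1, Op2 op2) : op1_(op1), op2_(op2) {}
    // Applies op1 first, then op2.
    HD double operator()(double x) const { return op2_(op1_(x)); }
private:
    Op1 op1_;
    Op2 op2_;
};

// Deduces the operator types so callers do not have to spell them out.
template <typename Op1, typename Op2>
Composition<Op1, Op2> compose(const Op1& op1, const Op2& op2) {
    return Composition<Op1, Op2>(op1, op2);
}

// Host-side loop over the elements; each iteration is independent of the others.
template <typename C, typename VecType>
void apply(const C& c, VecType& vec) {
    for (std::size_t i = 0; i < vec.size(); ++i) {
        vec[i] = c(vec[i]);
    }
}

// Small demonstration: scale by 2, then shift by 1.
std::vector<double> demo() {
    std::vector<double> v{1.0, 2.0, 3.0};
    apply(compose(Scale{2.0}, Shift{1.0}), v);
    assert(v.size() == 3);  // sanity check: apply must not resize the vector
    return v;
}
```

Calling `demo()` from `main` yields a vector whose elements are 2*x + 1 for the inputs 1, 2 and 3. Because every iteration of the loop in `apply` touches only its own element, the loop body is the part that would move into a kernel.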