Numba CUDA does not produce the correct result with += (GPU reduction needed?)
Question

I am using Numba CUDA to evaluate a function. The code simply adds all the values into a single result, but Numba CUDA gives me a different result from NumPy.

Numba code:

    import math

    def numba_example(number_of_maximum_loop,gs,ts,bs):
        from numba import cuda
        result = cuda.device_array([3,])

        @cuda.jit(device=True)
        def BesselJ0(x):
            return math.sqrt(2/math.pi/x)

        @cuda.jit
        def cuda_kernel(number_of_maximum_loop,result,gs,ts,bs):
            i = cuda.grid(1)
            if i < number_of_maximum_loop:
                result[0] +=