I'm currently studying CUDA programming and have been implementing histogram kernel functions. These kernels feature atomic operations such that each thread is able to incr