I'm calling the kernel with a block dimension of dim3(PIXEL_SIZE, PIXEL_SIZE, CHANNEL), which means each block has PIXEL_SIZE * PIXEL_SIZE * CHANNEL threads:
dim3(PIXEL_SIZE, PIXEL_SIZE, CHANNEL)
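To make the thread count concrete, here is a minimal sketch of such a launch. The values of PIXEL_SIZE and CHANNEL, the kernel indexKernel, and the single-block grid are all assumptions for illustration, since the post does not show them. The sketch also checks the product against the device's per-block limit, because a launch whose block exceeds that limit fails rather than running with fewer threads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed values for illustration; the original post does not state them.
constexpr int PIXEL_SIZE = 8;
constexpr int CHANNEL = 3;

// Hypothetical kernel: each thread writes its flat index within the block.
__global__ void indexKernel(int *out)
{
    int tid = threadIdx.x
            + threadIdx.y * blockDim.x
            + threadIdx.z * blockDim.x * blockDim.y;
    out[tid] = tid;
}

int main()
{
    dim3 block(PIXEL_SIZE, PIXEL_SIZE, CHANNEL);
    const unsigned int threadsPerBlock = block.x * block.y * block.z;

    // Query the device limits instead of hard-coding them.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("threads per block: %u (device limit: %d, max z dimension: %d)\n",
                threadsPerBlock, prop.maxThreadsPerBlock, prop.maxThreadsDim[2]);

    if (threadsPerBlock > static_cast<unsigned int>(prop.maxThreadsPerBlock) ||
        block.z > static_cast<unsigned int>(prop.maxThreadsDim[2])) {
        std::printf("block is too large for this device; the launch would fail\n");
        return 1;
    }

    int *d_out = nullptr;
    err = cudaMalloc(&d_out, threadsPerBlock * sizeof(int));
    if (err != cudaSuccess) {
        std::printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // One block (assumed grid size) with PIXEL_SIZE * PIXEL_SIZE * CHANNEL threads.
    indexKernel<<<1, block>>>(d_out);

    // Launch errors are reported asynchronously, so check explicitly.
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        std::printf("kernel failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

With these assumed values the block holds 8 * 8 * 3 = 192 threads. Larger values of PIXEL_SIZE can quickly exceed the per-block limit, in which case the pixels would have to be spread across several blocks in the grid instead.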