I am confused about the difference between the intended use of device pointers and cudaArray structures. Could someone please explain why I would use one versus the other?
cudaArray is an opaque block of memory that is optimized for binding to textures. Textures can use memory stored in a space-filling curve, which allows for a better texture cache hit rate due to better 2D spatial locality. Copying data to a cudaArray will cause it to be formatted to such a curve.
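To make the locality argument concrete, here is a minimal sketch that compares a row-major index with a Z-order (Morton) index. This is only an illustration: the real layout of a cudaArray is opaque and undocumented, so Z-order is an assumption standing in for "some space-filling curve", and the names spreadBits and mortonIndex are hypothetical helpers, not CUDA API.

```cuda
#include <cstdio>

// Hypothetical helper: spread the low 16 bits of v so that bit i moves to bit 2*i.
__host__ __device__ inline unsigned int spreadBits(unsigned int v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Hypothetical helper: interleave the bits of x and y into one Z-order index.
// Assumption: this only illustrates a space-filling curve; it is not
// claimed to be the layout the hardware actually uses.
__host__ __device__ inline unsigned int mortonIndex(unsigned int x, unsigned int y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}

int main() {
    const unsigned int width = 1024;  // assumed image width for the row-major case

    // Two vertically adjacent texels.
    const unsigned int x = 4, y0 = 4, y1 = 5;

    unsigned int rowMajor0 = y0 * width + x;
    unsigned int rowMajor1 = y1 * width + x;
    unsigned int morton0 = mortonIndex(x, y0);
    unsigned int morton1 = mortonIndex(x, y1);

    // In row-major order vertical neighbours are a full row apart;
    // on the curve they are usually much closer together.
    printf("row-major distance: %u\n", rowMajor1 - rowMajor0);
    printf("Z-order distance:   %u\n", morton1 - morton0);
    return 0;
}
```

Not every neighbouring pair ends up close on the curve (pairs that straddle a large power-of-two boundary can be far apart), but on average a 2D neighbourhood maps to far fewer cache lines than it does in row-major order.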
So, storing data in a cudaArray is an optimization technique that can yield better texture cache hit rates. On early CUDA architectures, a cudaArray also cannot be accessed directly by a kernel; it can only be read through texture fetches. However, devices of compute capability >= 2.0 can also read and write the array via CUDA surfaces.
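As a rough sketch of what this looks like in practice, the program below copies an image into a cudaArray and reads it back in a kernel through a texture. Note the assumptions: it uses the texture object API (cudaCreateTextureObject), which needs compute capability >= 3.0 and is not what this thread originally discussed; the older texture reference calls such as cudaBindTextureToArray were removed in CUDA 12. The kernel name, the CHECK macro, and the image size are made up for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Illustrative error-checking macro (not part of CUDA).
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s\n", cudaGetErrorString(err_));       \
            exit(1);                                                 \
        }                                                            \
    } while (0)

// Reads each texel through the texture and writes it to a plain device pointer.
__global__ void copyFromTexture(cudaTextureObject_t tex, float* out, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) {
        // +0.5f addresses the texel centre with unnormalized coordinates.
        out[y * w + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f);
    }
}

int main() {
    const int w = 64, h = 64;  // assumed image size
    std::vector<float> host(w * h);
    for (int i = 0; i < w * h; ++i) host[i] = static_cast<float>(i);

    // Allocate the opaque array and copy into it; the runtime chooses the layout.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    CHECK(cudaMallocArray(&arr, &desc, w, h));
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, host.data(), w * sizeof(float),
                              w * sizeof(float), h, cudaMemcpyHostToDevice));

    // Describe the array as a texture resource with point sampling.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &res, &td, nullptr));

    // The output is a regular global-memory buffer (a device pointer).
    float* dOut = nullptr;
    CHECK(cudaMalloc(&dOut, w * h * sizeof(float)));
    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    copyFromTexture<<<grid, block>>>(tex, dOut, w, h);
    CHECK(cudaGetLastError());

    std::vector<float> result(w * h);
    CHECK(cudaMemcpy(result.data(), dOut, w * h * sizeof(float), cudaMemcpyDeviceToHost));
    printf("texel (3,2) read through the texture: %f\n", result[2 * w + 3]);

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    CHECK(cudaFree(dOut));
    return 0;
}
```

The kernel never sees a pointer into the cudaArray; it only sees the texture handle, which is what "opaque" means here. The output buffer, by contrast, is ordinary global memory indexed by hand.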
Determining if you should use a cudaArray or a regular buffer in global memory comes down to the intended usage and access patterns for the memory. It will be project-specific.
cudaMallocArray() actually allocates a 2D array, so I think the issue is just inconsistent naming. Maybe it would have been more logical to call it cudaMallocArray2D().
I haven't used 3D textures. Hopefully, someone will answer and let us know why there's no need for cudaBindTexture3D().
You can use cudaBindTextureToArray(); it works for both 2D and 3D arrays.