CUDA: Is texture memory still useful to speed up access times for compute capability 2.x and newer?

遇见更好的自我 2021-02-14 02:45

I'm writing an image processing app where I have to fetch pixel data in an uncoalesced manner.

Initially I implemented my algorithm using global memory. Later I reimpleme

1 answer
  • 2021-02-14 03:28

    Textures can indeed be useful on devices of compute capability >= 2.0.

    Textures and cudaArrays can store their data laid out along a space-filling curve, which can give a better cache hit rate thanks to better 2D spatial locality.
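    To make the locality argument concrete, here is a minimal sketch of one well-known space-filling curve, the Z-order (Morton) curve. The actual layout of a cudaArray is opaque and not documented, so this is only an illustration of the idea, not the layout the hardware uses. The names `part1By1` and `mortonIndex2D` are hypothetical helpers introduced for this example.

```cuda
#include <cstdint>

// Spread the low 16 bits of v so that a zero bit sits between
// each pair of original bits (bit i moves to bit 2*i).
__host__ __device__ inline uint32_t part1By1(uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Illustrative Z-order index: bits of x go to even positions,
// bits of y go to odd positions. Valid for x, y < 65536.
__host__ __device__ inline uint32_t mortonIndex2D(uint32_t x, uint32_t y) {
    return part1By1(x) | (part1By1(y) << 1);
}
```

    With this ordering the 2x2 block of pixels (0,0), (1,0), (0,1), (1,1) occupies indices 0 to 3, whereas in a row-major buffer (0,0) and (0,1) are a full row apart. That is why a 2D neighbourhood fetch tends to touch fewer cache lines.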

    The texture cache is separate from the other caches, so it has its own dedicated memory and bandwidth, and reading from it does not interfere with them. This can become important if there is a lot of pressure on your L1/L2 caches.

    Textures also provide built-in functionality such as interpolation, various addressing modes (clamp, wrap, mirror) and normalized addressing with floating-point coordinates. These come at no extra cost and can greatly improve performance in kernels that need such functionality.
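    As a sketch of how these features are switched on, the example below builds a 2x2 float texture and samples its centre with bilinear filtering, clamp addressing and normalized coordinates. It assumes the texture object API, which needs compute capability 3.0 or newer; devices of compute capability 2.x used the older texture reference API instead. `sampleKernel` and `sampleCenterBilinear` are hypothetical names, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Fetch one filtered sample at normalized coordinates (u, v).
__global__ void sampleKernel(cudaTextureObject_t tex, float* out, float u, float v) {
    out[0] = tex2D<float>(tex, u, v);
}

// Hypothetical host helper: returns the bilinear sample at the centre
// of a 2x2 texture holding {0, 1, 2, 3}.
float sampleCenterBilinear() {
    const int w = 2, h = 2;
    const float hostData[w * h] = {0.0f, 1.0f, 2.0f, 3.0f};

    // Copy the pixels into a cudaArray, the opaque texture-friendly layout.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &desc, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, hostData, w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;   // or Wrap / Mirror
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModeLinear;       // hardware interpolation
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 1;                    // coordinates in [0, 1)

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    float* dOut = nullptr;
    cudaMalloc(&dOut, sizeof(float));
    sampleKernel<<<1, 1>>>(tex, dOut, 0.5f, 0.5f);

    float result = 0.0f;
    cudaMemcpy(&result, dOut, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dOut);
    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    return result;
}
```

    Sampling at (0.5, 0.5) lands exactly between the four texel centres, so the filtered value is their average. Calling `sampleCenterBilinear()` from `main` is enough to try it.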

    On early CUDA architectures, textures and cudaArrays could not be written by a kernel. On architectures of compute capability >= 2.0, they can be written via CUDA surfaces.
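    A minimal sketch of a kernel writing into a cudaArray through a surface follows. It assumes the surface object API (compute capability 3.0 or newer); on 2.x devices the same thing was done with surface references. `fillKernel` and `surfaceWriteRoundTrip` are hypothetical names, and error checking is omitted.

```cuda
#include <cuda_runtime.h>

// Each thread writes its linear pixel index into the surface.
__global__ void fillKernel(cudaSurfaceObject_t surf, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) {
        // Unlike texture fetches, the x coordinate of a surface access is in bytes.
        surf2Dwrite(static_cast<float>(y * w + x), surf,
                    x * static_cast<int>(sizeof(float)), y);
    }
}

// Hypothetical host helper: writes via a surface, reads the array back,
// and reports whether every pixel holds the expected value.
bool surfaceWriteRoundTrip() {
    const int w = 8, h = 4;

    // The array must be created with the surface load/store flag.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &desc, w, h, cudaArraySurfaceLoadStore);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaSurfaceObject_t surf = 0;
    cudaCreateSurfaceObject(&surf, &res);

    dim3 block(8, 4);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    fillKernel<<<grid, block>>>(surf, w, h);

    float host[w * h];
    cudaMemcpy2DFromArray(host, w * sizeof(float), arr, 0, 0,
                          w * sizeof(float), h, cudaMemcpyDeviceToHost);

    bool ok = true;
    for (int i = 0; i < w * h; ++i) {
        if (host[i] != static_cast<float>(i)) ok = false;
    }

    cudaDestroySurfaceObject(surf);
    cudaFreeArray(arr);
    return ok;
}
```

    The same cudaArray can also be bound to a texture, so one kernel can write it through a surface and a later kernel can read it with filtering.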

    Determining if you should use textures or a regular buffer in global memory comes down to the intended usage and access patterns for the memory. It will be project specific.

    You are using the Fermi architecture, with a device that has been rebranded into the 6xx series.

    For those on the Kepler architecture, take a look at NVIDIA's Inside Kepler presentation. In particular, see the slides Texture Performance, Texture Cache Unlocked, and const __restrict Example.
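    The slides themselves are not reproduced here. As a rough sketch of the idea behind the const __restrict example: on devices of compute capability 3.5 and newer, global loads can go through the read-only data cache (the texture path) without setting up a texture, either when the compiler can prove the data is read-only via `const` and `__restrict__`, or explicitly with `__ldg()`. `gatherKernel` and `gatherDemo` are hypothetical names, and the index pattern is just an assumed stand-in for an uncoalesced access.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Gather with a scattered index pattern. The qualifiers tell the compiler
// that `in` and `idx` are read-only and do not alias `out`.
__global__ void gatherKernel(const float* __restrict__ in,
                             const int* __restrict__ idx,
                             float* __restrict__ out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
        out[i] = __ldg(&in[idx[i]]);   // explicit read-only cache load
#else
        out[i] = in[idx[i]];           // plain global load on older devices
#endif
    }
}

// Hypothetical host helper: runs the gather and verifies the result.
bool gatherDemo() {
    const int n = 256;
    std::vector<float> hIn(n), hOut(n, -1.0f);
    std::vector<int> hIdx(n);
    for (int i = 0; i < n; ++i) {
        hIn[i] = static_cast<float>(i);
        hIdx[i] = (i * 37) % n;        // assumed scattered access pattern
    }

    float *dIn = nullptr, *dOut = nullptr;
    int* dIdx = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMalloc(&dIdx, n * sizeof(int));
    cudaMemcpy(dIn, hIn.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dIdx, hIdx.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    gatherKernel<<<(n + 127) / 128, 128>>>(dIn, dIdx, dOut, n);
    cudaMemcpy(hOut.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (hOut[i] != hIn[hIdx[i]]) ok = false;
    }

    cudaFree(dIn);
    cudaFree(dOut);
    cudaFree(dIdx);
    return ok;
}
```

    Whether this beats a real texture for a given access pattern still has to be measured; it gives the separate cache path but none of the filtering, addressing modes or 2D-friendly layout.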
