I am having trouble fetching a texture of floats. The texture is defined as follows:
texture<float, 2, cudaReadModeElementType> cornerTexture;
The binding and parameter settings are:
cornerTexture.addressMode[0] = cudaAddressModeClamp;
cornerTexture.addressMode[1] = cudaAddressModeClamp;
cornerTexture.filterMode = cudaFilterModePoint;
cornerTexture.normalized = false;

cudaChannelFormatDesc cornerDescription = cudaCreateChannelDesc<float>();
cudaBindTexture2D(0, &cornerTexture, cornerImage->imageData_device, &cornerDescription,
                  cornerImage->width, cornerImage->height, cornerImage->widthStep);
height and width are the sizes of the two dimensions in numbers of elements; widthStep is in bytes.
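In case the allocation matters: imageData_device is plain linear device memory with a hand-computed pitch, roughly along these lines (the names and sizes here are placeholders standing in for my actual image code):

// Hypothetical stripped-down allocation mirroring my setup.
const int width  = 64;                           // elements per row
const int height = 64;                           // number of rows
const size_t widthStep = width * sizeof(float);  // row pitch in bytes

float* imageData_device = 0;
cudaMalloc((void**)&imageData_device, height * widthStep);

In-kernel access occurs as follows: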
thisValue = tex2D(cornerTexture, thisPixel.x, thisPixel.y);
printf("thisPixel.x: %i thisPixel.y: %i thisValue: %f\n", thisPixel.x, thisPixel.y, thisValue);
thisValue should always be a non-negative float, but printf() is giving me strange, useless values that differ from what the linear memory actually stores. I have tried offsetting the access with a 0.5f on both coordinates, but it gives the same wrong results.
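Concretely, the offset variant I tried inside the kernel was:

thisValue = tex2D(cornerTexture, thisPixel.x + 0.5f, thisPixel.y + 0.5f);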
Any ideas?
Update: There seems to be a hidden alignment requirement. From what I can deduce, the pitch passed to cudaBindTexture2D needs to be a multiple of 32 bytes. For example, the following gives incorrect results when fetching the texture:
cudaBindTexture2D(0, &debugTexture, deviceFloats, &debugDescription, 10, 32, 40)   // width 10, height 32, pitch 40 bytes (not a multiple of 32)
But the following (the same array with its width and height switched) works well:
cudaBindTexture2D(0, &debugTexture, deviceFloats, &debugDescription, 32, 10, 128)   // width 32, height 10, pitch 128 bytes (a multiple of 32)
I'm not sure whether I'm missing something or there really is a constraint on the pitch.
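If there really is such a constraint, the safer approach seems to be letting the runtime choose the pitch rather than computing it by hand. Here is a sketch of what I mean; texturePitchAlignment is the device property I suspect is relevant, though I haven't confirmed it is what tripped me up:

#include <cstdio>

int main()
{
    // Report the pitch alignment this device requires for textures
    // bound to pitched linear memory.
    cudaDeviceProp properties;
    cudaGetDeviceProperties(&properties, 0);
    printf("texturePitchAlignment: %lu bytes\n",
           (unsigned long)properties.texturePitchAlignment);

    // Let the runtime pick an aligned pitch for the failing case above
    // (10 floats per row, 32 rows) instead of hand-computing 40 bytes.
    float* deviceFloats = 0;
    size_t pitch = 0;  // in bytes, chosen by the runtime
    cudaMallocPitch((void**)&deviceFloats, &pitch, 10 * sizeof(float), 32);
    printf("cudaMallocPitch chose a pitch of %lu bytes\n",
           (unsigned long)pitch);

    cudaFree(deviceFloats);
    return 0;
}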
Update 2: I have filed a bug report with NVIDIA. Those who are interested can view it in their Developer Zone, and I will post the reply back here.