The second copy looks broken to me. You have defined this array:
__constant__ int ints[160]; // 640 bytes
which, as correctly noted, is 640 bytes long.
Your second copy is like this:
cudaMemcpyToSymbol(ints,pFlts,640,1920,cudaMemcpyDeviceToDevice); // second copy
Which says, "copy a total of 640 bytes from the pFlts array to the ints array, with the storage location in the ints array beginning 1920 bytes from the start of the array."
This won't work. The ints array is only 640 bytes long, so you can't pick as your destination a location that is 1920 bytes into it.
From the documentation for cudaMemcpyToSymbol:
offset - Offset from start of symbol in bytes
In this case the symbol is ints.
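The same constraint can be checked on the host before the copy is issued. This is only a minimal sketch: `fitsInSymbol` is a hypothetical helper name, not part of the CUDA API, and it assumes the rule that the destination range (offset plus count) has to stay within the size of the symbol.

```cpp
#include <cstddef>

// Hypothetical helper (not a CUDA API): returns true when a copy of `count`
// bytes, starting `offset` bytes into a symbol of `symbolSize` bytes, stays
// inside the symbol.
constexpr bool fitsInSymbol(std::size_t symbolSize,
                            std::size_t offset,
                            std::size_t count) {
    // Compare against symbolSize - offset so that offset + count cannot overflow.
    return offset <= symbolSize && count <= symbolSize - offset;
}
```

For the 640-byte ints array, `fitsInSymbol(640, 1920, 640)` is false and `fitsInSymbol(640, 0, 640)` is true, so a check like this in front of the cudaMemcpyToSymbol call would flag the second copy before it reaches the runtime.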
Probably what you want is to apply the offset to the source pointer instead, and leave the destination offset at zero:
cudaMemcpyToSymbol(ints,pFlts+480,640,0,cudaMemcpyDeviceToDevice); // second copy
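Note that the `+480` is an element count, not a byte count. As a small host-side illustration, assuming pFlts is a `float*` (the question does not show its declaration) and that `float` is 4 bytes, the byte distance covered by pointer arithmetic can be computed as below; `byteOffset` is a hypothetical helper name used only for this sketch.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes that `ptr + elements` advances for a
// pointer to T. Pointer arithmetic moves in units of sizeof(T), not bytes.
template <typename T>
constexpr std::size_t byteOffset(std::size_t elements) {
    return elements * sizeof(T);
}
```

With a 4-byte `float`, `byteOffset<float>(480)` is 1920, which matches the 1920 used in the original call. If pFlts were a byte pointer such as `char*`, the element count would have to be 1920 instead of 480.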
EDIT:
In response to the questions in the comments about error checking, I crafted this simple test program:
#include <stdio.h>
#define cudaCheckErrors(msg) \
    do { \
        cudaError_t __err = cudaGetLastError(); \
        if (__err != cudaSuccess) { \
            fprintf(stderr, "Fatal error: %s (%s at %s:%d)\n", \
                msg, cudaGetErrorString(__err), \
                __FILE__, __LINE__); \
            fprintf(stderr, "*** FAILED - ABORTING\n"); \
            exit(1); \
        } \
    } while (0)
__constant__ int ints[160];
int main(){
    int *d_ints;
    cudaError_t mystatus;
    cudaMalloc((void **)&d_ints, sizeof(int)*160);
    cudaCheckErrors("cudamalloc fail");
    mystatus = cudaMemcpyToSymbol(ints, d_ints, 160*sizeof(int), 1920, cudaMemcpyDeviceToDevice);
    if (mystatus != cudaSuccess) printf("returned value was not cudaSuccess\n");
    cudaCheckErrors("cudamemcpytosymbol fail");
    printf("OK!\n");
    return 0;
}
When I compile and run this, I get the following output:
returned value was not cudaSuccess
Fatal error: cudamemcpytosymbol fail (invalid argument at t94.cu:26)
*** FAILED - ABORTING
This shows that both the return value of the cudaMemcpyToSymbol call and the subsequent cudaGetLastError() call report an error (invalid argument) in this case. If I change the 1920 offset parameter to zero in this test case, the error goes away.