curand

CUDA's Mersenne Twister for an arbitrary number of threads

两盒软妹~` submitted on 2019-12-07 22:58:41
Question: CUDA's implementation of the Mersenne Twister (MT) random number generator is limited to at most 256 threads per block and 200 blocks per grid, i.e. the maximum number of threads is 51200. Therefore, once n exceeds 51200, it is not possible to launch a kernel that uses the MT with kernel<<<blocksPerGrid, threadsPerBlock>>>(devMTGPStates, ...), where int blocksPerGrid = (n+threadsPerBlock-1)/threadsPerBlock; and n is the total number of threads. What is the best way to use the MT with more than 51200 threads?

CUDA's Mersenne Twister for an arbitrary number of threads

蓝咒 submitted on 2019-12-06 16:00:15
CUDA's implementation of the Mersenne Twister (MT) random number generator is limited to at most 256 threads per block and 200 blocks per grid, i.e. the maximum number of threads is 51200. Therefore, once n exceeds 51200, it is not possible to launch a kernel that uses the MT with kernel<<<blocksPerGrid, threadsPerBlock>>>(devMTGPStates, ...), where int blocksPerGrid = (n+threadsPerBlock-1)/threadsPerBlock; and n is the total number of threads. What is the best way to use the MT with more than 51200 threads? My approach is to use constant values for blocksPerGrid and threadsPerBlock, e.g. <<<128,128>>>, and use