Parallelize a function that counts all vectors with a given sum of elements and no element bigger than k

旧巷少年郎 2021-01-28 18:16

I want to parallelize a function in CUDA C that counts all vectors with a given sum of elements and no element bigger than k. For example if the number of vector ele

3 answers
  •  北海茫月
    2021-01-28 18:29

    The problem is __syncthreads(). For __syncthreads() to work properly, every thread in the block must reach it; otherwise the threads that do reach it can wait forever and the program hangs. In your program, some calls to __syncthreads() are executed conditionally. That is why your program does not work with more than one thread per block.
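To illustrate the point about barrier placement, here is a minimal sketch. It is not the asker's code: the kernel `count_not_bigger`, the host wrapper `count_matching`, and the `BLOCK_SIZE` constant are hypothetical names chosen for this example, and the kernel solves a much simpler task (counting array entries that are not bigger than k). What it shows is the pattern of keeping every `__syncthreads()` outside of thread-dependent branches, so all threads in the block reach it.

```cuda
#include <cuda_runtime.h>

// Assumed block size for this sketch; must be a power of two
// for the tree reduction below to cover every slot.
#define BLOCK_SIZE 256

// Hypothetical kernel: counts how many entries of data[0..n) are <= k.
__global__ void count_not_bigger(const int *data, int n, int k, int *result)
{
    __shared__ int partial[BLOCK_SIZE];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;

    // Problematic pattern (do NOT do this):
    //   if (idx < n) { partial[tid] = ...; __syncthreads(); }
    // Threads with idx >= n would never reach the barrier.

    // Instead, every thread writes a value; out-of-range threads write 0.
    partial[tid] = (idx < n && data[idx] <= k) ? 1 : 0;
    __syncthreads();  // reached unconditionally by all threads in the block

    // Tree reduction. The loop bounds depend only on blockDim.x, so every
    // thread executes the same number of iterations and barriers.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();  // outside the if, so all threads reach it
    }

    // One thread per block adds the block's count to the global total.
    if (tid == 0)
        atomicAdd(result, partial[0]);
}

// Hypothetical host wrapper; assumes n > 0. Error checking is omitted
// for brevity and should be added in real code.
int count_matching(const int *host_data, int n, int k)
{
    int *d_data = NULL;
    int *d_result = NULL;
    int h_result = 0;

    cudaMalloc(&d_data, n * sizeof(int));
    cudaMalloc(&d_result, sizeof(int));
    cudaMemcpy(d_data, host_data, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_result, 0, sizeof(int));

    int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
    count_not_bigger<<<blocks, BLOCK_SIZE>>>(d_data, n, k, d_result);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    cudaFree(d_result);
    return h_result;
}
```

The same idea applies to early `return` statements: a thread that returns before a later `__syncthreads()` leaves the remaining threads of its block waiting at a barrier it will never reach, so it is usually safer to let such threads fall through with a neutral value, as the out-of-range threads do here.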
