Does __syncthreads() synchronize all threads in the grid?

栀梦 2021-02-02 05:43

...or just the threads in the current warp or block?

Also, when the threads in a particular block encounter (in the kernel) the following line

__shared__         


        
5 answers
  •  一整个雨季
    2021-02-02 06:32

    I agree with all the answers here, but I think one important point about the first question is missing. I am not addressing the second question, since the other answers already cover it well.

    Execution on a GPU happens in units of warps. A warp is a group of 32 threads, and at any given instant every active thread of a warp executes the same instruction. If you allocate 128 threads in a block, the GPU runs them as 128/32 = 4 warps.
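
    The warp arithmetic above can be checked with a small sketch. The kernel name `recordWarpIds` and the host helper `countWarpsInBlock` are hypothetical names chosen for illustration; only `warpSize`, `threadIdx`, and the CUDA runtime calls are part of CUDA itself.

    ```cuda
    #include <cuda_runtime.h>

    // Each thread records the index of the warp it belongs to within its block.
    __global__ void recordWarpIds(int *warpIds)
    {
        // warpSize is a built-in device variable (32 on current NVIDIA GPUs).
        warpIds[threadIdx.x] = threadIdx.x / warpSize;
    }

    // Hypothetical host helper: launches one block of `threads` threads and
    // returns (warp index of the last thread) + 1, i.e. the number of warps
    // in the block, or -1 on a CUDA error.
    int countWarpsInBlock(int threads)
    {
        int *dIds = nullptr;
        if (cudaMalloc(&dIds, threads * sizeof(int)) != cudaSuccess)
            return -1;
        recordWarpIds<<<1, threads>>>(dIds);
        int lastId = -1;
        cudaError_t err = cudaMemcpy(&lastId, dIds + threads - 1, sizeof(int),
                                     cudaMemcpyDeviceToHost);
        cudaFree(dIds);
        return err == cudaSuccess ? lastId + 1 : -1;
    }
    ```

    Calling `countWarpsInBlock(128)` should report 4 warps. A block of 33 threads would occupy 2 warps, with the second one mostly idle.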

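    Now the question becomes: "If all threads are executing the same instruction, why is synchronization needed?" The answer is that we need to synchronize the warps that belong to the SAME block. The job of __syncthreads is not to synchronize the threads within a warp, which already execute together (on Volta and newer GPUs, which schedule threads independently, this is no longer guaranteed, and __syncwarp() covers the warp-level case). It synchronizes the warps that belong to the same block.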

    That is why the answer to your question is: __syncthreads does not synchronize all threads in the grid, only the threads belonging to one block, because each block executes independently.
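
    As a minimal sketch of block-level synchronization, the kernel below sums the values of one block in shared memory. The names `blockSum`, `lastBlockSumOfOnes`, and `BLOCK_THREADS` are assumptions made for this example, and the block size of 128 simply mirrors the 4-warp case discussed here.

    ```cuda
    #include <cuda_runtime.h>
    #include <vector>

    #define BLOCK_THREADS 128  // 4 warps per block

    // Each block sums its own BLOCK_THREADS inputs; blocks never wait on each other.
    __global__ void blockSum(const int *in, int *out)
    {
        __shared__ int partial[BLOCK_THREADS];
        int tid = threadIdx.x;
        partial[tid] = in[blockIdx.x * blockDim.x + tid];
        __syncthreads();  // all warps of this block have written their values

        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (tid < stride)
                partial[tid] += partial[tid + stride];
            __syncthreads();  // reached by every thread of the block, outside the branch
        }
        if (tid == 0)
            out[blockIdx.x] = partial[0];
    }

    // Hypothetical host helper: fills the input with ones and returns the sum
    // computed by the last block, or -1 on a CUDA error.
    int lastBlockSumOfOnes(int blocks)
    {
        int n = blocks * BLOCK_THREADS;
        std::vector<int> hIn(n, 1);
        int *dIn = nullptr, *dOut = nullptr;
        int result = -1;
        if (cudaMalloc(&dIn, n * sizeof(int)) == cudaSuccess &&
            cudaMalloc(&dOut, blocks * sizeof(int)) == cudaSuccess) {
            cudaMemcpy(dIn, hIn.data(), n * sizeof(int), cudaMemcpyHostToDevice);
            blockSum<<<blocks, BLOCK_THREADS>>>(dIn, dOut);
            if (cudaMemcpy(&result, dOut + blocks - 1, sizeof(int),
                           cudaMemcpyDeviceToHost) != cudaSuccess)
                result = -1;
        }
        cudaFree(dIn);   // cudaFree(nullptr) is a harmless no-op
        cudaFree(dOut);
        return result;
    }
    ```

    Without the barriers, a warp could read `partial[tid + stride]` before the warp responsible for that slot has written it. Each block produces its own sum, and no block ever waits for another.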

    If you want to synchronize the whole grid, split your kernel (K) into two kernels (K1 and K2) and launch them one after the other in the same stream. They will be synchronized: K2 starts only after K1 has finished.
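
    The kernel split can be sketched as follows. The kernels `k1Fill` and `k2Reverse` and the helper `firstReversedElement` are hypothetical and stand in for the two halves K1 and K2 of a real kernel; the sketch assumes both launches go to the default stream.

    ```cuda
    #include <cuda_runtime.h>

    // K1: every thread writes one element.
    __global__ void k1Fill(int *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] = i + 1;
    }

    // K2: every thread reads an element that was usually written by a
    // different block in K1, so it needs all of K1 to be complete.
    __global__ void k2Reverse(const int *data, int *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = data[n - 1 - i];
    }

    // Hypothetical host helper: runs K1 then K2 and returns out[0],
    // or -1 on a CUDA error.
    int firstReversedElement(int n)
    {
        const int threads = 128;
        int blocks = (n + threads - 1) / threads;
        int *dData = nullptr, *dOut = nullptr;
        int result = -1;
        if (cudaMalloc(&dData, n * sizeof(int)) == cudaSuccess &&
            cudaMalloc(&dOut, n * sizeof(int)) == cudaSuccess) {
            k1Fill<<<blocks, threads>>>(dData, n);
            // Same (default) stream: K2 starts only after every block of K1 is done.
            k2Reverse<<<blocks, threads>>>(dData, dOut, n);
            if (cudaMemcpy(&result, dOut, sizeof(int),
                           cudaMemcpyDeviceToHost) != cudaSuccess)
                result = -1;
        }
        cudaFree(dData);
        cudaFree(dOut);
        return result;
    }
    ```

    Here `out[0]` is read from `data[n - 1]`, which the last block of K1 wrote. The kernel boundary acts as the grid-wide barrier that __syncthreads cannot provide.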
