I have the following code for a CUDA program:
#include <cstdio>
#define NUM_BLOCKS 4
#define THREADS_PER_BLOCK 4
__global__ void hello()
{
    printf("Hello! I'm thread %d in block %d\n", threadIdx.x, blockIdx.x);
}
int main()
{
    hello<<<NUM_BLOCKS, THREADS_PER_BLOCK>>>();
    cudaDeviceSynchronize();
}
However, the thread order within every block is always 0,1,2,3. Why is this happening? I thought it would be random too.
With 4 threads per block you are only launching one warp per block. A warp is the unit of execution (and scheduling, and resource assignment) in CUDA, not a thread. Currently, a warp consists of 32 threads.
This means that all 4 of your threads per block (since there is no conditional behavior in this case) are executing in lockstep. When they reach the printf call, they all execute that call at the same line of code, in lockstep.
So the question becomes: in this situation, how does the CUDA runtime dispatch these "simultaneous" function calls? The dispatch order is unspecified, but it is not "random". It is therefore not surprising that the order of dispatch for operations within a warp does not change from run to run.
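One way to see the warp membership directly (this kernel is my own sketch, not part of your code) is to have each thread print its warp and lane index, computed from the built-in warpSize; you can launch it with the same <<<NUM_BLOCKS, THREADS_PER_BLOCK>>> configuration as your hello kernel:

__global__ void whoami()
{
    // warpSize is a built-in device constant (32 on current hardware)
    int warp = threadIdx.x / warpSize;   // which warp within the block
    int lane = threadIdx.x % warpSize;   // position within that warp
    // With THREADS_PER_BLOCK = 4, every thread reports warp 0, lane 0..3,
    // i.e. all 4 threads of a block belong to the same warp.
    printf("block %d warp %d lane %d\n", blockIdx.x, warp, lane);
}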
If you launch enough threads to create multiple warps per block, and probably also include some other code to disperse and/or "randomize" the behavior between warps, you should be able to see printf operations emanating from separate warps occurring in "random" order.
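Here is a rough sketch of that idea (the kernel name, the spin loop, and the launch shape are all illustrative choices, not anything from your program): each warp busy-waits for a different length of time before printing, so the interleaving of output between warps can vary from run to run.

#include <cstdio>
__global__ void multiwarp()
{
    int warp = threadIdx.x / warpSize;
    // Spin each warp for a different amount of time so the warps reach
    // the printf at different, run-dependent moments.
    long long start = clock64();
    while (clock64() - start < warp * 100000) { }
    printf("block %d warp %d thread %d\n", blockIdx.x, warp, threadIdx.x);
}
int main()
{
    // 128 threads per block = 4 warps per block, across 2 blocks
    multiwarp<<<2, 128>>>();
    cudaDeviceSynchronize();
}

Within any single warp the printf lines will still come out in lane order; it is the ordering between warps (and between blocks) that you should see shuffle around.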
To answer the second part of your question: when control flow diverges at the if statement, the threads where threadIdx.x != 0 simply wait at the convergence point after the if statement. They do not go on to the printf statement until thread 0 has completed the if block.
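In other words, for a kernel shaped roughly like the following (my reconstruction of what your second kernel presumably looks like, launched the same way as the hello kernel above), the final printf still comes out in the order 0,1,2,3 within each block:

__global__ void diverge()
{
    if (threadIdx.x == 0) {
        // Only thread 0 is active here; threads 1-3 of the warp are masked
        // off and wait at the reconvergence point after the if block.
        printf("block %d: thread 0 inside the if\n", blockIdx.x);
    }
    // All threads of the warp reconverge here and execute this printf in
    // lockstep again, so the order within each block is again 0,1,2,3.
    printf("block %d thread %d after the if\n", blockIdx.x, threadIdx.x);
}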