The example implementation Wikipedia provides for a spinlock with the x86 XCHG instruction is:

    ; Intel syntax
    locked:                     ; The lock variable.

    spin_unlock:
        mov  eax, 0             ; Set the EAX register to 0.
        xchg eax, [locked]      ; Atomically swap the EAX register with
                                ;  the lock variable.
        ret                     ; The lock has been released.

The point of spin_unlock is not only to unlock, but also to fill EAX with the correct return value (1 if it actually released a held lock, 0 otherwise): if the lock was not obtained before calling spin_unlock ([locked] holds the value 0 in that case), spin_unlock should return 0.
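In C11 terms, the same xchg-style unlock can be sketched with atomic_exchange, which compiles to an xchg on x86 (the function name and the standalone variable here are mine, for illustration):

    #include <stdatomic.h>

    atomic_int locked;                        /* 1 = locked, 0 = unlocked */

    /* Returns the previous value: 1 if we released a held lock,
     * 0 if the lock was never taken (a caller bug). */
    int spin_unlock_xchg(void)
    {
        return atomic_exchange(&locked, 0);   /* xchg eax, [locked] */
    }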
The unlock does need release semantics to properly protect the critical section, but it doesn't need sequential consistency. Atomicity isn't really the issue (see below).
So yes, on x86 a simple store is safe, and glibc's pthread_spin_unlock does so:

    movl $1, (%rdi)        # store the unlocked value (glibc uses 1 = free)
    xorl %eax, %eax        # return 0 for success
    retq
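In C11, a plain release store expresses the same thing and compiles to essentially that code; a minimal sketch, assuming glibc's convention of 1 = unlocked:

    #include <stdatomic.h>

    /* A release store is all the unlock needs: it keeps the critical
     * section's loads and stores from being reordered past it, without
     * the full barrier an xchg would impose. */
    int my_spin_unlock(atomic_int *lock)
    {
        atomic_store_explicit(lock, 1, memory_order_release);  /* movl $1, (%rdi) */
        return 0;                                              /* xorl %eax, %eax */
    }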
See also a simple but maybe-usable x86 spinlock implementation I wrote in this answer, using a read-only spin loop with a pause instruction.
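The shape of that loop, sketched as a C11 test-and-test-and-set lock (the function name is mine; _mm_pause is the intrinsic for the pause instruction):

    #include <stdatomic.h>
    #include <immintrin.h>      /* _mm_pause */

    void my_spin_lock(atomic_int *lock)     /* 0 = unlocked, 1 = locked */
    {
        while (atomic_exchange_explicit(lock, 1, memory_order_acquire)) {
            /* Read-only spin: wait until the lock looks free before
             * retrying the atomic RMW, so waiters share the cache line
             * instead of bouncing it around with failed writes. */
            while (atomic_load_explicit(lock, memory_order_relaxed))
                _mm_pause();    /* back off; helps power and SMT siblings */
        }
    }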
Possibly this code was adapted from a bit-field version. Unlocking with btr to zero one flag in a bit-field isn't safe, because it's a non-atomic read-modify-write of the containing byte (or of the containing naturally-aligned 4-byte dword or 2-byte word).
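If the lock really were a single bit in a shared flags word, releasing it would need a lock'd RMW; a hedged sketch of the safe version (LOCK_BIT and the function name are hypothetical):

    #include <stdatomic.h>

    #define LOCK_BIT 0x4u   /* hypothetical: one flag among others */

    void bit_unlock(atomic_uint *flags)
    {
        /* An atomic fetch-and clears only our bit, compiling to a
         * lock-prefixed RMW on x86. Doing `word &= ~LOCK_BIT` on a
         * plain (non-atomic) unsigned int, like a non-lock'd btr,
         * would read-modify-write the whole containing word
         * non-atomically and could lose another thread's concurrent
         * update to a neighbouring bit. */
        atomic_fetch_and_explicit(flags, ~LOCK_BIT, memory_order_release);
    }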
So maybe whoever wrote it didn't realize that simple stores to aligned addresses are atomic on x86, as on most ISAs. But what x86 has, and weakly-ordered ISAs don't, is that every store has release semantics. Using an xchg to release the lock makes every unlock a full memory barrier, which goes beyond normal locking semantics. (Although on x86, taking a lock will be a full barrier anyway, because there's no way to do an atomic RMW or atomic compare-and-swap without xchg or another lock-prefixed instruction, and those are full barriers like mfence.)
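The cost difference shows up directly in codegen; a sketch contrasting the two unlock flavors (annotations are mine):

    #include <stdatomic.h>

    void unlock_heavy(atomic_int *lock)
    {
        atomic_exchange_explicit(lock, 0, memory_order_seq_cst);
        /* -> xchg: an implicitly lock'd RMW, a full memory barrier
         *    like mfence; stronger (and slower) than unlock needs. */
    }

    void unlock_light(atomic_int *lock)
    {
        atomic_store_explicit(lock, 0, memory_order_release);
        /* -> a plain mov store: on x86 every store already has
         *    release semantics, so this is sufficient. */
    }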
The unlocking store doesn't technically need to be atomic, since we only ever store 0 or 1, so only the low byte matters. E.g. I think it would still work if the lock were unaligned and split across a cache-line boundary: tearing can happen but doesn't matter, because what's really happening is that the low byte of the lock is modified atomically, by operations that always write zeros into the upper 3 bytes.
If you wanted to return the old value to catch double-unlocking bugs, a better implementation would separately load and store:
    spin_unlock:
        ;; pre-condition: [locked] is non-zero
        mov  eax, [locked]        ; read the old value, for debugging
        mov  dword [locked], 0    ; on x86, this is an atomic store with "release" semantics
        ;test eax, eax
        ;jz  double_unlocking_detected   ; or leave this check to the caller
        ret
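The same load-then-store idea in C11, for comparison (the function name is mine):

    #include <stdatomic.h>

    /* Returns the old value so the caller can detect double-unlocking:
     * 0 means the lock wasn't actually held when we "released" it. */
    int spin_unlock_checked(atomic_int *lock)
    {
        int old = atomic_load_explicit(lock, memory_order_relaxed);  /* mov eax, [locked] */
        atomic_store_explicit(lock, 0, memory_order_release);        /* mov dword [locked], 0 */
        return old;
    }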