When should the volatile keyword be used in C#?

盖世英雄少女心 2020-11-22 12:04

Can anyone provide a good explanation of the volatile keyword in C#? Which problems does it solve, and which doesn't it? In which cases will it save me from having to use locking?

10 answers
  • 2020-11-22 12:32

    I found this article by Joydip Kanjilal very helpful!

    "When you mark an object or a variable as volatile, it becomes a candidate for volatile reads and writes. It should be noted that in C# all memory writes are volatile irrespective of whether you are writing data to a volatile or a non-volatile object. However, the ambiguity happens when you are reading data. When you are reading data that is non-volatile, the executing thread may or may not always get the latest value. If the object is volatile, the thread always gets the most up-to-date value."

    I'll just leave it here for reference

  • 2020-11-22 12:33

    From MSDN: The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access. Using the volatile modifier ensures that one thread retrieves the most up-to-date value written by another thread.
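
    A minimal sketch of the situation the MSDN text describes: one thread sets a flag and another polls it, with no lock around either access. The names `Worker`, `_shouldStop`, and `StopFlagDemo` are hypothetical and chosen only for illustration.

    ```csharp
    using System;
    using System.Threading;

    class Worker
    {
        // Hypothetical flag; volatile so the loop re-reads the field on every iteration
        // instead of reusing a value cached in a register.
        private volatile bool _shouldStop;

        // Written only by the worker thread; read by the caller after Join.
        public long Iterations;

        public void Run()
        {
            while (!_shouldStop)
            {
                Iterations++;
            }
        }

        public void RequestStop() => _shouldStop = true;
    }

    static class StopFlagDemo
    {
        // Starts the worker, asks it to stop, and reports whether it exited within 5 seconds.
        public static bool RunOnce()
        {
            var worker = new Worker();
            var thread = new Thread(worker.Run);
            thread.Start();
            Thread.Sleep(50);
            worker.RequestStop();
            return thread.Join(TimeSpan.FromSeconds(5));
        }

        static void Main()
        {
            Console.WriteLine(RunOnce() ? "worker stopped" : "worker still running");
        }
    }
    ```

    The flag here is a single bool that is only ever set, never incremented or combined with other state, which is the kind of field the quoted description is about.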

  • 2020-11-22 12:36

    If you want to get slightly more technical about what the volatile keyword does, consider the following C++ program (I'm using DevStudio 2005):

    #include <iostream>
    void main()
    {
      int j = 0;
      for (int i = 0 ; i < 100 ; ++i)
      {
        j += i;
      }
      for (volatile int i = 0 ; i < 100 ; ++i)
      {
        j += i;
      }
      std::cout << j;
    }
    

    Using the standard optimised (release) compiler settings, the compiler creates the following assembler (IA32):

    void main()
    {
    00401000  push        ecx  
      int j = 0;
    00401001  xor         ecx,ecx 
      for (int i = 0 ; i < 100 ; ++i)
    00401003  xor         eax,eax 
    00401005  mov         edx,1 
    0040100A  lea         ebx,[ebx] 
      {
        j += i;
    00401010  add         ecx,eax 
    00401012  add         eax,edx 
    00401014  cmp         eax,64h 
    00401017  jl          main+10h (401010h) 
      }
      for (volatile int i = 0 ; i < 100 ; ++i)
    00401019  mov         dword ptr [esp],0 
    00401020  mov         eax,dword ptr [esp] 
    00401023  cmp         eax,64h 
    00401026  jge         main+3Eh (40103Eh) 
    00401028  jmp         main+30h (401030h) 
    0040102A  lea         ebx,[ebx] 
      {
        j += i;
    00401030  add         ecx,dword ptr [esp] 
    00401033  add         dword ptr [esp],edx 
    00401036  mov         eax,dword ptr [esp] 
    00401039  cmp         eax,64h 
    0040103C  jl          main+30h (401030h) 
      }
      std::cout << j;
    0040103E  push        ecx  
    0040103F  mov         ecx,dword ptr [__imp_std::cout (40203Ch)] 
    00401045  call        dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (402038h)] 
    }
    0040104B  xor         eax,eax 
    0040104D  pop         ecx  
    0040104E  ret              
    

    Looking at the output, the compiler has decided to use the ecx register to store the value of the j variable. For the non-volatile loop (the first) the compiler has assigned i to the eax register. Fairly straightforward. There are a couple of interesting bits though - the lea ebx,[ebx] instruction is effectively a multibyte nop instruction so that the loop jumps to a 16 byte aligned memory address. The other is the use of edx to increment the loop counter instead of using an inc eax instruction. The add reg,reg instruction has lower latency on a few IA32 cores compared to the inc reg instruction, but never has higher latency.

    Now for the loop with the volatile loop counter. The counter is stored at [esp], and the volatile keyword tells the compiler that the value should always be read from and written to memory and never kept in a register. The compiler even avoids doing a load/increment/store as three distinct steps (load eax, inc eax, save eax) when updating the counter; instead, the memory is modified directly in a single instruction (an add mem,reg). The generated code ensures the value of the loop counter is always up to date within the context of a single CPU core. No operation on the data can result in corruption or data loss (hence no separate load/inc/store, since the value could change during the inc and that change would be lost on the store). Since interrupts can only be serviced once the current instruction has completed, the data can never be corrupted, even with unaligned memory.

    Once you introduce a second CPU to the system, the volatile keyword won't guard against the data being updated by another CPU at the same time. In the above example, you would need the data to be unaligned to get a potential corruption. The volatile keyword also won't prevent corruption if the data cannot be handled atomically. For example, if the loop counter were of type long long (64 bits), updating it would require two 32-bit operations, and an interrupt could occur between them and change the data.

    So, the volatile keyword is only good for aligned data which is less than or equal to the size of the native registers such that operations are always atomic.

    The volatile keyword was conceived to be used with IO operations where the IO would be constantly changing but had a constant address, such as a memory mapped UART device, and the compiler shouldn't keep reusing the first value read from the address.

    If you're handling large data or have multiple CPUs then you'll need a higher level (OS) locking system to handle the data access properly.
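
    The same point carries over to C#, which the question asks about. The sketch below is an illustration under assumed names (`CounterDemo`, `_volatileCounter`, `_interlockedCounter` are hypothetical): several tasks increment a volatile counter with `++` and a second counter with `Interlocked.Increment`. The `++` on the volatile field is still a separate read, add, and write, so concurrent updates can be lost; the interlocked version is atomic.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    static class CounterDemo
    {
        // volatile affects visibility and ordering, but ++ remains a read-modify-write.
        private static volatile int _volatileCounter;
        private static int _interlockedCounter;

        public static (int volatileTotal, int interlockedTotal) Run(int tasks, int incrementsPerTask)
        {
            _volatileCounter = 0;
            _interlockedCounter = 0;

            var work = new Task[tasks];
            for (int t = 0; t < tasks; t++)
            {
                work[t] = Task.Run(() =>
                {
                    for (int i = 0; i < incrementsPerTask; i++)
                    {
                        _volatileCounter++;                             // may lose updates
                        Interlocked.Increment(ref _interlockedCounter); // atomic increment
                    }
                });
            }

            Task.WaitAll(work);
            return (_volatileCounter, _interlockedCounter);
        }

        static void Main()
        {
            var (volatileTotal, interlockedTotal) = Run(4, 100_000);
            Console.WriteLine($"volatile ++: {volatileTotal}, Interlocked: {interlockedTotal}");
        }
    }
    ```

    The interlocked total always equals tasks × incrementsPerTask; the volatile total can come out lower, depending on timing. As for the 64-bit case mentioned above, C# does not allow a field of type long or double to be declared volatile at all, so such values need Interlocked or a lock.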

  • 2020-11-22 12:37

    The CLR likes to optimize instructions, so when you access a field in code it might not always access the current value of the field (it might be from the stack, etc). Marking a field as volatile ensures that the current value of the field is accessed by the instruction. This is useful when the value can be modified (in a non-locking scenario) by a concurrent thread in your program or some other code running in the operating system.

    You obviously lose some optimization, but it does keep the code simpler.

  • 2020-11-22 12:38

    Sometimes, the compiler will optimize a field and use a register to store it. If thread 1 does a write to the field and another thread accesses it, since the update was stored in a register (and not memory), the 2nd thread would get stale data.

    You can think of the volatile keyword as saying to the compiler "I want you to store this value in memory". This guarantees that the 2nd thread retrieves the latest value.
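
    If only some accesses to a field need this treatment, the `System.Threading.Volatile` class offers `Volatile.Read` and `Volatile.Write` for individual reads and writes of an ordinary field. The sketch below is illustrative only; `Progress`, `_percent`, and `ProgressDemo` are hypothetical names.

    ```csharp
    using System;
    using System.Threading;

    class Progress
    {
        // Ordinary (non-volatile) field; only the accesses below are volatile.
        private int _percent;

        public void Report(int percent) => Volatile.Write(ref _percent, percent);

        public int Current => Volatile.Read(ref _percent);
    }

    static class ProgressDemo
    {
        // A producer thread reports progress in steps of 10; the caller reads the final value.
        public static int Run()
        {
            var progress = new Progress();
            var producer = new Thread(() =>
            {
                for (int p = 0; p <= 100; p += 10)
                {
                    progress.Report(p);
                }
            });
            producer.Start();
            producer.Join();
            return progress.Current;
        }

        static void Main()
        {
            Console.WriteLine(Run());
        }
    }
    ```

    Because the producer is joined before the final read, `Run` returns 100 here; in real code the reader would typically poll `Current` while the producer is still running.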

  • 2020-11-22 12:38

    The compiler sometimes changes the order of statements in code to optimize it. Normally this is not a problem in a single-threaded environment, but it might be an issue in a multi-threaded environment. See the following example:

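     using System;
     using System.Threading.Tasks;

     class Program
     {
         private static int _flag = 0;
         private static int _value = 0;

         static void Main()
         {
             var t1 = Task.Run(() =>
             {
                 _value = 10; /* compiler could switch these lines */
                 _flag = 5;
             });

             var t2 = Task.Run(() =>
             {
                 // Prints only if t1 has already set the flag.
                 if (_flag == 5)
                 {
                     Console.WriteLine("Value: {0}", _value);
                 }
             });

             Task.WaitAll(t1, t2);
         }
     }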
    

    If you run t1 and t2, you would expect either no output or "Value: 10" as the result. However, the compiler may swap the two lines inside t1. If t2 then executes, _flag may already have the value 5 while _value is still 0, so the expected logic could be broken.

    To fix this, you can apply the volatile keyword to the field. It restricts reordering around accesses to that field: earlier writes cannot be moved after a volatile write, and later reads cannot be moved before a volatile read, so the order in your code is preserved.

    private static volatile int _flag = 0;
    

    Use volatile only if you really need it: it disables certain compiler optimizations, which can hurt performance. It's also not supported by all .NET languages (Visual Basic doesn't support it), so it hinders language interoperability.
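
    Putting the fix together, here is a sketch of the example with the flag marked volatile. It departs from the original in one respect, which is an assumption made to get a deterministic result: the reader waits for the flag with a `SpinWait` loop instead of checking it once. `PublishDemo` is a hypothetical name.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    static class PublishDemo
    {
        private static volatile int _flag;
        private static int _value;

        public static int Run()
        {
            _flag = 0;
            _value = 0;

            var reader = Task.Run(() =>
            {
                var spin = new SpinWait();
                while (_flag != 5)   // volatile read
                {
                    spin.SpinOnce();
                }
                return _value;       // cannot be moved before the read of _flag
            });

            var writer = Task.Run(() =>
            {
                _value = 10;         // cannot be moved after the volatile write below
                _flag = 5;           // volatile write
            });

            Task.WaitAll(reader, writer);
            return reader.Result;
        }

        static void Main()
        {
            Console.WriteLine("Value: {0}", Run());
        }
    }
    ```

    Once the reader has seen _flag equal to 5, it is guaranteed to see _value as 10, so this prints "Value: 10". Only _flag needs to be volatile; _value is published through it.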
