Thread synchronization: how exactly does lock make access to memory 'correct'?


清歌不尽 2020-12-29 09:41

First of all, I know that lock{} is syntactic sugar for the Monitor class.

I was playing with simple mul

5 answers

被撕碎了的回忆 (original poster)

2020-12-29 10:34

You don't say how many threads you used, but I am guessing two. If you had run with four threads, I'd expect the unlocked version to wind up with a result reasonably close to 1/4 of the single-threaded version's 'correct' result.

When you don't use lock, your quad-proc machine allocates a thread to each CPU (for simplicity, this ignores other apps that also get scheduled in turn) and the threads run at full speed without interfering with each other. Each thread reads the value from memory, increments it and stores it back. The store overwrites whatever is there, so with 2 (or 3, or 4) threads running at full speed at the same time, some of the increments made by threads on the other cores are effectively thrown away. Your final result is therefore lower than what you got from a single thread.
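The lost increments described above can be reproduced with a small sketch like the one below. The question's code isn't shown in full, so the names (`RunUnsynchronized`, `counter`) and the thread and iteration counts are my own assumptions, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Threading;

// Hypothetical reproduction: several threads increment one shared int with no synchronization.
int RunUnsynchronized(int threadCount, int iterations)
{
    int counter = 0; // shared by all threads through the lambda's closure
    var threads = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++)
    {
        threads[t] = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                counter++; // load, add one, store: three separate steps, not atomic
            }
        });
        threads[t].Start();
    }
    foreach (var thread in threads) thread.Join();
    return counter;
}

int unsyncResult = RunUnsynchronized(4, 1_000_000);
Console.WriteLine($"expected 4000000, got {unsyncResult}");
```

On a multi-core machine the printed value is usually below 4000000, and it varies from run to run because it depends on how the threads happen to interleave.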

When you add the lock statement, you tell the CLR (this looks like C#?) to ensure that only one thread at a time, on whichever core it runs, can execute that code. This is a critical change from the situation above, because the threads now interfere with each other, even though, as you realise, this code is still not thread-safe (just close enough to be dangerous). As a side effect of this incorrect serialization, the increment that follows is executed concurrently less often: the implied unlock has to wake any threads that were waiting for the lock, which is expensive relative to this code on a multi-core CPU. Because of this overhead, the multi-threaded version will also run slower than the single-threaded version. Threads do not always make code faster.

While the waiting threads are waking up, the thread that released the lock can keep running in its time slice, and it will often read, increment and store the variable before the awakening threads get a chance to copy the variable from memory for their own increment. Thus you wind up with a final value close to the single-threaded result, or to what you'd get if you locked the increment inside the loop.
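For comparison, here is a minimal sketch of the variant mentioned above, with the increment itself locked inside the loop. Again, `RunLocked`, `gate` and the counts are hypothetical names and values of mine, not the question's code, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Threading;

// Hypothetical sketch: every increment happens while holding the same lock.
int RunLocked(int threadCount, int iterations)
{
    int counter = 0;
    object gate = new object(); // private lock object shared by all threads
    var threads = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++)
    {
        threads[t] = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                // lock expands to Monitor.Enter/Monitor.Exit wrapped in try/finally
                lock (gate)
                {
                    counter++; // only one thread at a time can be in here
                }
            }
        });
        threads[t].Start();
    }
    foreach (var thread in threads) thread.Join();
    return counter;
}

int lockedResult = RunLocked(4, 250_000);
Console.WriteLine(lockedResult); // prints 1000000
```

Because the whole read-increment-store sequence sits inside the lock, no update can be overwritten, so the result is exact rather than merely close. The price is one lock acquisition per iteration.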

Check out the Interlocked class for a hardware-level way to update variables of certain types atomically.
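As a rough sketch of what that could look like for a simple counter, assuming an `int` that is only ever incremented (`RunInterlocked` and the counts are hypothetical, and the top-level statements assume C# 9 or later):

```csharp
using System;
using System.Threading;

// Hypothetical sketch: the same counter, updated with Interlocked instead of lock.
int RunInterlocked(int threadCount, int iterations)
{
    int counter = 0;
    var threads = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++)
    {
        threads[t] = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                // Atomic read-modify-write; no other thread can slip in between
                Interlocked.Increment(ref counter);
            }
        });
        threads[t].Start();
    }
    foreach (var thread in threads) thread.Join();
    return counter;
}

int atomicResult = RunInterlocked(4, 250_000);
Console.WriteLine(atomicResult); // prints 1000000
```

Interlocked.Increment fits when the shared state is a single int or long. Once several variables have to change together, a lock around the whole update is the simpler choice.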
