Question
I'm learning C. I'm writing an application with multiple threads; I know that when a variable is shared between two or more threads, it is better to lock/unlock it with a mutex to avoid race conditions and inconsistent values. This is very clear when I want to change or read one variable.
int i = 0; /** Global */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
/** Thread 1. */
pthread_mutex_lock(&mutex);
i++;
pthread_mutex_unlock(&mutex);
/** Thread 2. */
pthread_mutex_lock(&mutex);
i++;
pthread_mutex_unlock(&mutex);
This is correct, I think. The variable i, at the end of the executions, contains the integer 2.
Anyway, there are some situations in which I don't know exactly where to put the two function calls.
For example, suppose you have a function obtain(), which returns a global variable. I need to call that function from within the two threads. I also have two other threads that call the function set(), defined with a few arguments; this function sets the same global variable. The two functions are necessary when you need to do something before getting/setting the variable.
/** (0) */
/** Thread 1, or 2, or 3... */
if (obtain() == something) {
    if (obtain() == somethingElse) {
        // Do this, sometimes obtain() and sometimes set(random number) (1)
    } else {
        // Do that, just obtain(). (2)
    }
} else {
    // Do this and do that (3)
    // If # of threads * 3 > 10, then set(3*10), for example. (4)
}
/** (5) */
Where do I have to lock, and where do I have to unlock? The situation can, I think, be even more complex. I would appreciate an exhaustive answer.
Thank you in advance.
—Alberto
Answer 1:
Without any protection:
The operating system may interrupt any of your threads at any time and give the processor to another thread. "At any time" includes "between two assembly instructions that were generated from the same C statement".
Now, suppose your variable occupies 64 bits on a 32-bit processor. That means your variable occupies two processor "words". In order to write it, the processor needs two assembly instructions, and the same goes for reading it. If the thread gets interrupted between the two, you are in trouble.
To give a clearer example, I will use the analogy of two decimal digits to represent the two 32-bit words. Say you are incrementing a two-digit decimal number on a 1-digit processor. To increment 19 to 20, you must read 19, do the math, then write 20. In order to write 20, you must write the 2 and then the 0 (or vice versa). If you write the 2, then get interrupted before writing the 0, the number in memory will be 29, far from the correct 20. The other thread then proceeds to read the wrong number.
Even if you have a single digit, there's still the read-modify-write issue Blank Xavier explained.
With mutex:
When thread A locks the mutex, it checks a mutex variable; if it's free, thread A marks it as taken. It does this with a single atomic assembly instruction, so there is no "in between" to interrupt. Thread A then proceeds to increment 19 to 20. It can still be interrupted while the variable holds the incorrect intermediate value 29, but that's OK, because now nobody else can access the variable. When thread B tries to lock the mutex, it checks the mutex variable and sees it is taken, so thread B knows it can't touch the variable. It then calls the operating system, saying "I give up the processor for now". Thread B will repeat that if it gets the processor again, and again, until thread A finally gets the processor back, finishes what it was doing, and unlocks the mutex.
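To make the "atomic check-and-set" idea concrete, here is a minimal sketch of a spinlock built from C11 atomics. This is only an illustration of the mechanism described above, not how pthread_mutex_lock is actually implemented; a real mutex asks the operating system to put a waiting thread to sleep instead of letting it spin.
#include <stdatomic.h>

static atomic_flag lock_word = ATOMIC_FLAG_INIT;

void spin_lock(void)
{
    /* test_and_set reads the old value and writes "taken" in one
       indivisible step, so no other thread can sneak in between. */
    while (atomic_flag_test_and_set(&lock_word)) {
        /* Already taken: keep retrying until the owner clears it. */
    }
}

void spin_unlock(void)
{
    atomic_flag_clear(&lock_word);
}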
So, when to lock?
As with so many things, it depends - mostly on the specific ordering of operations your application needs in order to work correctly. You always need to lock before reading or writing to get the protection, then unlock afterwards, but the "locked block of code" may contain many statements or just one. Keep the dance explained above in mind and think about how your application should behave.
There are also performance issues. If you lock/unlock around every single line of code, you waste time locking/unlocking. If you lock/unlock only around huge blocks of code, then each thread will wait a long time for the other to release the mutex.
Not really "always"
Now, there are some situations in which you may skip the locking-unlocking. They happen when you are dealing with a one-digit (meaning one processor word) variable, and each thread is either only reading it, or only writing it, so the value read will not determine what value to write to it later. Do this only if you are very sure of what you are doing, and really need the performance increase.
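As an illustration of that exception, here is a minimal sketch of a single-word "stop" flag (the names are mine, not from the answer): one thread only ever writes it, the other threads only ever read it, and the value read never decides what gets written back. Even in this case, modern C prefers the variable to be _Atomic so the compiler cannot cache or reorder the accesses.
#include <stdatomic.h>
#include <stdbool.h>

static _Atomic bool stop_requested = false; /* one word, one writer, many readers */

/* Called only by the controlling thread: the single writer. */
void request_stop(void)
{
    atomic_store(&stop_requested, true);
}

/* Called by worker threads: readers only; the value never feeds a later write. */
bool should_stop(void)
{
    return atomic_load(&stop_requested);
}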
Answer 2:
Some words of explanation.
In the example code, a single variable is being incremented.
Now, memory and CPU caches are organised such that whenever memory is accessed, a cache line's worth of data is accessed at a time. This is because memory is very slow to begin accessing but relatively quick to continue accessing, and because when one piece of data is accessed, the data that follows it is often accessed soon after.
So, we read in our integer. Let's say the integer is 8 bytes long and fits entirely within one cache line (on a modern 64-bit Intel CPU a cache line is 64 bytes). The read is necessary in this case since we need to know the original value. So the read occurs, and the cache line enters the L3, L2 and L1 caches (Intel uses an inclusive cache hierarchy: everything in L1 is also present in L2, and everything in L2 is also present in L3).
Now, when you have multiple CPUs, they keep an eye on what the others are doing, because if another CPU writes to a cache line you have in your cache, your copy isn't correct any more.
If one CPU has this cache line in its cache and increments the value, any other CPU holding a copy of that line will have its copy marked invalid.
So imagine we have two threads on different CPUs. They both read the integer; at this point their caches mark this cache line as shared. Then one of them writes to it. The writer's cache line is marked modified, and the second CPU's copy is invalidated. When the second CPU comes to write, it tries to read the integer again; since a modified copy exists in the first CPU's cache, it grabs that modified value from the first CPU, the first CPU's copy is marked invalid, and now the second CPU writes its own new value.
So, all seems well so far - how can it be that we need locking?
The problem is this: one CPU reads the value into its cache, then the other does the same. The cache line is currently marked shared, so this is fine. They both then increment. One of them writes back, so its cache line becomes modified while that line is marked invalid for all other CPUs. The second CPU then writes back, which causes it to take a copy of the cache line from the current owner and then modify it - writing back the same incremented value.
As such, one of the increments was lost.
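A minimal sketch that makes the lost increment visible (the iteration count and names are mine, not from the answer): with the lock/unlock pair removed, the final count usually falls short of 2000000; with the pair in place it is always exact.
#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&mutex);   /* remove this pair to watch increments get lost */
        counter++;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld (expected 2000000)\n", counter);
    return 0;
}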
Answer 3:
If you have an obtain() function, there should be a release() function, right? Then do the lock in obtain() and the unlock in release().
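A minimal sketch of that suggestion (the variable name is mine): obtain() takes the mutex and hands back the current value, release() gives the mutex back, and in between the calling thread is the only one that can touch the variable, so it can read, decide and set() as one unit.
#include <pthread.h>

static int shared_value = 0;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

int obtain(void)
{
    pthread_mutex_lock(&mutex);   /* lock taken here...                        */
    return shared_value;
}

void set(int v)
{
    shared_value = v;             /* only valid between obtain() and release() */
}

void release(void)
{
    pthread_mutex_unlock(&mutex); /* ...and given back here                    */
}
The obvious risk with this pattern is forgetting to call release() on some code path, which leaves the mutex locked forever.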
Answer 4:
You have to hold a lock around the entire operation that should be atomic - that is, the block that should be executed as one indivisible operation.
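Applied to the code in the question, that means one lock/unlock pair wrapping the whole decision, not one pair per obtain() or set() call. A sketch, under the assumption that obtain() and set() themselves no longer lock internally (otherwise the same thread would deadlock on a non-recursive mutex); something, somethingElse and number_of_threads stand in for the placeholders in the question:
pthread_mutex_lock(&mutex);        /* (0) take the lock once, up front                  */
if (obtain() == something) {
    if (obtain() == somethingElse) {
        set(rand());               /* (1) the read and the write happen under one lock  */
    } else {
        /* Do that, just obtain(). (2) */
    }
} else {
    /* Do this and do that (3) */
    if (number_of_threads * 3 > 10)
        set(3 * 10);               /* (4) */
}
pthread_mutex_unlock(&mutex);      /* (5) release only after the whole operation        */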
Source: https://stackoverflow.com/questions/4718727/thread-mutex-behaviour