My recent efforts to implement a thread/ mutex manager ended up in an 75% CPU load (4 core), while all four running threads were either in sleep or waiting for a mutex beein
First I want to thank for all answers.
During the work on an example code, that reproduces the effect, I found the source of trouble.
The conditional part locks both mutexes, while it uses just one for the std::condition_variable::wait()
function.
But I still wonder, what going on behind the scene, that produces such a high CPU load.
On my machine, the following code prints out 10 times a second and consumes almost 0 cpu because most of the time the thread is either sleeping or blocked on a locked mutex:
#include <chrono>
#include <thread>
#include <mutex>
#include <iostream>
using namespace std::chrono_literals;
std::mutex m1;
std::mutex m2;
void
f1()
{
while (true)
{
std::unique_lock<std::mutex> l1(m1, std::defer_lock);
std::unique_lock<std::mutex> l2(m2, std::defer_lock);
std::lock(l1, l2);
std::cout << "f1 has the two locks\n";
std::this_thread::sleep_for(100ms);
}
}
void
f2()
{
while (true)
{
std::unique_lock<std::mutex> l2(m2, std::defer_lock);
std::unique_lock<std::mutex> l1(m1, std::defer_lock);
std::lock(l2, l1);
std::cout << "f2 has the two locks\n";
std::this_thread::sleep_for(100ms);
}
}
int main()
{
std::thread t1(f1);
std::thread t2(f2);
t1.join();
t2.join();
}
Sample output:
f1 has the two locks
f2 has the two locks
f1 has the two locks
...
I'm running this on OS X and the Activity Monitor application says that this process is using 0.1% cpu. The machine is a Intel Core i5 (4 core).
I'm happy to adjust this experiment in any way to attempt to create live-lock or excessive cpu usage.
Update
If this program is using excessive CPU on your platform, try changing it to call ::lock()
instead, where that is defined with:
template <class L0, class L1>
void
lock(L0& l0, L1& l1)
{
while (true)
{
{
std::unique_lock<L0> u0(l0);
if (l1.try_lock())
{
u0.release();
break;
}
}
std::this_thread::yield();
{
std::unique_lock<L1> u1(l1);
if (l0.try_lock())
{
u1.release();
break;
}
}
std::this_thread::yield();
}
}
I would be interested to know if that made any difference for you, thanks.
Update 2
After a long delay, I have written a first draft of a paper on this subject. The paper compares 4 different ways of getting this job done. It contains software you can copy and paste into your own code and test yourself (and please report back what you find!):
http://howardhinnant.github.io/dining_philosophers.html
As the documentation says, [t]he objects are locked by an unspecified series of calls to lock, try_lock, unlock. There is simply no way that can possibly be efficient if the mutexes are held by other threads for a significant period of time. There's no way the function can wait without spinning.
The std::lock()
non-member function may cause Live-lock problem or performance degradation, it guarantee only "Never Dead-lock".
If you can determine "Lock order(Lock hierarchy)" of multiple mutexes by design, it's preferable to not use generic std::lock()
but lock each mutexes in pre-determinate order.
Refer to Acquiring Multiple Locks Without Deadlock for more detail.