Why is this C++11 code containing rand() slower with multiple threads than with one?


栀梦 2020-12-13 06:17

I'm experimenting with the new C++11 threads, but my simple test has abysmal multicore performance. As a simple example, this program adds up some squared random numbers.

4 answers

有刺的猬 (OP)

2020-12-13 06:39
To make this faster, use a thread pool pattern.

This will let you enqueue tasks in other threads without the overhead of creating a std::thread each time you want to use more than one thread.

Don't count the overhead of setting up the queue in your performance metrics, just the time to enqueue and extract the results.

Create a set of threads and a queue of tasks (a structure containing a std::function) to feed them. The threads wait on the queue for new tasks to do, do them, then wait on new tasks.
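A minimal sketch of such a pool, assuming a fixed number of workers and tasks that take no arguments and return nothing. The names `SimplePool`, `submit` and `demo_count_tasks` are made up for illustration and are not from the question; error handling and exceptions thrown by tasks are left out.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: the class and member names are illustrative only.
class SimplePool {
public:
    explicit SimplePool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain whatever is still queued, then joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    SimplePool(const SimplePool&) = delete;
    SimplePool& operator=(const SimplePool&) = delete;

    // Enqueue a task; no std::thread is created here.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    // Worker loop: wait for a task, run it outside the lock, wait again.
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage sketch: run 100 tiny tasks on 4 reusable threads and count them.
int demo_count_tasks() {
    std::atomic<int> done{0};
    {
        SimplePool pool(4);
        for (int i = 0; i < 100; ++i)
            pool.submit([&done] { ++done; });
    }  // the destructor drains the queue and joins the workers
    return done.load();
}
```

The threads are created once in the constructor, so each `submit` call only pays for a lock and a notification rather than for a new thread.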

The tasks are responsible for communicating their "done-ness" back to the calling context, for example via a std::future<>. The code that lets you enqueue functions into the task queue might do this wrapping for you, i.e. with a signature like this:
```
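template <typename R>
std::future<R> enqueue( std::function<R()> f ) {
  // wrap the callable so its result is delivered through a future
  std::packaged_task<R()> task( std::move( f ) );
  std::future<R> retval = task.get_future();
  this->add_to_queue( std::move( task ) ); // packaged_task is move-only, so it must be moved in
  return retval;
}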
```
which turns a bare std::function<R()> into a nullary std::packaged_task<R()>, then adds that to the task queue. Note that the task queue needs to be move-aware, because packaged_task is move-only.
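To make the move-only point concrete, here is a small sketch that fixes the result type to `int` and uses a plain `std::deque` as the queue. `enqueue_int` and `demo_future_roundtrip` are hypothetical names, and a single hand-written thread stands in for a pool worker. No locking is shown because the queue is filled before the worker starts and is not touched by the caller afterwards; a real shared queue would need a mutex.

```cpp
#include <deque>
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: wraps f in a packaged_task, moves the task into the
// queue, and returns the future that will receive the task's result.
std::future<int> enqueue_int(std::deque<std::packaged_task<int()>>& queue,
                             std::function<int()> f) {
    std::packaged_task<int()> task(std::move(f));
    std::future<int> result = task.get_future();
    queue.push_back(std::move(task));  // copying the task would not compile
    return result;
}

int demo_future_roundtrip() {
    std::deque<std::packaged_task<int()>> queue;
    std::future<int> answer = enqueue_int(queue, [] { return 6 * 7; });

    // Stand-in for a pool worker: take the task out of the queue and run it.
    std::thread worker([&queue] {
        std::packaged_task<int()> task = std::move(queue.front());
        queue.pop_front();
        task();  // stores the return value in the shared state of the future
    });

    int value = answer.get();  // blocks until the worker has run the task
    worker.join();
    return value;
}
```

A container of `std::function<void()>` would not accept the task directly, since `std::function` requires a copyable callable; that is why the queue element type here is the `packaged_task` itself.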

Note 1: I am not all that familiar with std::future, so the above could be in error.

Note 2: If tasks put into the above described queue are dependent on each other for intermediate results, the queue could deadlock, because no provision to "reclaim" threads that are blocked and execute new code is described. However, "naked computation" non-blocking tasks should work fine with the above model.