Boost Random and OpenMP

独厮守ぢ 2021-01-03 04:58

I'm getting a "bus error" from an OpenMP parallel section of code. I recreated a simple version of my problem below. The code essentially makes many calls to the function

3 Answers
  • 2021-01-03 05:13

    When you parallelize code, you must consider shared resources: they can cause data races, which may eventually break your program. (Note: not all data races will break your program.)

    In your case, as you correctly suspected, eng is shared by two or more threads, which must be avoided for correct execution.

    A solution for your case is privatization: making a per-thread copy of the shared resource. You need to create a separate copy of eng for each thread.

    There are a number of ways to privatize eng:

    (1) Use the threadprivate directive, for example #pragma omp threadprivate(eng). However, some compilers may not support non-POD types in this directive.

    (2) Where threadprivate is not available, use an array of engines indexed by thread id: declare it as eng[MAX_THREAD], then access it as eng[omp_get_thread_num()].

    However, the second solution must account for false sharing, which can severely hurt performance. It is best to guarantee that each item in eng[MAX_THREAD] starts on a separate cache-line boundary, which is typically 64 bytes on modern desktop CPUs. There are several ways to avoid false sharing; the simplest is padding, e.g., char padding[x] in a struct that holds eng.
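
    Option (2) with padding could look roughly like the sketch below. It is only an illustration under several assumptions: it uses std::mt19937 from <random> in place of boost::random::mt19937, it uses alignas(64) instead of a manual char padding[x] member, and PaddedEngine, thread_count, thread_index, count_in_range and the value of MAX_THREAD are hypothetical names and choices, not part of the original code. The #ifdef _OPENMP guards are there only so that it also builds without OpenMP enabled.

```cpp
#include <random>
#ifdef _OPENMP
#include <omp.h>
#endif

// Assumed upper bound on the number of threads (hypothetical value).
const int MAX_THREAD = 64;

// Hypothetical wrapper: alignas(64) rounds the size up to a multiple of 64 and
// starts every array element on a 64-byte boundary (assumed cache-line size).
struct alignas(64) PaddedEngine {
    std::mt19937 eng;
};

// One engine per possible thread, as in eng[MAX_THREAD].
static PaddedEngine engines[MAX_THREAD];

// Number of threads to use, capped at MAX_THREAD (1 without OpenMP).
static int thread_count() {
#ifdef _OPENMP
    int n = omp_get_max_threads();
    return n < MAX_THREAD ? n : MAX_THREAD;
#else
    return 1;
#endif
}

// Index of the calling thread inside a parallel region (0 without OpenMP).
static int thread_index() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Draws n integers in [lo, hi] in parallel and counts how many are in range.
long count_in_range(int n, int lo, int hi) {
    const int nthreads = thread_count();
    // Seed every engine differently before the parallel region starts.
    for (int t = 0; t < nthreads; ++t)
        engines[t].eng.seed(static_cast<unsigned>(t) + 1u);

    long ok = 0;
    #pragma omp parallel for num_threads(nthreads) reduction(+:ok)
    for (int i = 0; i < n; ++i) {
        std::uniform_int_distribution<int> dist(lo, hi);
        // Each thread only ever touches its own engine.
        int a = dist(engines[thread_index()].eng);
        if (a >= lo && a <= hi) ++ok;
    }
    return ok;
}
```

    Calling count_in_range(1000, 0, 20000) should return 1000, since every draw comes from a private engine and stays within the requested bounds.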

  • 2021-01-03 05:17

    You have two options:

    • have individual random number generators for each thread and seed them differently
    • use mutual exclusion

    First, an example of mutual exclusion:

    # pragma omp parallel for
    for (int bb=0; bb<10000; bb++)
    {
        for (int i=0; i<20000; i++)
        {
            int a;
            // enter critical region, disallowing simultaneous access to eng
            #pragma omp critical
            {
                a = uniform_distribution(0,20000);
            }
            // presumably some more code...
        }
        // presumably some more code...
    }
    

    Next, an example of thread-local storage with seeding:

    # pragma omp parallel
    {
        // declare and seed thread-specific generator
        boost::random::mt19937 eng(omp_get_thread_num());
        #pragma omp for
        for (int bb=0; bb<10000; bb++)
        {
            for (int i=0; i<20000; i++)
            {
                int a = uniform_distribution(0,20000, eng);
                // presumably some more code...
            }
            // presumably some more code...
        }
    }
    

    Both of these snippets are only illustrative. Depending on your requirements (say, security-related vs. a game vs. modelling), you may want to pick one over the other. You will probably also want to change the exact implementation to suit your usage. For instance, how you seed the generator matters if you want it to be either repeatable or closer to truly random (whether the latter is possible is system specific). This applies to both solutions equally, though reproducibility is harder to get in the mutual exclusion case.
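
    The two seeding goals could be sketched as follows. This is a minimal illustration, assuming std::mt19937 from <random> rather than the Boost engine; make_repeatable_engine and make_nondeterministic_engine are hypothetical helper names. How random std::random_device really is depends on the platform.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: the same (base_seed, thread_index) pair always gives
// the same engine state, so a run can be repeated exactly.
std::mt19937 make_repeatable_engine(std::uint32_t base_seed,
                                    std::uint32_t thread_index) {
    // seed_seq mixes both values, so threads get different streams.
    std::seed_seq seq{base_seed, thread_index};
    return std::mt19937(seq);
}

// Hypothetical helper: seeds from std::random_device, so runs differ
// wherever the platform provides a non-deterministic source.
std::mt19937 make_nondeterministic_engine() {
    std::random_device rd;
    std::seed_seq seq{rd(), rd(), rd(), rd()};
    return std::mt19937(seq);
}
```

    Inside the parallel region, the thread-local example would then construct its engine with something like make_repeatable_engine(42, omp_get_thread_num()) instead of passing the thread number directly as the seed.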

    The thread-local generator may run faster while the mutual exclusion case should use less memory.

    EDIT: To be clear, the mutual exclusion solution only makes sense if generating the random numbers is not the bulk of the thread's work (that is, // presumably some more code... in the example exists and takes a non-trivial amount of time to complete). The critical section only needs to encompass the access to the shared variable; changing your architecture a little would give you finer control over that (and, in the thread-local storage case, could also let you avoid passing an eng reference around).

  • 2021-01-03 05:19

    I think the most convenient solution is a thread_local RNG seeded with a value that is unique to each thread. For example, you can XOR the system time with the thread ID to seed the RNG. Something along the lines of (using C++11):

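    #include <omp.h>
    #include <boost/random/mersenne_twister.hpp>
    #include <boost/random/uniform_int_distribution.hpp>
    
    #include <ctime>
    #include <functional>
    #include <thread>
    
    boost::random::mt19937& get_rng_engine() {
      // One engine per thread, seeded with the system time XOR a hash of the
      // thread id (std::thread::id cannot be XORed directly, so hash it first).
      thread_local boost::random::mt19937 eng(
        static_cast<unsigned int>(std::time(NULL)) ^
        static_cast<unsigned int>(
          std::hash<std::thread::id>()(std::this_thread::get_id())));
      return eng;
    }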
    

    (NOTE: you can also use <random> if you are going to use C++11)
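
    With <random>, the same idea might look like the sketch below. It assumes std::mt19937 as a stand-in for the Boost engine and that the compiler's thread_local works for OpenMP worker threads; uniform_int is a hypothetical helper added only to show how the engine would be used.

```cpp
#include <ctime>
#include <functional>
#include <random>
#include <thread>

// Returns the calling thread's engine, created and seeded on first use.
std::mt19937& get_rng_engine() {
    thread_local std::mt19937 eng(
        static_cast<unsigned int>(std::time(nullptr)) ^
        static_cast<unsigned int>(
            std::hash<std::thread::id>()(std::this_thread::get_id())));
    return eng;
}

// Hypothetical helper: uniform integer in [lo, hi] drawn from the
// calling thread's own engine, so no locking is needed.
int uniform_int(int lo, int hi) {
    std::uniform_int_distribution<int> dist(lo, hi);
    return dist(get_rng_engine());
}
```

    Inside the parallel loop, a call such as uniform_int(0, 20000) then needs neither a critical section nor an explicit eng argument.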

    If you can't use C++11, you can use boost::thread instead to get similar behavior; see also the Boost page on thread-local storage.
