Using the OpenMP threadprivate directive on static instances of C++ STL types

星月不相逢 2021-01-18 09:22

Consider the following snippet:

#include <map>

class A {
    static std::map theMap;
#pragma omp threadprivate(theMap)
};

std::map<         


        
3 Answers
  •  一生所求
    2021-01-18 10:15

    This is a compiler restriction. The Intel C/C++ compiler supports threadprivate on C++ class types, while GCC and MSVC currently do not.

    For example, in MSVC (VS 2010), you will get this error (I removed the class):

    static std::map theMap;
    #pragma omp threadprivate(theMap)
    
    error C3057: 'theMap' : dynamic initialization of 'threadprivate' symbols is not currently supported
    

    So the workaround is fairly obvious, if dirty: build a very simple thread-local storage yourself. One simple approach:

    const static int MAX_THREAD = 64;
    
    struct MY_TLS_ITEM
    {
      std::map theMap;
      char padding[64 - sizeof(theMap)];
    };
    
    __declspec(align(64)) MY_TLS_ITEM tls[MAX_THREAD];
    

    The padding is there to avoid false sharing. I assume a 64-byte cache line, as on modern Intel x86 processors, and that sizeof(theMap) is less than 64. __declspec(align(64)) is an MSVC extension that places the array on a 64-byte boundary; since each MY_TLS_ITEM is padded to exactly 64 bytes, every element of tls sits on its own cache line, so there is no false sharing. The GCC equivalent is __attribute__ ((aligned(64))).
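
    With a C++11 compiler, the same layout can be sketched portably with alignas instead of the compiler-specific attributes and the manual padding array. This is only a sketch: PaddedSlot, kCacheLine, kMaxThreads and slots are hypothetical names, the <int, int> map type is assumed for illustration, and 64 bytes is the same cache-line assumption as in the answer.

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <map>

    // Assumed cache-line size (the answer's 64-byte assumption).
    constexpr std::size_t kCacheLine = 64;

    // alignas on the type rounds sizeof(PaddedSlot) up to a multiple of 64,
    // so no explicit padding member is needed.
    struct alignas(kCacheLine) PaddedSlot {
        std::map<int, int> theMap;  // key/value types are assumed
    };

    // Compile-time checks that each slot starts on its own cache line.
    static_assert(alignof(PaddedSlot) == kCacheLine, "slot must be cache-line aligned");
    static_assert(sizeof(PaddedSlot) % kCacheLine == 0, "slot must fill whole cache lines");

    // One slot per thread, up to an assumed maximum of 64 threads.
    constexpr int kMaxThreads = 64;
    PaddedSlot slots[kMaxThreads];
    ```

    Unlike the fixed padding array, this still compiles if the map object happens to be 64 bytes or larger; the slot then simply spans more than one cache line.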

    In order to access this simple TLS, you can do this:

    tls[omp_get_thread_num()].theMap;

    Of course, you should call this inside an OpenMP parallel construct. The nice thing is that OpenMP provides an abstract thread ID in [0, N), where N is the number of threads in the team (which must not exceed MAX_THREAD here). This makes a fast and simple TLS implementation possible. A native thread ID from the operating system is, in general, an arbitrary integer, so you would usually need a hash table, whose access time is longer than a simple array lookup.
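
    Putting the pieces together, a minimal end-to-end sketch might look like the following. histogram_mod10 is a hypothetical helper, the <int, int> map type is assumed, and the slots are sized with omp_get_max_threads() rather than a fixed MAX_THREAD. The serial fallbacks exist only so the sketch also builds when OpenMP is not enabled, and padding against false sharing is left out for brevity.

    ```cpp
    #include <map>
    #include <vector>
    #ifdef _OPENMP
    #include <omp.h>
    #else
    // Serial fallbacks so the sketch still builds without OpenMP enabled.
    static int omp_get_thread_num() { return 0; }
    static int omp_get_max_threads() { return 1; }
    #endif

    // Hypothetical helper: counts i % 10 for i in [0, n) using one map per
    // thread, then merges the per-thread maps into a single result.
    std::map<int, int> histogram_mod10(int n) {
        // One map per possible thread, indexed by the OpenMP thread number.
        std::vector<std::map<int, int>> perThread(omp_get_max_threads());

        #pragma omp parallel for
        for (int i = 0; i < n; ++i) {
            // Each thread only touches its own map, so no locking is needed.
            ++perThread[omp_get_thread_num()][i % 10];
        }

        // Merge serially once the parallel region has finished.
        std::map<int, int> total;
        for (const auto& m : perThread)
            for (const auto& kv : m)
                total[kv.first] += kv.second;
        return total;
    }
    ```

    Calling histogram_mod10(1000) yields ten keys with a count of 100 each, regardless of how many threads ran the loop.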
