Fetch-and-add using OpenMP atomic operations

Submitted by 喜你入骨 on 2019-12-04 20:31:45

Question


I’m using OpenMP and need to use the fetch-and-add operation. However, OpenMP doesn’t provide an appropriate directive/call. I’d like to preserve maximum portability, hence I don’t want to rely on compiler intrinsics.

Rather, I’m searching for a way to harness OpenMP’s atomic operations to implement this but I’ve hit a dead end. Can this even be done? N.B., the following code almost does what I want:

#pragma omp atomic
x += a;

Almost – but not quite, since I really need the old value of x. fetch_and_add should be defined to produce the same result as the following (only non-locking):

template <typename T>
T fetch_and_add(volatile T& value, T increment) {
    T old;
    #pragma omp critical
    {
        old = value;
        value += increment;
    }
    return old;
}

(An equivalent question could be asked for compare-and-swap but one can be implemented in terms of the other, if I’m not mistaken.)
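
For the direction from compare-and-swap to fetch-and-add, a retry loop is enough. The following is only a minimal sketch under stated assumptions: it uses C++11 `std::atomic` rather than an OpenMP directive, and the name `fetch_and_add_via_cas` is hypothetical, not something from the question.

```cpp
#include <atomic>

// Hypothetical helper (not from the original post): fetch-and-add built
// from a compare-and-swap retry loop on a C++11 std::atomic.
template <typename T>
T fetch_and_add_via_cas(std::atomic<T>& value, T increment) {
    T old = value.load();
    // compare_exchange_weak stores old + increment only if value still
    // equals old; on failure it reloads the current value into old,
    // so the loop simply retries with fresh data.
    while (!value.compare_exchange_weak(old, old + increment)) {
    }
    return old;  // the value observed immediately before our update
}
```

The loop never blocks on a lock; a thread only retries when another thread changed the value between the load and the exchange.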


Answer 1:


As of OpenMP 3.1 there is support for capturing atomic updates: you can capture either the old value or the new value. Since the value has to be brought in from memory to increment it anyway, it only makes sense that we should be able to access it from, say, a CPU register and put it into a thread-private variable.

There's a nice workaround if you're using gcc (or g++); look up the atomic builtins: http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

I think Intel's C/C++ compiler also has support for this, but I haven't tried it.

For now (until OpenMP 3.1 is implemented), I've used inline wrapper functions in C++ where you can choose which version to use at compile time:

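template <class T>
inline T my_fetch_add(T *ptr, T val) {
#if defined(GCC_EXTENSION)
  // GCC builtin: atomically adds val and returns the previous value.
  return __sync_fetch_and_add(ptr, val);
#elif defined(OPENMP_3_1)
  // OpenMP 3.1 capture: read the old value and update in one atomic step.
  T t;
  #pragma omp atomic capture
  { t = *ptr; *ptr += val; }
  return t;
#else
  #error "Define GCC_EXTENSION or OPENMP_3_1 to select an implementation"
#endif
}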
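
As a usage illustration, here is a minimal sketch of a typical fetch-and-add pattern: threads claiming unique slots in a shared output array. It assumes an OpenMP 3.1 compiler for the capture clause; the names `fetch_add_capture` and `collect_evens` are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: fetch-and-add via OpenMP 3.1 atomic capture.
// If OpenMP is not enabled, the pragma is ignored and this runs serially.
inline int fetch_add_capture(int* ptr, int val) {
    int old;
    #pragma omp atomic capture
    { old = *ptr; *ptr += val; }
    return old;
}

// Collect the even numbers in [0, n) into a shared array in parallel.
std::vector<int> collect_evens(int n) {
    std::vector<int> out(n);
    int* slots = out.data();
    int next = 0;  // index of the next free slot, shared by all threads
    #pragma omp parallel for shared(next)
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0) {
            // The returned old value is a slot no other thread can receive.
            int slot = fetch_add_capture(&next, 1);
            slots[slot] = i;
        }
    }
    out.resize(next);
    // Slot order depends on thread scheduling, so sort for a stable result.
    std::sort(out.begin(), out.end());
    return out;
}
```

A plain `#pragma omp atomic` on `next += 1` would not be enough here, because each thread needs the value of `next` from before its own increment.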

Update: I just tried Intel's C++ compiler; it currently supports OpenMP 3.1 (atomic capture is implemented). Intel offers free use of its compilers on Linux for non-commercial purposes:

http://software.intel.com/en-us/articles/non-commercial-software-download/

GCC 4.7 will support OpenMP 3.1 when it is eventually released... hopefully soon :)




Answer 2:


If you want the old value of x and a is not changed, use (x - a) as the old value:

int fetch_and_add(int *x, int a) {
  #pragma omp atomic
  *x += a;

  // Racy: *x is re-read here, outside the atomic update.
  return (*x - a);
}

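UPDATE: this was not really an answer, because x can be modified by another thread after the atomic update. For example, if two threads each add 1 to x = 0, both may read x = 2 afterwards and both return 1, whereas a true fetch-and-add would return 0 to one thread and 1 to the other. So it seems to be impossible to make a universal "fetch-and-add" using OMP pragmas. By universal I mean an operation that can easily be used from any place in OMP code.

You can use the omp_*_lock functions to simulate atomics: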

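#include <omp.h>

// lock must be initialized with omp_init_lock() before first use.
typedef struct { omp_lock_t lock; int value; } atomic_simulated_t;

int fetch_and_add(atomic_simulated_t *x, int a)
{
  int ret;
  omp_set_lock(&x->lock);    // the lock routines take a pointer to the lock
  ret = x->value;            // save the old value before updating
  x->value += a;
  omp_unset_lock(&x->lock);
  return ret;
}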

This is ugly and slow (doing 2 atomic operations instead of 1). But if you want your code to be very portable, it will not be the fastest in all cases.

You say "as the following (only non-locking)". But what is the difference between "non-locking" operations (using the CPU's "LOCK" prefix, LL/SC, etc.) and locking operations (which are themselves implemented with several atomic instructions, a busy loop for short waits, and OS sleeping for long waits)?



Source: https://stackoverflow.com/questions/4034908/fetch-and-add-using-openmp-atomic-operations
