I've been struggling with a weird problem for the last few days. We create some libraries using GCC 4.8 which link some of their dependencies statically, e.g. log4cplus or boost. For th
OpenMP is an intermediary between your code and its execution. Each #pragma omp statement is converted into calls to the corresponding OpenMP library functions, and that is all there is to it. The multithreaded execution (launching threads, joining and synchronizing them, etc.) is always handled by the operating system (OS). All OpenMP does is wrap these low-level, OS-dependent threading calls for us in a short, sweet, and portable interface.
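To make the pragma-to-library-call idea concrete, here is a minimal sketch (the helper name count_region_threads is made up for illustration). It counts how many threads execute a parallel region. The assumption is that it is built with GCC: with -fopenmp the pragmas are honored, and without it they are simply ignored and the block runs once.

```c
/* Hypothetical helper, for illustration only.
   Returns how many threads executed the parallel region. */
int count_region_threads(void)
{
    int count = 0;

    /* With -fopenmp, the compiler turns this region into calls into
       the OpenMP runtime, which starts a team of threads.
       Without -fopenmp, the pragma is ignored and the block runs once. */
    #pragma omp parallel
    {
        /* Every thread of the team executes this increment once;
           the atomic pragma keeps the update race-free. */
        #pragma omp atomic
        count++;
    }

    return count;
}
```

Calling count_region_threads() from main and printing the result should give 1 for a build without -fopenmp, and the size of the thread team (typically the number of cores, or whatever OMP_NUM_THREADS is set to) for a build with it.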
The -fopenmp flag is a high-level one that does more than pull in GCC's OpenMP implementation (gomp). The gomp library in turn requires other libraries to access the threading functionality of the OS. On POSIX-compliant OSes, OpenMP is usually based on pthread, which needs to be linked. It may also need the realtime extension library (librt) on some OSes, but not on others. With dynamic linking, everything should be discovered automatically, but since you specified -static, I think you've run into the situation described by Jakub Jelinek here. Nowadays, though, pthread (and rt if needed) should be linked automatically when -static is used.
Aside from linking dependencies, the -fopenmp flag also activates the processing of OpenMP pragmas. You can see throughout the GCC code (as here and here) that without the -fopenmp flag (which isn't triggered by merely linking the gomp library), multiple pragmas won't be converted into the appropriate OpenMP function calls. I just tried with some example code, and both -lgomp and -fopenmp produce a working executable that links against the same libraries. The only difference in my simple example is that the -fopenmp build has a symbol that the -lgomp build doesn't have: GOMP_parallel@@GOMP_4.0+ (code here), which is the function that initializes the parallel section and performs the forks requested by the #pragma omp parallel in my example code. Thus, the -lgomp version did not translate the pragma into a call to GCC's OpenMP implementation. Both produced a working executable, but only the -fopenmp flag produced a parallel executable in this case.
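One way to check from inside the program which of the two builds you got is the standard _OPENMP macro, which a conforming compiler defines only when OpenMP processing is enabled (so with -fopenmp, but not when merely linking with -lgomp). The sketch below is not the example code mentioned above, and the function name is made up for illustration.

```c
/* Hypothetical helper, for illustration only.
   Returns the OpenMP version date (yyyymm) advertised by the compiler,
   or 0 when OpenMP pragma processing is switched off. */
long openmp_version_or_zero(void)
{
#ifdef _OPENMP
    /* Defined by the compiler only when -fopenmp (or equivalent) is given. */
    return (long)_OPENMP;
#else
    /* Built without -fopenmp: pragmas are ignored, even if gomp is linked. */
    return 0L;
#endif
}
```

A result of 0 means the pragmas in that translation unit were not translated into runtime calls, which matches the missing GOMP_parallel symbol in the -lgomp build.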
To wrap up, -fopenmp is needed for GCC to process all the OpenMP pragmas. Without it, your parallel sections won't fork any threads, which could wreak havoc depending on the assumptions your inner code was written under.
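As a counterpoint, code that does not assume a team of threads stays correct either way. The following is only a sketch with a made-up function name: a reduction loop that gives the same answer whether the pragma is honored (built with -fopenmp) or ignored (built without it).

```c
/* Hypothetical helper, for illustration only.
   Sums the integers 1..n. */
long sum_to(int n)
{
    long total = 0;
    int i;

    /* With -fopenmp, the iterations are split across threads and the
       per-thread partial sums are combined by the reduction clause.
       Without -fopenmp, this is an ordinary serial loop. */
    #pragma omp parallel for reduction(+:total)
    for (i = 1; i <= n; i++) {
        total += i;
    }

    return total;
}
```

Code that instead relies on a particular number of threads being present (for example, each thread filling in its own slice of a buffer based on its thread id) is the kind that breaks silently when the flag is missing.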