I have a computer with 4 cores and an OpenMP application with 2 heavy tasks.
int main()
{
#pragma omp parallel sections
{
#pragma omp section
My initial reaction was: You have to declare more parallelism.
You have defined only two tasks that can run in parallel, so OpenMP has work for at most two threads. Any attempt by OpenMP to spread them over more than two cores would slow you down (because of lost cache locality and possible false sharing).
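To make the point concrete, here is a minimal sketch, assuming each task is a plain summation loop. `heavy_task` and `threads_used_by_two_sections` are hypothetical names, not taken from the question. The function reports how many distinct threads executed the two sections; with only two sections declared, that number cannot exceed two, no matter how many cores the machine has.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for one "weighty" task: sums 0..n-1. */
static long heavy_task(long n)
{
    long sum = 0;
    for (long i = 0; i < n; ++i)
        sum += i;
    return sum;
}

/* Runs two sections and returns how many distinct threads executed them
 * (1 or 2). Without OpenMP support the pragmas are ignored and both
 * sections run on the calling thread, so the result is 1. */
static int threads_used_by_two_sections(long *out_a, long *out_b)
{
    int tid_a = 0, tid_b = 0;

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            *out_a = heavy_task(1000);
#ifdef _OPENMP
            tid_a = omp_get_thread_num();
#endif
        }
        #pragma omp section
        {
            *out_b = heavy_task(2000);
#ifdef _OPENMP
            tid_b = omp_get_thread_num();
#endif
        }
    }
    return (tid_a == tid_b) ? 1 : 2;
}
```

Calling `threads_used_by_two_sections(&a, &b)` from `main` and printing the result on a 4-core machine should show at most 2; the other threads in the team have nothing to do.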
Edit: If the parallel for loops are of any significant volume (say, not under 8 iterations) and you are not seeing more than 2 cores used, look at the OMP_NESTED=TRUE|FALSE environment variable:
This environment variable enables or disables nested parallelism. The setting of this environment variable can be overridden by calling the omp_set_nested() runtime library function. If nested parallelism is disabled, nested parallel regions are serialized and run in the current thread.
In the current implementation, nested parallel regions are always serialized. As a result, OMP_SET_NESTED does not have any effect, and omp_get_nested() always returns 0. If -qsmp=nested_par option is on (only in non-strict OMP mode), nested parallel regions may employ additional threads as available. However, no new team will be created to run nested parallel regions. The default value for OMP_NESTED is FALSE.
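As a rough sketch of what requesting nested parallelism from code might look like, assuming each section contains a parallel for loop: `parallel_sum` and `run_nested_sections` are hypothetical names. The sketch calls `omp_set_nested(1)` and then asks the runtime via `omp_get_nested()` whether the request took effect, since on an implementation like the one quoted above it may still report 0. Note that OpenMP 5.0 deprecates both calls in favor of `omp_set_max_active_levels()`, so newer compilers may warn.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: sums 0..n-1 using an inner parallel loop. */
static long parallel_sum(long n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i)
        sum += i;
    return sum;
}

/* Requests nested parallelism, then runs two sections that each open an
 * inner parallel region. Returns 1 if the runtime reports nesting as
 * enabled, 0 otherwise (including builds without OpenMP support). */
static int run_nested_sections(long *out_a, long *out_b)
{
    int nested_enabled = 0;
#ifdef _OPENMP
    omp_set_nested(1);                  /* overrides OMP_NESTED */
    nested_enabled = omp_get_nested();  /* may still be 0 on some runtimes */
#endif

    #pragma omp parallel sections
    {
        #pragma omp section
        *out_a = parallel_sum(1000);

        #pragma omp section
        *out_b = parallel_sum(2000);
    }
    return nested_enabled != 0;
}
```

The sums are correct whether or not nesting is honored; only the number of threads doing the inner loops changes. If the function returns 0, the inner loops are being serialized and you will not see more than two busy cores.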