OpenMP and MPI hybrid program

Asked by 慢半拍i on 2020-12-29 12:52

I have a machine with 8 processors. I want to alternate between OpenMP and MPI phases in my code like this:

OpenMP phase:

  • ranks 1-7 wait on an MPI barrier
  • rank 0 uses all 8 processors with OpenMP

MPI phase:

  • rank 0 reaches the barrier and all ranks then use one processor each
2 Answers
  • 2020-12-29 13:35

    Thanks all for the comments and answers. You are all right. It's all about the "PIN" option.

    To solve my problem, I just had to set:

    I_MPI_WAIT_MODE=1

    I_MPI_PIN_DOMAIN=omp

    Simple as that. Now all processors are available to all ranks.

    The option

    I_MPI_DEBUG=4

    shows which processors each rank gets.
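
    For reference, a minimal launch sketch with these settings (the executable name ./hybrid and the rank count are placeholders; the exact mpirun invocation may differ between Intel MPI versions):

    export I_MPI_WAIT_MODE=1      # waiting ranks sleep in MPI calls instead of spinning
    export I_MPI_PIN_DOMAIN=omp   # size each rank's pin domain by OMP_NUM_THREADS
    export I_MPI_DEBUG=4          # report which processors each rank is pinned to
    export OMP_NUM_THREADS=8
    mpirun -np 8 ./hybrid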

  • 2020-12-29 13:36

    The following code shows an example of how to save the CPU affinity mask before the OpenMP part, alter it to allow all CPUs for the duration of the parallel region, and then restore the previous CPU affinity mask afterwards. The code is Linux-specific and it only makes sense if process pinning is enabled by the MPI library - activated by passing --bind-to-core or --bind-to-socket to mpiexec in Open MPI; deactivated by setting I_MPI_PIN to disable in Intel MPI (the default in 4.x is to pin processes). A sample launch line is sketched after the code.

    #define _GNU_SOURCE
    
    #include <sched.h>
    
    ...
    
    cpu_set_t *oldmask, *mask;
    size_t size;
    int nrcpus = 256; // 256 cores should be more than enough
    int i;
    
    // Save the old affinity mask
    oldmask = CPU_ALLOC(nrcpus);
    size = CPU_ALLOC_SIZE(nrcpus);
    CPU_ZERO_S(size, oldmask);
    if (sched_getaffinity(0, size, oldmask) == -1) { perror("sched_getaffinity"); }
    
    // Temporarily allow running on all processors
    mask = CPU_ALLOC(nrcpus);
    CPU_ZERO_S(size, mask);   // CPU_ALLOC() does not zero the set
    for (i = 0; i < nrcpus; i++)
       CPU_SET_S(i, size, mask);
    if (sched_setaffinity(0, size, mask) == -1) { perror("sched_setaffinity"); }
    
    #pragma omp parallel
    {
    }
    
    CPU_FREE(mask);
    
    // Restore the saved affinity mask
    if (sched_setaffinity(0, size, oldmask) == -1) { perror("sched_setaffinity"); }
    
    CPU_FREE(oldmask);
    
    ...
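
    As a rough illustration of the launch-time pinning options mentioned above (./hybrid is a placeholder; the flag spellings follow the Open MPI 1.x and Intel MPI 4.x versions referenced in this answer, and newer releases use different syntax):

    # Open MPI: enable per-core binding, so each rank is pinned and the
    # affinity-mask code above becomes relevant
    mpiexec --bind-to-core -np 8 ./hybrid

    # Intel MPI 4.x pins by default; pinning can be switched off entirely with
    I_MPI_PIN=disable mpiexec -np 8 ./hybrid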
    

    You can also tweak the pinning arguments of the OpenMP run-time. For GCC/libgomp the affinity is controlled by the GOMP_CPU_AFFINITY environment variable, while for Intel compilers it is KMP_AFFINITY. You can still use the code above if the OpenMP run-time intersects the supplied affinity mask with that of the process.
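
    For instance, a hedged sketch (the CPU list and the exact affinity type are machine- and layout-dependent):

    # GCC/libgomp: restrict OpenMP threads to CPUs 0..7
    export GOMP_CPU_AFFINITY="0-7"

    # Intel compilers: the equivalent control goes through KMP_AFFINITY
    export KMP_AFFINITY="granularity=fine,compact"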

    Just for the sake of completeness - saving, setting and restoring the affinity mask on Windows:

    #include <windows.h>
    
    ...
    
    HANDLE hCurrentProc, hDupCurrentProc;
    DWORD_PTR dwpSysAffinityMask, dwpProcAffinityMask;
    
    // Obtain a usable handle of the current process
    hCurrentProc = GetCurrentProcess();
    DuplicateHandle(hCurrentProc, hCurrentProc, hCurrentProc,
                    &hDupCurrentProc, 0, FALSE, DUPLICATE_SAME_ACCESS);
    
    // Get the old affinity mask
    GetProcessAffinityMask(hDupCurrentProc,
                           &dwpProcAffinityMask, &dwpSysAffinityMask);
    
    // Temporarily allow running on all CPUs in the system affinity mask
    SetProcessAffinityMask(hDupCurrentProc, dwpSysAffinityMask);
    
    #pragma omp parallel
    {
    }
    
    // Restore the old affinity mask
    SetProcessAffinityMask(hDupCurrentProc, dwpProcAffinityMask);
    
    CloseHandle(hDupCurrentProc);
    
    ...
    

    This should work as long as the process stays within a single processor group (i.e., up to 64 logical processors).
