How to know which core a process is running on in MPI?

慢半拍i 2021-01-27 13:39

I am currently working on a project where I need to know the core ID of the processor on which a process is currently running in MPI. There is a function in MPI called

2 answers
  • 2021-01-27 14:26

    Your question assumes that each MPI process runs bound to a single CPU core. This is not the default behaviour for many cluster MPI implementations. For example, Open MPI has the necessary binding machinery, but one has to explicitly enable it with the --bind-to-core or --bind-to-socket option (spelled --bind-to core and --bind-to socket in newer releases). On the other hand, modern Intel MPI versions enable binding by default for performance reasons. Because of that discrepancy, with most cluster MPI implementations MPI_Get_processor_name simply returns the hostname of the execution node, since no specific processor is identifiable in the general case.

    When each process runs bound to a core, the binding can usually be obtained by reading the affinity mask of the process. This is OS-dependent, but there are libraries that abstract the differences away, for example the hwloc library (part of Open MPI, but developed as a completely separate project and hence usable on its own). Reading the affinity mask also works in the general case: when a process is not bound, its affinity mask simply matches the system affinity mask (i.e. execution is allowed on all processors).
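
    As a minimal, Linux-only sketch of reading the affinity mask directly (without hwloc), the following assumes glibc's sched_getaffinity is available. The helper names allowed_cpu_count and first_allowed_cpu are made up for illustration, and no MPI is involved so that the snippet compiles on its own:

    ```c
    #define _GNU_SOURCE   /* must precede <sched.h> to expose the CPU_* macros */
    #include <sched.h>
    #include <stdio.h>

    /* Number of logical CPUs the calling thread may run on, or -1 on error.
     * A result of 1 means it is bound to a single logical CPU. */
    int allowed_cpu_count(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        if (sched_getaffinity(0, sizeof(mask), &mask) != 0)  /* 0 = caller */
            return -1;
        return CPU_COUNT(&mask);
    }

    /* Lowest-numbered logical CPU in the affinity mask, or -1 on error. */
    int first_allowed_cpu(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
            return -1;
        for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
            if (CPU_ISSET(cpu, &mask))
                return cpu;
        return -1;
    }

    int main(void)
    {
        printf("allowed CPUs: %d, first allowed CPU: %d\n",
               allowed_cpu_count(), first_allowed_cpu());
        return 0;
    }
    ```

    If the count is larger than 1, the process is not pinned to one logical CPU and the "first allowed CPU" says little about where it is actually executing.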

    There are platforms where binding is part of how the system hardware works, e.g. IBM Blue Gene. There each MPI process executes on one and only one well-identified processor, and MPI_Get_processor_name returns a unique string value in each calling process.

  • 2021-01-27 14:28

    Here is code that prints, for each MPI process, the ID of the processing unit it is bound to. It needs the hwloc library, as suggested by Hristo Iliev in the previous answer's comments.

        #include <stdio.h>
        #include <unistd.h>   /* getpid */
        #include <mpi.h>
        #include <hwloc.h>

        int main(int argc, char *argv[])
        {
            int rank;
            hwloc_topology_t topology;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Discover the hardware topology of this node. */
            hwloc_topology_init(&topology);
            hwloc_topology_load(topology);

            /* Read the set of processing units (PUs) this process is bound to. */
            hwloc_bitmap_t set = hwloc_bitmap_alloc();
            int err = hwloc_get_proc_cpubind(topology, getpid(), set, HWLOC_CPUBIND_PROCESS);
            if (err) {
                fprintf(stderr, "Error: cannot read the CPU binding\n");
                MPI_Abort(MPI_COMM_WORLD, 1);
            }

            /* OS index of the first PU in the binding; this identifies a single
               core only when the process is bound to exactly one PU. */
            int my_coreid = hwloc_bitmap_first(set);
            printf("Hello World, I am %d and pid: %d coreid:%d\n", rank, (int)getpid(), my_coreid);

            hwloc_bitmap_free(set);
            hwloc_topology_destroy(topology);
            MPI_Finalize();
            return 0;
        }
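
    A binding mask only says where a process may run. When the process is not bound to a single PU, one rough, Linux-only way to ask which logical CPU the calling thread is on at this instant is glibc's sched_getcpu(). The sketch below is an illustration under that assumption, not part of the hwloc approach. It is written without MPI so that it compiles on its own (in an MPI program the call would go between MPI_Init and MPI_Finalize), and the wrapper name current_cpu is hypothetical:

    ```c
    #define _GNU_SOURCE   /* must precede <sched.h> to expose sched_getcpu */
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Logical CPU the calling thread is executing on right now, or -1 on error.
     * The scheduler may migrate an unbound thread at any time, so the value
     * can already be stale when it is returned. */
    int current_cpu(void)
    {
        return sched_getcpu();
    }

    int main(void)
    {
        printf("pid %d is currently on CPU %d\n", (int)getpid(), current_cpu());
        return 0;
    }
    ```

    For a process bound to exactly one PU, this value and the first bit of the binding mask should agree. For an unbound process, only this value reflects where the code is running at the time of the call.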
    