Question
I am new to using Microsoft Azure for scientific computing purposes and have encountered a few issues whilst setting up.
I have a jump box set up that acts as a license server for the software that I wish to use; it also has a common drive to store all of the software. 6 compute nodes are also set up (16 cores/node) and I can 'ssh' from the jump box to the compute nodes without issue. The jump box and compute nodes are using CentOS with OpenMPI 1.10.3.
I have created a script, stored on the mounted jump box drive, that I run on each compute node through 'clusRun.sh'. It sets up all the environment variables specific to the software I run and to OpenMPI. Hopefully it all sounds good to this point.
I've used this software on Linux clusters a lot in the past without issue. The jobs are submitted using a command such as:
mpirun -np XXX -hostfile XXX {path to software}
Where the first XXX is the number of processors and the second XXX is the path to the hostfile.
I run this command on the jump box. The hostfile lists the names of the compute nodes, and each compute node name appears in the hostfile as many times as the number of cores I want to use on that node. Hope that makes sense! There are no processes from the job running on the jump box node; it's merely used to launch the job.
When I try to run the jobs this way, I receive a number of errors, most of which seem to be tied to Infiniband. Here is a short list of the key errors:
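The hostfile layout described above (one line per MPI process, with each node name repeated once per core) can be generated rather than typed by hand. The following is a minimal sketch only; the function name `make_hostfile` and the node names `node1` to `node6` are hypothetical and not taken from the actual cluster.

```shell
#!/bin/sh
# Hypothetical helper: print each node name once per requested core.
# Usage: make_hostfile <cores-per-node> <node> [<node> ...]
# The body runs in a subshell so its variables do not leak to the caller.
make_hostfile() (
    cores="$1"
    shift
    for node in "$@"; do
        i=0
        while [ "$i" -lt "$cores" ]; do
            printf '%s\n' "$node"
            i=$((i + 1))
        done
    done
)

# Example (assumed node names): 6 nodes x 16 cores gives a 96-line hostfile.
# make_hostfile 16 node1 node2 node3 node4 node5 node6 > hostfile
```

Open MPI hostfiles also accept the shorter `nodename slots=16` form, which expresses the same thing in one line per node.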
"The /dev/hfi1_0 device failed to appear after 15.0 seconds: Connection timed out"
"The OpenFabrics (openib) BTL failed to initialize while trying to create an internal queue"
"OMPI source: btl_openib.c:324
Function: ibv_create_srq()
Error: Function not implemented (errno=38)
Device: mlx4_0"
"At least one pair of MPI processes are unable to reach each other for MPI communications. This means that no Open MPI device has indicated that it can be used to communicate between these processes"
Are there any environment variables specific to OpenMPI that need to be set to define any Infiniband settings? I have already defined the usual MPI_BIN, LD_LIBRARY_PATH, PATH etc. I know that IntelMPI requires additional variables.
The Infiniband should come as part of the A9 HPC allocation; however, I'm not sure if it needs any specific setting up. When I run 'ifconfig -a' there are no Infiniband-specific entries (I expect to see ib0, ib1 etc.). I just have eth0, eth1 and lo.
I look forward to any advice that someone might be able to offer.
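The manual 'ifconfig -a' check described above can be scripted so that it runs on every compute node. This is only a sketch: the function name `has_ib_interface` is hypothetical, and it does nothing more than look for interface names of the form ib0, ib1 and so on. It does not test whether RDMA actually works.

```shell
#!/bin/sh
# Hypothetical helper: reads interface names on stdin, one per line,
# and succeeds if any of them looks like ib0, ib1, ...
has_ib_interface() {
    grep -q '^ib[0-9]'
}

# Example: on Linux, /sys/class/net holds one entry per network interface.
# if ls /sys/class/net | has_ib_interface; then
#     echo "ib interface present"
# else
#     echo "no ib interface found"
# fi
```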
Kind regards!
Answer 1:
As stated in "Repository containing the Articles on azure.microsoft.com Documentation Center" by daltskin (forked from the deleted/hidden Azure/azure-content-internal), on the page https://github.com/daltskin/azure-content/blob/master/articles/virtual-machines/virtual-machines-a8-a9-a10-a11-specs.md#access-to-the-rdma-network "About the A8, A9, A10, and A11 compute-intensive instances", section "Access from Linux A8 and A9 VMs":
At this time, Azure Linux RDMA is supported only with Intel MPI Library 5.
So, CentOS with OpenMPI 1.10.3 probably will not work with this virtualized RDMA by Azure, as OpenMPI 1.10.3 is not the "Intel MPI Library 5".
In the official docs, Azure also lists Intel MPI as RDMA-enabled (with the SLES 12 SP1 HPC VM): https://docs.microsoft.com/en-us/azure/virtual-machines/linux/classic/rdma-cluster "Set up a Linux RDMA cluster to run MPI applications" - 2017-3-14
Customize the VM
In a SLES 12 SP1 HPC VM, we recommend that you don't apply kernel updates, which can cause issues with the Linux RDMA drivers. Intel MPI: Complete the installation of Intel MPI on the SLES 12 SP1 HPC VM by running the following command:
sudo rpm -v -i --nodeps /opt/intelMPI/intel_mpi_packages/*.rpm
If you want to set up a cluster based on one of the CentOS-based HPC images in the Azure Marketplace instead of SLES 12 for HPC, follow the general steps in the preceding section. Note the following differences when you provision and configure the VM: Intel MPI is already installed on a VM provisioned from a CentOS-based HPC image.
So, there is a proprietary kernel driver for Azure virtual RDMA (Infiniband), preinstalled in the SLES 12 and CentOS VM images from Azure, and also a proprietary user-space driver (as Infiniband commonly uses kernel bypass and talks to the hardware from user space for data movement operations) that is available only in Intel MPI.
Try recompiling your application with the preinstalled Intel MPI and starting it with Intel MPI's mpirun/mpiexec. The instructions are on the same page, https://docs.microsoft.com/en-us/azure/virtual-machines/linux/classic/rdma-cluster:
Configure Intel MPI
To run MPI applications on Azure Linux RDMA, you need to configure certain environment variables specific to Intel MPI. Here is a sample Bash script to configure the variables needed to run an application. Change the path to mpivars.sh as needed for your installation of Intel MPI.
#!/bin/bash -x

# For a SLES 12 SP1 HPC cluster
source /opt/intel/impi/5.0.3.048/bin64/mpivars.sh

# For a CentOS-based HPC cluster
# source /opt/intel/impi/5.1.3.181/bin64/mpivars.sh

export I_MPI_FABRICS=shm:dapl
# THIS IS A MANDATORY ENVIRONMENT VARIABLE AND MUST BE SET BEFORE RUNNING ANY JOB
# Setting the variable to shm:dapl gives best performance for some applications
# If your application doesn’t take advantage of shared memory and MPI together, then set only dapl

export I_MPI_DAPL_PROVIDER=ofa-v2-ib0
# THIS IS A MANDATORY ENVIRONMENT VARIABLE AND MUST BE SET BEFORE RUNNING ANY JOB

export I_MPI_DYNAMIC_CONNECTION=0
# THIS IS A MANDATORY ENVIRONMENT VARIABLE AND MUST BE SET BEFORE RUNNING ANY JOB

# Command line to run the job
mpirun -n <number-of-cores> -ppn <core-per-node> -hostfile <hostfilename> /path <path to the application exe> <arguments specific to the application>

#end
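The `<number-of-cores>` and `<core-per-node>` placeholders in the mpirun line can be computed from a hostfile written in the question's format (one line per process, node names repeated). The sketch below only does the arithmetic; the function names are hypothetical, it assumes every node appears the same number of times, and it assumes the file contains at least one host. Whether Intel MPI accepts repeated host names in its own hostfile is a separate matter that this sketch does not address.

```shell
#!/bin/sh
# Hypothetical helpers for a hostfile with one node name per line.

# Total number of MPI processes: the count of non-empty lines.
count_ranks() {
    grep -c . "$1"
}

# Number of distinct nodes named in the hostfile.
count_hosts() {
    grep . "$1" | sort -u | wc -l | tr -d ' '
}

# Processes per node, assuming every node is listed equally often.
ranks_per_node() {
    echo $(( $(count_ranks "$1") / $(count_hosts "$1") ))
}

# Example (assumed file name "hostfile"):
# echo "-n $(count_ranks hostfile) -ppn $(ranks_per_node hostfile)"
```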
Source: https://stackoverflow.com/questions/43669464/how-to-azure-openmpi-with-infiniband-linux