intel-mpi

Why does mpirun behave as it does when used with Slurm?

别说谁变了你拦得住时间么 Submitted on 2019-12-25 01:45:42
Question: I am using Intel MPI and have encountered some confusing behavior when using mpirun in conjunction with Slurm. If I run (on a login node) mpirun -n 2 python -c "from mpi4py import MPI; print(MPI.COMM_WORLD.Get_rank())", I get the expected 0 and 1 printed out. If, however, I salloc --time=30 --nodes=1 and run the same mpirun from the interactive compute node, I get two 0s printed out instead of the expected 0 and 1. Then, if I change -n 2 to -n 3 (still on the compute node), I get a…
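A quick way to separate the launcher's behavior from mpi4py is to run the same test with a minimal C program. The sketch below is my own reproduction aid (the file name rank_check and the launch commands are illustrative, not from the question): if two launched processes both report rank 0 and a world size of 1, the launcher is starting two independent singleton jobs rather than one two-rank job.

/* rank_check.c - print this process's rank and the world size.
 * Build:  mpicc rank_check.c -o rank_check
 * Run:    mpirun -n 2 ./rank_check   (or: srun -n 2 ./rank_check)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    /* Two ranks of one job print "0 of 2" and "1 of 2";
     * two singleton jobs both print "0 of 1". */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}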

MPI message received in different communicator - erroneous program or MPI implementation bug?

时光总嘲笑我的痴心妄想 Submitted on 2019-12-24 06:35:36
Question: This is a follow-up to this previous question of mine, for which the conclusion was that the program was erroneous and therefore the expected behavior was undefined. What I'm trying to create here is a simple error-handling mechanism, for which I use the Irecv request for the empty message as an "abort handle", attaching it to my normal MPI_Wait call (and turning it into MPI_WaitAny), in order to allow me to unblock process 1 in case an error occurs on process 0 and it can no longer…
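The pattern described, a pre-posted receive for an empty "abort" message that is waited on alongside the real request, can be sketched in C as follows. This is only my own illustration under stated assumptions (the tag values and the error flag are hypothetical), not the asker's actual program:

#include <mpi.h>
#include <stdio.h>

#define TAG_DATA  1
#define TAG_ABORT 2   /* hypothetical tag reserved for the empty abort message */

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        int payload = 0;
        MPI_Request req[2];
        /* Pre-post the "abort handle": an empty receive that only completes
         * if rank 0 signals an error. */
        MPI_Irecv(NULL, 0, MPI_INT, 0, TAG_ABORT, MPI_COMM_WORLD, &req[0]);
        /* The normal data receive that would otherwise be a plain MPI_Wait. */
        MPI_Irecv(&payload, 1, MPI_INT, 0, TAG_DATA, MPI_COMM_WORLD, &req[1]);

        int idx;
        MPI_Waitany(2, req, &idx, MPI_STATUS_IGNORE);
        if (idx == 0) {
            fprintf(stderr, "rank 1: abort message received, cleaning up\n");
            MPI_Cancel(&req[1]);                    /* give up on the data receive */
            MPI_Wait(&req[1], MPI_STATUS_IGNORE);
        } else {
            printf("rank 1: got data %d\n", payload);
            MPI_Cancel(&req[0]);                    /* abort handle no longer needed */
            MPI_Wait(&req[0], MPI_STATUS_IGNORE);
        }
    } else if (rank == 0) {
        int error_occurred = 1;                     /* stand-in for real error detection */
        if (error_occurred)
            MPI_Send(NULL, 0, MPI_INT, 1, TAG_ABORT, MPI_COMM_WORLD);
        else {
            int value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, TAG_DATA, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}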

MPI message received in different communicator

自闭症网瘾萝莉.ら Submitted on 2019-12-24 03:25:24
Question: It was my understanding that MPI communicators restrict the scope of communication, such that messages sent in one communicator should never be received in a different one. However, the program inlined below appears to contradict this. I understand that the MPI_Send call returns before a matching receive is posted because of the internal buffering it does under the hood (as opposed to MPI_Ssend). I also understand that MPI_Comm_free doesn't destroy the communicator right away, but merely…
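For reference, the matching rule itself is easy to demonstrate. The sketch below is my own minimal example, not the program from the question: MPI_COMM_WORLD is duplicated, and a send on the duplicate is matched only by the receive posted on that same duplicate, even though source and tag are identical.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Comm dup;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* A duplicated communicator has the same group but a distinct
     * communication context, so its traffic cannot match COMM_WORLD's. */
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    if (rank == 0) {
        int a = 1, b = 2;
        MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Send(&b, 1, MPI_INT, 1, 0, dup);
    } else if (rank == 1) {
        int x = 0, y = 0;
        MPI_Request r[2];
        /* Same source and tag, but each receive matches only the send
         * posted on its own communicator. */
        MPI_Irecv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &r[0]);
        MPI_Irecv(&y, 1, MPI_INT, 0, 0, dup, &r[1]);
        MPI_Waitall(2, r, MPI_STATUSES_IGNORE);
        printf("world: %d, dup: %d\n", x, y);   /* prints "world: 1, dup: 2" */
    }

    MPI_Comm_free(&dup);
    MPI_Finalize();
    return 0;
}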

Intel MPI mpirun does not terminate using java Process.destroy()

别说谁变了你拦得住时间么 Submitted on 2019-12-11 05:29:10
Question: My Intel MPI version is impi/5.0.2.044/intel64, installed on a RHEL machine. I am using Java to invoke an MPI program with the following code: ProcessBuilder builder = new ProcessBuilder(); builder.command("mpirun ./myProgram"); builder.redirectError(Redirect.to(new File("stderr"))); builder.redirectOutput(Redirect.to(new File("stdout"))); Process p = null; try { p = builder.start(); } catch (IOException e) { e.printStackTrace(); } // Process has started here p.destroy(); try { // i = 143 int…
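The usual difficulty in situations like this is that mpirun spawns a whole tree of processes, while destroying only the immediate child from Java leaves the rest of that tree running. The C sketch below is my own illustration of the underlying POSIX mechanism that does take the whole tree down, namely placing the launcher in its own process group and signalling the group; it is not Intel MPI or Java API behavior, and the mpirun ./myProgram command line is just a placeholder taken from the question.

/* pg_kill.c - launch a command in its own process group, then terminate
 * the whole group (the launcher and everything it spawned) after a delay. */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t child = fork();
    if (child == 0) {
        /* Child: become leader of a new process group, then exec the
         * launcher; processes forked by mpirun inherit this group. */
        setpgid(0, 0);
        execlp("mpirun", "mpirun", "./myProgram", (char *)NULL);
        _exit(127);                     /* exec failed */
    }
    setpgid(child, child);              /* set it in the parent too, to avoid a race */

    sleep(5);                           /* stand-in for "decide to cancel the job" */

    /* A negative PID signals every member of the process group. */
    kill(-child, SIGTERM);
    waitpid(child, NULL, 0);
    return 0;
}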

Prevent MPI from busy looping

人走茶凉 Submitted on 2019-12-04 00:59:23
Question: I have an MPI program which oversubscribes/overcommits its processors; that is, there are many more processes than processors. Only a few of these processes are active at a given time, though, so there shouldn't be contention for computational resources. But, much like the flock of seagulls in Finding Nemo, when those processes are waiting for communication they all busy-loop, asking "Mine? Mine? Mine?" I am using both Intel MPI and Open MPI (on different machines). How can I…
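One portable workaround, independent of any Intel MPI or Open MPI tuning knobs, is to avoid sitting in a blocking receive at all and instead poll with MPI_Iprobe, sleeping between polls so the waiting rank yields its core. A minimal sketch, with the tag, source, and 1 ms sleep interval chosen arbitrarily:

#include <mpi.h>
#include <stdio.h>
#include <time.h>

/* Wait "politely": poll for an incoming message and sleep between polls
 * instead of spinning inside a blocking MPI_Recv. */
static void polite_recv(int *buf, int count, int source, int tag, MPI_Comm comm)
{
    int flag = 0;
    struct timespec pause = {0, 1000000};   /* 1 ms between polls (arbitrary) */
    while (!flag) {
        MPI_Iprobe(source, tag, comm, &flag, MPI_STATUS_IGNORE);
        if (!flag)
            nanosleep(&pause, NULL);        /* give the core back to other processes */
    }
    MPI_Recv(buf, count, MPI_INT, source, tag, comm, MPI_STATUS_IGNORE);
}

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        polite_recv(&value, 1, 0, 0, MPI_COMM_WORLD);
        printf("rank 1 received %d without spinning\n", value);
    }
    MPI_Finalize();
    return 0;
}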

MPI_SEND takes a huge part of virtual memory

≡放荡痞女 Submitted on 2019-11-30 13:54:46
While debugging my program on a large number of cores, I ran into a very strange "insufficient virtual memory" error. My investigation led me to the piece of code where the master sends small messages to each slave. I then wrote a small program in which one master simply sends 10 integers with MPI_SEND and all slaves receive them with MPI_RECV. Comparing /proc/self/status before and after MPI_SEND showed that the difference in memory size is huge! The most interesting thing (which crashes my program) is that this memory is not deallocated after MPI_Send and still takes up a huge amount of space. Any ideas? System…
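The test described can be reconstructed roughly as below; this is my own sketch of such a reproducer, not the asker's code. It prints the VmSize line from /proc/self/status on the master before and after the sends, which is where the jump would show up; a likely explanation is that the first send to each peer establishes a connection and allocates per-connection communication buffers, and that memory stays reserved for later messages rather than being freed.

/* vm_probe.c - send 10 ints from rank 0 to every other rank and report
 * rank 0's virtual memory size before and after the sends. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

static void print_vmsize(const char *label)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) return;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "VmSize:", 7) == 0) {
            printf("%s %s", label, line);   /* e.g. "before: VmSize:  123456 kB" */
            break;
        }
    }
    fclose(f);
}

int main(int argc, char **argv)
{
    int rank, size, buf[10] = {0};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        print_vmsize("before:");
        for (int dest = 1; dest < size; ++dest)
            MPI_Send(buf, 10, MPI_INT, dest, 0, MPI_COMM_WORLD);
        print_vmsize("after: ");   /* any jump here reflects per-connection buffers */
    } else {
        MPI_Recv(buf, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}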