infiniband

RDMA program randomly hangs

微笑、不失礼 submitted on 2019-12-04 16:59:48
Has anyone out there done RDMA programming using the RDMA_CM library? I'm having a hard time finding even simple examples to study. There's an rdma_client & rdma_server example in librdmacm, but it doesn't run in a loop (rping does loop, but it's written using IB verbs directly instead of rdma_cm functions). I've put together a trivial ping-pong program, but it locks up anywhere between 1 and 100 bounces. I found that adding a sleep inside the client makes it run longer before hanging, which points to a race condition. The client gets stuck in rdma_get_send_comp() and the server gets stuck in rdma
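Below is a minimal sketch of how such a ping-pong loop can be structured with librdmacm's wrapped verbs. It is not the poster's code: `id`, `mr`, `buf` and the connection setup (rdma_create_ep + rdma_connect, rdma_reg_msgs) are assumed to exist already, and error handling is trimmed. The point it illustrates is posting the receive for the reply before issuing the send, since a send arriving at a peer with no receive posted is a classic cause of intermittent hangs like the one described.

```c
/* Minimal sketch (not the poster's code): client-side ping-pong loop using
 * librdmacm's wrapped verbs from <rdma/rdma_verbs.h>. Assumes `id` is a
 * connected rdma_cm_id and `mr` was registered over `buf` with
 * rdma_reg_msgs(). */
#include <rdma/rdma_cma.h>
#include <rdma/rdma_verbs.h>
#include <stdio.h>

#define MSG_SIZE 64

static int ping_pong_loop(struct rdma_cm_id *id, struct ibv_mr *mr,
                          char *buf, int iterations)
{
    struct ibv_wc wc;

    for (int i = 0; i < iterations; i++) {
        /* Post the receive for the reply *before* sending the ping, so the
         * peer's response always finds a receive buffer waiting. Posting it
         * after rdma_post_send() is a classic source of intermittent hangs. */
        if (rdma_post_recv(id, NULL, buf, MSG_SIZE, mr))
            return -1;

        if (rdma_post_send(id, NULL, buf, MSG_SIZE, mr, IBV_SEND_SIGNALED))
            return -1;

        /* Reap the send completion, then wait for the reply. */
        if (rdma_get_send_comp(id, &wc) <= 0 || wc.status != IBV_WC_SUCCESS)
            return -1;
        if (rdma_get_recv_comp(id, &wc) <= 0 || wc.status != IBV_WC_SUCCESS)
            return -1;

        printf("bounce %d done\n", i);
    }
    return 0;
}
```

If the hang persists, checking wc.status on every completion usually shows whether a work request actually failed rather than merely stalled.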

Infiniband addressing - host names to IB addresses without IPoIB

随声附和 submitted on 2019-12-04 06:43:42
I've just started getting familiar with InfiniBand and I want to understand the methods you can use to address InfiniBand nodes. Based on the example code from RDMA read and write with IB verbs, I can address individual nodes by IP or hostname using IPoIB. Another way is to use a port GUID address directly, but it looks like you'd have to look those up, and it's more similar to Ethernet MAC addressing. Then there is something called an LID address, a 16-bit local address assigned by the subnet manager. How do I determine and use an LID address at runtime? For example, I run ibaddr
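For the runtime part, here is a hedged sketch of how the local port's LID (and the device GUID) can be read programmatically with libibverbs; the device index and port number below are assumptions and would need to match the actual HCA and port in use.

```c
/* Sketch: query the local HCA's LID and GUID at runtime with libibverbs.
 * Assumes the first device and port 1; adjust as needed. */
#include <infiniband/verbs.h>
#include <stdio.h>

int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_port_attr port_attr;

    if (ibv_query_port(ctx, 1, &port_attr)) {   /* port 1 assumed */
        fprintf(stderr, "ibv_query_port failed\n");
        return 1;
    }

    /* The LID is only meaningful once the subnet manager has configured
     * the port (port state ACTIVE). */
    printf("port state: %d, LID: 0x%x\n", port_attr.state, port_attr.lid);
    /* GUID is returned in network byte order; printed raw here. */
    printf("node GUID: 0x%016llx\n",
           (unsigned long long)ibv_get_device_guid(devs[0]));

    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```

Built with something like `gcc query_lid.c -libverbs`, this prints roughly the same LID and GUID that ibaddr/ibstat report.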

Java Sockets on RDMA (JSOR) vs jVerbs performance in Infiniband

本小妞迷上赌 submitted on 2019-11-30 16:51:46
I have a basic understanding of both JSOR and jVerbs. Both work around the limitations of JNI and use a fast path to reduce latency. Both use the user-space Verbs RDMA interface to avoid context switches and provide fast-path access. Both also have options for zero-copy transfer. The difference is that JSOR still uses the Java Socket interface, whereas jVerbs provides a new interface. jVerbs also has something called Stateful Verbs Calls to avoid repeated serialization of RDMA requests, which they say reduces latency. jVerbs provides a more native interface that applications can use directly. I read the jVerbs

MPI_SEND takes huge part of virtual memory

≡放荡痞女 submitted on 2019-11-30 13:54:46
While debugging my program on large numbers of cores, I ran into a very strange insufficient-virtual-memory error. My investigation led to the piece of code where the master sends small messages to each slave. Then I wrote a small program where one master simply sends 10 integers with MPI_SEND and all slaves receive them with MPI_RECV. Comparing /proc/self/status before and after MPI_SEND showed that the difference in memory size is huge! The most interesting thing (which crashes my program) is that this memory is not deallocated after MPI_Send and still takes up huge space. Any ideas? System
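A minimal sketch of the experiment being described (the names and structure are mine, not the poster's): rank 0 sends 10 integers to every other rank and prints VmSize from /proc/self/status before and after. On InfiniBand clusters the first send to each peer typically triggers connection setup and per-peer eager/registered buffers inside the MPI library, which is usually what shows up as the large, persistent jump in virtual memory rather than the 40-byte payload itself.

```c
/* Sketch of the described experiment: rank 0 sends 10 integers to every
 * other rank; VmSize is read from /proc/self/status before and after. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

static void print_vmsize(const char *label)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) return;
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "VmSize:", 7) == 0)
            printf("%s %s", label, line);
    fclose(f);
}

int main(int argc, char **argv)
{
    int rank, size, data[10] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        print_vmsize("before sends:");
        for (int dest = 1; dest < size; dest++)
            MPI_Send(data, 10, MPI_INT, dest, 0, MPI_COMM_WORLD);
        print_vmsize("after sends: ");
    } else {
        MPI_Recv(data, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```

Whether and how much those internal buffers can be shrunk is implementation-specific, so no particular tuning knobs are assumed here.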

How to use GPUDirect RDMA with Infiniband

我只是一个虾纸丫 submitted on 2019-11-29 11:35:01
I have two machines. There are multiple Tesla cards on each machine, and there is also an InfiniBand card on each machine. I want to communicate between GPU cards on different machines through InfiniBand; just point-to-point unicast would be fine. I definitely want to use GPUDirect RDMA so I can spare myself extra copy operations. I am aware that there is now a driver available from Mellanox for its InfiniBand cards, but it doesn't offer a detailed development guide. I am also aware that OpenMPI has support for the feature I am asking about, but OpenMPI is too heavyweight for this simple task and it
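A hedged sketch of the step that makes plain verbs code "GPUDirect": with Mellanox OFED and the nv_peer_mem / nvidia-peermem kernel module loaded, a device pointer returned by cudaMalloc() can be registered with ibv_reg_mr() much like ordinary host memory, and the resulting MR can then be used in normal send/receive or RDMA read/write work requests so the HCA moves data to and from GPU memory directly. The protection domain `pd` and the buffer size below are assumptions; queue-pair setup and connection exchange are not shown.

```c
/* Sketch (not a complete program): the key GPUDirect RDMA step with plain
 * verbs. Requires Mellanox OFED plus the nv_peer_mem / nvidia-peermem
 * kernel module; `pd` is an ibv_pd obtained elsewhere via ibv_alloc_pd(). */
#include <infiniband/verbs.h>
#include <cuda_runtime.h>
#include <stdio.h>

#define BUF_SIZE (1 << 20)

struct ibv_mr *register_gpu_buffer(struct ibv_pd *pd, void **gpu_buf_out)
{
    void *gpu_buf = NULL;

    if (cudaMalloc(&gpu_buf, BUF_SIZE) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return NULL;
    }

    /* Registering the device pointer directly is what avoids the extra
     * host staging copy: the HCA DMAs to/from GPU memory. If the
     * peer-memory module is missing, this registration fails. */
    struct ibv_mr *mr = ibv_reg_mr(pd, gpu_buf, BUF_SIZE,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr) {
        fprintf(stderr, "ibv_reg_mr on GPU memory failed "
                        "(is nv_peer_mem loaded?)\n");
        cudaFree(gpu_buf);
        return NULL;
    }

    *gpu_buf_out = gpu_buf;
    return mr;
}
```

From there the MR's lkey/rkey go into ordinary work requests, so the rest of the program looks like any other verbs point-to-point example.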