RDMA program randomly hangs
Has anyone here done RDMA programming with the RDMA_CM library? I'm having a hard time finding even simple examples to study. librdmacm ships an rdma_client and rdma_server example, but it doesn't run in a loop. rping does loop, but it's written using IB verbs directly instead of the rdma_cm functions. I've put together a trivial ping-pong program, but it locks up after anywhere from 1 to 100 bounces. Adding a sleep inside the client makes it run longer before hanging, which suggests a race condition. The client gets stuck in rdma_get_send_comp() and the server gets stuck in rdma