I have two machines, each with multiple Tesla cards and an InfiniBand card. I want to communicate between GPU cards on different machines over InfiniBand; simple point-to-point unicast would be fine. I definitely want to use GPUDirect RDMA so I can avoid extra copy operations.
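To make the question concrete, here is roughly what I think the core GPU-side step looks like: allocate a buffer with cudaMalloc and then register that device pointer with ibv_reg_mr so the HCA can DMA straight to GPU memory. This is only a sketch based on my reading, assuming the GPUDirect RDMA kernel module (nv_peer_mem) is loaded and is what allows ibv_reg_mr to accept a CUDA device pointer; the device index, buffer size, and access flags are just placeholders. Am I on the right track?

```c
/* Rough sketch: allocate a buffer on the GPU and register that device
 * pointer with libibverbs so the HCA can DMA to/from it directly.
 * Assumes the GPUDirect RDMA kernel module (nv_peer_mem) is loaded;
 * device index and buffer size are placeholders. */
#include <stdio.h>
#include <infiniband/verbs.h>
#include <cuda_runtime.h>

int main(void)
{
    const size_t buf_size = 1 << 20;  /* 1 MiB, arbitrary */

    /* Open the first InfiniBand device and allocate a protection domain. */
    int num_devices = 0;
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list || num_devices == 0) {
        fprintf(stderr, "no InfiniBand devices found\n");
        return 1;
    }
    struct ibv_context *ctx = ibv_open_device(dev_list[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Allocate the buffer directly in GPU memory. */
    void *gpu_buf = NULL;
    if (cudaMalloc(&gpu_buf, buf_size) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    /* Register the GPU pointer with verbs. With GPUDirect RDMA this is
     * supposed to pin the GPU pages so the HCA can read/write them
     * without staging through host memory. */
    struct ibv_mr *mr = ibv_reg_mr(pd, gpu_buf, buf_size,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr) {
        perror("ibv_reg_mr on GPU memory");
        return 1;
    }
    printf("registered GPU buffer, rkey=0x%x\n", mr->rkey);

    /* ...queue pair setup, address/rkey exchange, and RDMA writes
     * would go here -- this is the part I don't know how to do... */

    ibv_dereg_mr(mr);
    cudaFree(gpu_buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(dev_list);
    return 0;
}
```

The part I really don't know how to do is the rest: setting up the queue pair, exchanging the address and rkey with the other machine, and posting the RDMA write against the remote GPU buffer.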
I am aware that Mellanox now provides a driver for its InfiniBand cards, but it does not come with a detailed development guide. I am also aware that OpenMPI supports the feature I am asking about, but OpenMPI is too heavyweight for this simple task, and it does not support multiple GPUs in a single process.
I wonder if I could get any help with using the driver directly to do the communication. A code sample, a tutorial, anything would be good. Also, I would appreciate it if anyone could point me to the code that handles this in OpenMPI.