Java Sockets on RDMA (JSOR) vs jVerbs performance in InfiniBand

暗喜 2021-01-03 12:00

I have a basic understanding of both JSOR and jVerbs.

Both handle limitations of JNI and use fast path to reduce latency. Both of them use user Verbs RDMA interface fo

2 Answers
  • 2021-01-03 12:09

    It is a bit hard to compare the performance of jVerbs and JSOR. The first is a message-oriented API, while the second hides RDMA behind the stream-based API of Java sockets.

    Here are some stats. My tests used a pair of old ConnectX-2 cards and Dell PowerEdge 2970 servers, running CentOS 7.1 and Mellanox OFED 3.1.

    I was only interested in latency.

    jVerbs

    The test is a variation of the RPing sample (I can post it on GitHub if anybody is interested). It measured the latency of 5,000,000 cycles of the following sequence of calls over a reliable connection. The message size was 256 bytes.

    PostSendMethod.execute()
    PollCQMethod.execute()
    CompletionChannel.ackCQEvents()
    

    Results (microseconds):

    • Median: 10.885
    • 99th percentile: 11.663
    • 99.9th percentile: 17.471
    • 99.99th percentile: 27.791
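
For readers who want to produce the same kind of summary from their own runs, here is a minimal sketch of how a median and tail percentiles can be derived from raw per-iteration timings. It is not the code used for the numbers above. The class name `LatencyStats`, the nearest-rank percentile definition, and the empty timed region are all assumptions made for illustration.

```java
import java.util.Arrays;

public class LatencyStats {
    /**
     * Nearest-rank percentile of an ascending-sorted array of nanosecond
     * samples, returned in microseconds. (Hypothetical helper; other
     * percentile definitions interpolate between neighbours.)
     */
    public static double percentile(long[] sortedNanos, double p) {
        if (sortedNanos.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(p / 100.0 * sortedNanos.length);
        int index = Math.max(0, Math.min(sortedNanos.length - 1, rank - 1));
        return sortedNanos[index] / 1000.0; // ns -> us
    }

    public static void main(String[] args) {
        long[] samples = new long[100_000];
        for (int i = 0; i < samples.length; i++) {
            long start = System.nanoTime();
            // The operation under test (e.g. one send/poll/ack cycle) would go here.
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        for (double p : new double[] {50.0, 99.0, 99.9, 99.99}) {
            System.out.printf("p%s: %.3f us%n", p, percentile(samples, p));
        }
    }
}
```

Keeping every sample and sorting once is simple but needs memory proportional to the iteration count; for very long runs a histogram may be a better fit.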

    JSOR

    A similar test over a JSOR socket. The test was a textbook client/server socket sample. The message size was 256 bytes as well.

    Results (microseconds):

    • Median: 43
    • 99th percentile: 55
    • 99.9th percentile: 61
    • 99.99th percentile: 217
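
For context, a textbook socket ping-pong of the kind described for the JSOR test might look roughly like the sketch below. It is not the original test program. The class name `SocketPingPong`, the loopback setup, the iteration count, and the choice to time full round trips are assumptions. Since JSOR sits behind the standard socket API, code of this shape uses only `java.net` classes; as written it runs over plain TCP on the loopback interface.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

public class SocketPingPong {
    static final int MESSAGE_SIZE = 256; // same message size as in the test above

    /** Accepts one connection and echoes a fixed number of fixed-size messages. */
    public static void serve(ServerSocket server, int iterations) throws IOException {
        try (Socket s = server.accept()) {
            s.setTcpNoDelay(true);
            DataInputStream in = new DataInputStream(s.getInputStream());
            OutputStream out = s.getOutputStream();
            byte[] buf = new byte[MESSAGE_SIZE];
            for (int i = 0; i < iterations; i++) {
                in.readFully(buf); // block until a whole message has arrived
                out.write(buf);
                out.flush();
            }
        }
    }

    /** Sends messages one at a time and returns sorted round-trip times in ns. */
    public static long[] measure(String host, int port, int iterations) throws IOException {
        long[] rtt = new long[iterations];
        try (Socket s = new Socket(host, port)) {
            s.setTcpNoDelay(true);
            DataInputStream in = new DataInputStream(s.getInputStream());
            OutputStream out = s.getOutputStream();
            byte[] buf = new byte[MESSAGE_SIZE];
            for (int i = 0; i < iterations; i++) {
                long start = System.nanoTime();
                out.write(buf);
                out.flush();
                in.readFully(buf); // wait for the echo
                rtt[i] = System.nanoTime() - start;
            }
        }
        Arrays.sort(rtt);
        return rtt;
    }

    public static void main(String[] args) throws Exception {
        int iterations = 10_000; // assumed; far fewer than a real benchmark run
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread t = new Thread(() -> {
                try {
                    serve(server, iterations);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            t.start();
            long[] rtt = measure(loopback.getHostAddress(), server.getLocalPort(), iterations);
            t.join();
            System.out.printf("median round trip: %.1f us%n", rtt[rtt.length / 2] / 1000.0);
        }
    }
}
```

To measure across two machines, the server and client halves would run in separate JVMs, with the client pointed at the server's address instead of loopback.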

    Both results are far from the native OFED latency test: on the same hardware and OS, the standard ib_send_lat benchmark reported a median of 2.77 microseconds and a maximum of 23.25 microseconds.

  • 2021-01-03 12:28

    Here are some numbers using DiSNI -- the newly open sourced successor of IBM's jVerbs -- and DaRPC, the low-latency RPC library using DiSNI.

    • DiSNI RDMA read latencies for 64 bytes are below 2 microseconds
    • DaRPC RDMA send/recv latencies for 64 bytes (request and response) are around 5 microseconds
    • The differences between Java/DiSNI and native C RDMA are negligible for one-sided operations

    These benchmarks were run on two hosts connected using a Mellanox ConnectX-3 network interface.

    Here are the commands to execute the benchmarks:

    (1) Read benchmark

    Server:

    java -cp disni-1.0-jar-with-dependencies.jar:disni-1.0-tests.jar com.ibm.disni.examples.benchmarks.AppLauncher -t java-rdma-server -a <address> -o read -s 64 -k 100000 -p
    

    Client:

    java -cp disni-1.0-jar-with-dependencies.jar:disni-1.0-tests.jar com.ibm.disni.examples.benchmarks.AppLauncher -t java-rdma-client -a <address> -o read -s 64 -k 100000 -p
    

    (2) Send/recv benchmark

    Server:

    java -cp darpc-1.0-jar-with-dependencies.jar:darpc-1.0-tests.jar com.ibm.darpc.examples.server.DaRPCServer -a <address> -d -l 64 -r 64 
    

    Client:

    java -cp darpc-1.0-jar-with-dependencies.jar:darpc-1.0-tests.jar com.ibm.darpc.examples.client.DaRPCClient -a <address> -k 1000000 -l 64 -r 64 -b 1
    
