Linux Loopback performance with TCP_NODELAY enabled

孤独总比滥情好 2021-01-31 21:04

I recently stumbled on an interesting TCP performance issue while running some performance tests that compared network performance versus loopback performance. In my case the n

3 Answers
  • 2021-01-31 21:17

    1) In what cases, and why, would communicating over loopback be slower than over the network?

    Over loopback, the same machine does the packet setup and TCP checksum calculation for both the transmit (tx) and receive (rx) sides, so it does roughly twice as much processing. With two machines, the tx and rx work is split between them. This can hurt loopback performance.

    2) When sending as fast as possible, why does toggling TCP_NODELAY have so much more of an impact on maximum throughput over loopback than over the network?

    I'm not sure how you reached this conclusion, but loopback and a physical network are implemented very differently, and if you push them to the limit you will hit different issues. Loopback interfaces (as mentioned in the answer to 1) put the tx and rx processing overhead on the same machine. NICs, on the other hand, have a number of limits on how many outstanding packets they can hold in their circular buffers and so on, which cause completely different bottlenecks (this varies greatly from chip to chip, and even with the switch that sits between the machines).
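One way to check the TCP_NODELAY effect on your own machine is a small loopback benchmark. The following is a minimal sketch, not the original poster's test: the function name `measure_loopback`, the message size, and the message count are all assumptions chosen for illustration.

```python
import socket
import threading
import time


def measure_loopback(nodelay: bool, msg_size: int = 64, count: int = 20000) -> float:
    """Send `count` messages of `msg_size` bytes over loopback; return MB/s."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    total = msg_size * count

    def drain() -> None:
        # Receiver side: read until all expected bytes have arrived (or EOF).
        conn, _ = server.accept()
        with conn:
            remaining = total
            while remaining > 0:
                chunk = conn.recv(65536)
                if not chunk:
                    break
                remaining -= len(chunk)

    reader = threading.Thread(target=drain)
    reader.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 disables Nagle's algorithm; 0 leaves it enabled.
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if nodelay else 0)
    client.connect(server.getsockname())

    payload = b"x" * msg_size
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(payload)  # one small write per message
    client.close()
    reader.join()
    elapsed = time.perf_counter() - start
    server.close()
    return total / elapsed / 1e6
```

Calling `measure_loopback(True)` and `measure_loopback(False)` a few times each gives a rough comparison. The numbers depend heavily on the kernel, CPU, and message size, so treat them as a starting point for investigation rather than a result.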

    3) How can we detect and analyze TCP congestion control as a potential explanation for the poor performance?

    Congestion control mainly cuts throughput when there is packet loss. Are you seeing packet loss? If not, you are probably limited by the TCP window size relative to the network latency.
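Both checks suggested here can be sketched in a few lines. On Linux, `/proc/net/snmp` exposes system-wide TCP counters, including `OutSegs` and `RetransSegs`, as a line of names followed by a line of values. The helper names below (`tcp_counters`, `retrans_ratio`, `window_limited_throughput`) are hypothetical, and the counters cover every connection on the host, so run the test on an otherwise quiet machine.

```python
from typing import Dict


def tcp_counters(snmp_text: str) -> Dict[str, int]:
    """Parse the two 'Tcp:' lines (names, then values) of /proc/net/snmp text."""
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    names, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(names, values)}


def retrans_ratio(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Fraction of segments retransmitted between two counter snapshots."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent else 0.0


def window_limited_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound in bytes/s when at most one window is in flight per RTT."""
    return window_bytes / rtt_seconds


def read_snapshot(path: str = "/proc/net/snmp") -> Dict[str, int]:
    """Take a snapshot of the live counters (Linux only)."""
    with open(path) as f:
        return tcp_counters(f.read())
```

Take one snapshot with `read_snapshot()` before the test and one after. A ratio near zero suggests loss is not the cause. For the window bound, a 64 KiB window with a 1 ms round-trip time caps out at about 65.5 MB/s (roughly 524 Mbit/s), however fast the link is.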

    4) Does anyone have any other theories as to the reason for this phenomenon? If yes, any method to prove the theory?

    I don't understand the phenomenon you refer to here. All I see in your table is that you have some sockets with a large send buffer - this can be perfectly legitimate. On a fast machine, your application will certainly be capable of generating more data than the network can pump out, so I'm not sure what you're classifying as a problem here.

    One final note: small messages create a much bigger performance hit on your network for various reasons, such as:

    • there is a fixed per-packet overhead (for the MAC + IP + TCP headers), and the smaller the payload is, the larger the share of each packet that overhead takes up.
    • many NIC limitations are relative to the number of outstanding packets, which means you'll hit NIC bottlenecks with much less data when using smaller packets.
    • the network itself has per-packet overhead, so the maximum amount of data you can pump through the network again depends on the size of the packets.
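The first point can be put into numbers. This sketch assumes the minimum header sizes (Ethernet II without a VLAN tag, IPv4 without options, TCP without options) and ignores the Ethernet preamble, frame check sequence, and inter-frame gap, so real overhead is somewhat higher. The name `payload_efficiency` is made up for this example.

```python
ETH_HEADER = 14   # Ethernet II header, assuming no VLAN tag
IPV4_HEADER = 20  # assuming no IP options
TCP_HEADER = 20   # assuming no TCP options (timestamps would add more)


def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of each frame that is application payload."""
    headers = ETH_HEADER + IPV4_HEADER + TCP_HEADER
    return payload_bytes / (payload_bytes + headers)


for size in (1, 64, 512, 1460):
    print(f"{size:5d} bytes payload -> {payload_efficiency(size):.1%} efficient")
```

Under these assumptions a 1-byte payload is under 2% efficient, while a 1460-byte payload is above 96%.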
  • 2021-01-31 21:25

    This is the same issue I faced, too. When transferring 2 MB of data between two components running on the same RHEL6 machine, it took 7 seconds to complete. For larger data sizes the time is not acceptable: it took 1 minute to transfer 10 MB of data.

    Then I tried with TCP_NODELAY disabled. That solved the problem.

    This does not happen when the two components are on two different machines.

  • 2021-01-31 21:28

    1 or 2) I'm not sure why you're bothering to use loopback at all; I personally don't know how closely it mimics a real interface or how valid the comparison is. I know that Microsoft disables Nagle's algorithm for the loopback interface (if you care). Take a look at this link, there's a discussion about this.

    3) I would closely look at the first few packets in both cases and see if you're getting a severe delay in the first five packets. See here
