Increasing the maximum number of TCP/IP connections in Linux

终归单人心 2020-11-22 15:59

I am programming a server and it seems like my number of connections is being limited since my bandwidth isn't being saturated even when I've set the number of connections

4 answers
  • 2020-11-22 16:08

    To improve upon the answer given by derobert,

    You can determine your OS connection-tracking limit by reading nf_conntrack_max (available when the netfilter conntrack module is loaded).

    For example: cat /proc/sys/net/netfilter/nf_conntrack_max
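
    A minimal Python sketch of the same check, assuming a Linux host with the conntrack module loaded. It compares the limit with the current entry count, which the kernel exposes as nf_conntrack_count in the same directory. The helper names read_int and conntrack_usage are made up for this illustration.

```python
from pathlib import Path
from typing import Optional

CONNTRACK_DIR = Path("/proc/sys/net/netfilter")


def read_int(path: Path) -> Optional[int]:
    """Return the integer stored in a /proc file, or None if it is unreadable."""
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def conntrack_usage(count: int, limit: int) -> float:
    """Fraction of the connection-tracking table currently in use."""
    return count / limit if limit > 0 else 0.0


if __name__ == "__main__":
    count = read_int(CONNTRACK_DIR / "nf_conntrack_count")
    limit = read_int(CONNTRACK_DIR / "nf_conntrack_max")
    if count is None or limit is None:
        print("conntrack values not available (module not loaded?)")
    else:
        usage = conntrack_usage(count, limit)
        print(f"{count}/{limit} tracked connections ({usage:.1%})")
```

    A usage close to 100% would suggest the conntrack table, rather than the application, is the limit being hit.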

    You can use the following script to count the number of TCP connections to a given range of TCP ports (1-65535 by default).

    This will confirm whether or not you are maxing out your OS connection limit.

    Here's the script.

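    #!/bin/bash
    # Count TCP connections in TIME_WAIT or ESTABLISHED whose local port
    # falls within [start, end]; connections on 127.0.0.1 are skipped.
    OS=$(uname)
    
    case "$OS" in
        'SunOS')
                AWK=/usr/bin/nawk
                ;;
        'Linux')
                AWK=/bin/awk
                ;;
        'AIX')
                AWK=/usr/bin/awk
                ;;
        *)
                AWK=awk    # fall back to whatever awk is on the PATH
                ;;
    esac
    
    netstat -an | $AWK -v start=1 -v end=65535 ' $NF ~ /TIME_WAIT|ESTABLISHED/ && $4 !~ /127\.0\.0\.1/ {
        # The local address is column 1 on SunOS-style output, column 4 on Linux
        if ($1 ~ /\./)
                {sip=$1}
        else {sip=$4}
    
        # The port is the last piece whether the separator is ":" or "."
        n = split( sip, a, /:|\./ )
    
        if ( a[n] >= start && a[n] <= end ) {
                ++connections;
                }
        }
        END {print connections + 0}'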
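
    On Linux the same count can also be taken without netstat by reading /proc/net/tcp. The sketch below is a different technique from the script above, not a port of it: it covers IPv4 on Linux only (IPv6 sockets are listed in /proc/net/tcp6), and the function name count_connections is hypothetical. In that file, state codes 01 and 06 stand for ESTABLISHED and TIME_WAIT, and addresses and ports are printed in hexadecimal.

```python
from collections import Counter

ESTABLISHED, TIME_WAIT = "01", "06"   # state codes used in /proc/net/tcp
LOOPBACK_HEX = "0100007F"             # 127.0.0.1 as printed on little-endian hosts


def count_connections(proc_net_tcp: str, start: int = 1, end: int = 65535) -> Counter:
    """Count ESTABLISHED/TIME_WAIT sockets whose local port is in [start, end]."""
    counts = Counter()
    for line in proc_net_tcp.splitlines()[1:]:      # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        ip_hex, port_hex = fields[1].split(":")     # local address column
        state = fields[3]
        if state not in (ESTABLISHED, TIME_WAIT) or ip_hex == LOOPBACK_HEX:
            continue
        if start <= int(port_hex, 16) <= end:
            counts[state] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            totals = count_connections(f.read())
        print("ESTABLISHED:", totals[ESTABLISHED], "TIME_WAIT:", totals[TIME_WAIT])
    except OSError:
        print("/proc/net/tcp is not available on this system")
```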
    
  • 2020-11-22 16:09

    The maximum number of connections is affected by certain limits on both the client and server sides, albeit a little differently.

    On the client side: increase the ephemeral port range and decrease tcp_fin_timeout.

    To find out the default values:

    sysctl net.ipv4.ip_local_port_range
    sysctl net.ipv4.tcp_fin_timeout
    

    The ephemeral port range defines the maximum number of outbound sockets a host can create from a particular IP address. The fin_timeout defines the minimum time these sockets will stay in TIME_WAIT state (unusable after being used once). Usual system defaults are:

    • net.ipv4.ip_local_port_range = 32768 61000
    • net.ipv4.tcp_fin_timeout = 60

    This basically means your system cannot consistently guarantee more than (61000 - 32768) / 60 = 470 sockets per second. If you are not happy with that, you could begin by increasing the port range. Setting the range to 15000 61000 is pretty common these days. You could further increase availability by decreasing the fin_timeout. If you do both, you should more readily see over 1500 outbound connections per second.
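
    The arithmetic above can be sketched in Python as follows. It takes the answer's model at face value: every ephemeral port is unusable for a fixed hold time after use. The kernel documentation describes tcp_fin_timeout as the FIN_WAIT_2 timeout for orphaned connections, so treat the result as a rough estimate rather than a guarantee. The function name sustainable_rate is made up for this illustration.

```python
def sustainable_rate(port_low: int, port_high: int, hold_seconds: int) -> int:
    """Ports in the range divided by how long each one stays unusable."""
    return (port_high - port_low) // hold_seconds


# Defaults quoted in the answer: range 32768-61000, 60 second hold time
print(sustainable_rate(32768, 61000, 60))   # 470

# Suggested settings: range 15000-61000, 30 second hold time
print(sustainable_rate(15000, 61000, 30))   # 1533
```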

    To change the values:

    sysctl net.ipv4.ip_local_port_range="15000 61000"
    sysctl net.ipv4.tcp_fin_timeout=30
    

    The above should not be interpreted as the factors that determine how many outbound connections per second the system can make. Rather, these factors affect the system's ability to handle concurrent connections sustainably over long periods of "activity."

    Default Sysctl values on a typical Linux box for tcp_tw_recycle & tcp_tw_reuse would be

    net.ipv4.tcp_tw_recycle=0
    net.ipv4.tcp_tw_reuse=0
    

    These do not allow a connection from a "used" socket (in wait state) and force the sockets to last the complete time_wait cycle. I recommend setting:

    sysctl net.ipv4.tcp_tw_recycle=1
    sysctl net.ipv4.tcp_tw_reuse=1 
    

    This allows fast cycling of sockets in time_wait state and re-using them. But before you make this change, make sure that it does not conflict with the protocols used by the application that needs these sockets. Make sure to read the post "Coping with the TCP TIME-WAIT" from Vincent Bernat to understand the implications. The net.ipv4.tcp_tw_recycle option is quite problematic for public-facing servers as it won’t handle connections from two different computers behind the same NAT device, which is a problem that is hard to detect and waiting to bite you. Note that net.ipv4.tcp_tw_recycle has been removed in Linux 4.12.

    On the server side: The net.core.somaxconn value plays an important role. It limits the maximum number of requests queued to a listen socket. If you are sure of your server application's capability, bump it up from the default of 128 to something like 1024. You can then take advantage of this increase by setting the backlog argument in your application's listen call to an equal or higher integer.

    sysctl net.core.somaxconn=1024
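
    A minimal sketch of the application side in Python, assuming a plain TCP server on Linux. The kernel silently caps the backlog passed to listen() at net.core.somaxconn, so both values have to be raised. The helper names somaxconn and listening_socket are hypothetical.

```python
import socket
from pathlib import Path


def somaxconn(default: int = 128) -> int:
    """Read net.core.somaxconn from /proc, falling back to `default` elsewhere."""
    try:
        return int(Path("/proc/sys/net/core/somaxconn").read_text())
    except (OSError, ValueError):
        return default


def listening_socket(host: str, port: int, backlog: int) -> socket.socket:
    """Create a TCP listener; the kernel caps `backlog` at net.core.somaxconn."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(backlog)
    return srv


if __name__ == "__main__":
    wanted = 1024
    srv = listening_socket("127.0.0.1", 0, wanted)   # port 0: let the OS pick one
    print("requested backlog:", wanted, "effective:", min(wanted, somaxconn()))
    srv.close()
```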
    

    The txqueuelen parameter of your Ethernet cards also has a role to play. The default value is 1000, so bump it up to 5000 or even more if your system can handle it.

    ifconfig eth0 txqueuelen 5000
    echo "/sbin/ifconfig eth0 txqueuelen 5000" >> /etc/rc.local
    

    Similarly bump up the values for net.core.netdev_max_backlog and net.ipv4.tcp_max_syn_backlog. Their default values are 1000 and 1024 respectively.

    sysctl net.core.netdev_max_backlog=2000
    sysctl net.ipv4.tcp_max_syn_backlog=2048
    

    Now remember to raise the file-descriptor (FD) ulimits in the shell before starting both your client and server applications.
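
    If changing the shell's ulimit is not convenient, a process can also raise its own soft limit up to the hard limit. A minimal sketch using Python's standard resource module (Unix only); the function name raise_fd_limit and the target of 65535 are assumptions for illustration, and raising the hard limit itself still needs root or a limits.conf change.

```python
import resource


def raise_fd_limit(wanted: int) -> int:
    """Raise the soft RLIMIT_NOFILE towards `wanted`, never above the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft                                  # already unlimited
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            pass                                     # keep the current limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("open-file soft limit is now", raise_fd_limit(65535))
```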

    Besides the above, one more popular technique used by programmers is to reduce the number of TCP write calls. My own preference is to use a buffer into which I push the data I wish to send to the client, and then at appropriate points I write the buffered data out to the actual socket. This technique lets me send larger packets, reduces fragmentation, and lowers my CPU utilization both in user land and at the kernel level.
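
    The buffering idea can be sketched roughly like this in Python. This is not the author's code: the class name BufferedSender and the 16 KiB threshold are assumptions, and a real server would also need error handling and a flush on shutdown.

```python
import socket


class BufferedSender:
    """Collect small writes in memory and send them with fewer system calls."""

    def __init__(self, sock: socket.socket, flush_threshold: int = 16 * 1024):
        self._sock = sock
        self._buf = bytearray()
        self._threshold = flush_threshold
        self.send_calls = 0          # how many times the socket was written

    def write(self, data: bytes) -> None:
        """Queue data; flush automatically once the buffer is large enough."""
        self._buf += data
        if len(self._buf) >= self._threshold:
            self.flush()

    def flush(self) -> None:
        """Write everything queued so far to the socket in one sendall call."""
        if self._buf:
            self._sock.sendall(bytes(self._buf))
            self.send_calls += 1
            self._buf.clear()


if __name__ == "__main__":
    left, right = socket.socketpair()
    sender = BufferedSender(left, flush_threshold=1024)
    for _ in range(100):
        sender.write(b"x" * 10)      # 100 small writes, 1000 bytes in total
    sender.flush()
    print("socket writes:", sender.send_calls)
    left.close()
    right.close()
```

    In the demo, one hundred 10-byte writes stay under the 1024-byte threshold, so they reach the socket in a single call when flush() runs.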

  • 2020-11-22 16:18

    At the application level, here are some things a developer can do:

    On the server side:

    1. Check whether the load balancer (if you have one) works correctly.

    2. Turn slow TCP timeouts into fast, immediate 503 responses. If your load balancer works correctly, it should pick a working resource to serve the request, which is better than hanging with unexpected error messages.

    E.g., if you are using a Node server, you can use toobusy from npm. The implementation looks something like:

    var toobusy = require('toobusy');
    app.use(function(req, res, next) {
      // Shed load early: reply 503 instead of letting the request queue up
      if (toobusy()) res.status(503).send("I'm busy right now, sorry.");
      else next();
    });
    

    Why 503? Here are some good insights for overload: http://ferd.ca/queues-don-t-fix-overload.html

    We can do some work on the client side too:

    1. Try to group calls into batches to reduce the traffic and the total number of requests between client and server.

    2. Try to build a cache mid-layer to handle unnecessary duplicate requests.

  • 2020-11-22 16:29

    There are a couple of variables to set the max number of connections. Most likely, you're running out of file descriptors first. Check ulimit -n. After that, there are settings in /proc, but those default to the tens of thousands.

    More importantly, it sounds like you're doing something wrong. A single TCP connection ought to be able to use all of the bandwidth between two parties; if it isn't:

    • Check if your TCP window setting is large enough. Linux defaults are good for everything except really fast Internet links (hundreds of Mbps) or fast satellite links. What is your bandwidth*delay product?
    • Check for packet loss using ping with large packets (ping -s 1472 ...)
    • Check for rate limiting. On Linux, this is configured with tc
    • Confirm that the bandwidth you think exists actually exists using e.g., iperf
    • Confirm that your protocol is sane. Remember latency.
    • If this is a gigabit+ LAN, can you use jumbo frames? Are you?
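
    For the bandwidth*delay product mentioned in the first bullet, a small Python sketch may help. The link speeds and round-trip times below are made-up examples, and the function name is hypothetical. The result is the number of bytes that must be in flight, and so fit in the TCP window, to keep the link full.

```python
def bandwidth_delay_product(bandwidth_mbps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the link full.

    Mbit/s * 1_000_000 / 8 gives bytes per second; multiplying by
    rtt_ms / 1000 gives bytes per round trip, i.e. mbps * rtt_ms * 125.
    """
    return bandwidth_mbps * rtt_ms * 125


print(bandwidth_delay_product(100, 80))   # 1000000 bytes, about 1 MB
print(bandwidth_delay_product(1000, 1))   # 125000 bytes on a gigabit LAN, 1 ms RTT
```

    If the window in use is smaller than this figure, a single connection cannot saturate the link no matter how many sockets the OS allows.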

    Possibly I have misunderstood. Maybe you're doing something like BitTorrent, where you need lots of connections. If so, you need to figure out how many connections you're actually using (try netstat or lsof). If that number is substantial, you might:

    • Have a lot of bandwidth, e.g., 100 Mbps+. In this case, you may actually need to raise ulimit -n. Still, ~1000 connections (the default on my system) is quite a few.
    • Have network problems which are slowing down your connections (e.g., packet loss)
    • Have something else slowing you down, e.g., IO bandwidth, especially if you're seeking. Have you checked iostat -x?

    Also, if you are using a consumer-grade NAT router (Linksys, Netgear, D-Link, etc.), beware that you may exceed its abilities with thousands of connections.

    I hope this provides some help. You're really asking a networking question.
