For a TCP client connect() call to a TCP server..
If the client TCP receives no response to its SYN segment, ETIMEDOUT is returned. 4.4BSD, for example, sends one SYN when connect is called, another 6 seconds later, and another 24 seconds later (p. 828 of TCPv2). If no response is received after a total of 75 seconds, the error is returned.
In Linux I would like know what is the retry mechanism (how many times and how far apart). Asking because for a TCP client connect() call I am getting ETIMEDOUT error. This socket has O_NONBLOCK option and monitored by epoll() for the events.
If someone can point to me where in the code this retry logic is implemented that would be helpful too. I tried following a bit starting with tcp_v4_connect() from net/ipv4/tcp_ipv4.c, but lost my way pretty soon..
The timeout is scaled based on the measured round-trip time.
tcp_connect()
sets up a timer:
/* Timer for repeating the SYN until an answer. */ inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, inet_csk(sk)->icsk_rto, TCP_RTO_MAX);
The icsk_rto
will use a per-destination re-transmission timeout; if previous metrics from the destination is available from previous connections, it is re-used. (See the tcp_no_metrics_save
discussion in tcp(7)
for details.) If no metrics are saved, then the kernel will fall back to a default RTO value:
#define TCP_RTO_MAX ((unsigned)(120*HZ)) #define TCP_RTO_MIN ((unsigned)(HZ/5)) #define TCP_TIMEOUT_INIT ((unsigned)(1*HZ)) /* RFC2988bis initial RTO value */ #define TCP_TIMEOUT_FALLBACK ((unsigned)(3*HZ)) /* RFC 1122 initial RTO value, now * used as a fallback RTO for the * initial data transmission if no * valid RTT sample has been acquired, * most likely due to retrans in 3WHS. */
tcp_retransmit_timer()
has some code near the bottom for recalculating the delay:
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto, TCP_RTO_MAX); if (retransmits_timed_out(sk, sysctl_tcp_retries1 + 1, 0, 0)) __sk_dst_reset(sk);
retransmits_timed_out()
will first perform a linear backoff then an exponential backoff.
I think the long and the short of it is that you can reasonably expect roughly 120 seconds before getting ETIMEDOUT
error returns from connect(2)
unless the kernel has good reason to suspect that the remote peer should have replied sooner.
A typical reason for ETIMEOUT is a firewall which simply swallows the packets instead of replying with ICMP Destination Unreachable.
This is a common setup to prevent hackers from probing a network for hosts.