Problem
HTTP server:
package main
import (
"log"
"net/http"
)
func main() {
// Start the HTTP service
addr := "127.0.0.1:8080"
// Register the WebSocket handler for the agent
http.HandleFunc("/agent", agentHandler)
err := http.ListenAndServe(addr, nil)
log.Fatal(err)
}
func agentHandler(w http.ResponseWriter, r *http.Request) {
}
The HTTP client is simulated with telnet (no HTTP messages are sent or received; it only opens a connection):
telnet 127.0.0.1 8080
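Using tcpdump and strace, I found that the server enables TCP keepalive with a 15-second period, yet the only default I could find in the code was 30 seconds, which was puzzling: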
$ strace ./s 2>&1 | grep setsockopt
setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
// The lines above were printed before the client connected; the lines below were printed after it connected
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [15], 4) = 0
setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [15], 4) = 0
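The values seen in strace can also be cross-checked from inside Go by reading the socket options back from an accepted connection. The following is a minimal sketch, assuming Linux (the `syscall.TCP_KEEPIDLE` and `syscall.TCP_KEEPINTVL` constants are not available on every platform); the helper names `keepAliveOpts` and `acceptedKeepAlive` are hypothetical and not part of the original program.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"syscall"
)

// keepAliveOpts (hypothetical helper) reads SO_KEEPALIVE, TCP_KEEPIDLE and
// TCP_KEEPINTVL from a TCP connection. Assumes Linux.
func keepAliveOpts(c *net.TCPConn) (on, idle, intvl int, err error) {
	raw, err := c.SyscallConn()
	if err != nil {
		return 0, 0, 0, err
	}
	var sockErr error
	ctrlErr := raw.Control(func(fd uintptr) {
		// get reads one integer option and remembers the first error.
		get := func(level, opt int) int {
			if sockErr != nil {
				return 0
			}
			var v int
			v, sockErr = syscall.GetsockoptInt(int(fd), level, opt)
			return v
		}
		on = get(syscall.SOL_SOCKET, syscall.SO_KEEPALIVE)
		idle = get(syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE)
		intvl = get(syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL)
	})
	if ctrlErr != nil {
		return 0, 0, 0, ctrlErr
	}
	return on, idle, intvl, sockErr
}

// acceptedKeepAlive (hypothetical helper) accepts one loopback connection
// on a default listener and reports its keepalive options.
func acceptedKeepAlive() (on, idle, intvl int, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, 0, err
	}
	defer ln.Close()

	// The handshake completes in the kernel backlog, so Dial returns
	// before Accept is called.
	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return 0, 0, 0, err
	}
	defer client.Close()

	conn, err := ln.Accept()
	if err != nil {
		return 0, 0, 0, err
	}
	defer conn.Close()

	return keepAliveOpts(conn.(*net.TCPConn))
}

func main() {
	on, idle, intvl, err := acceptedKeepAlive()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("SO_KEEPALIVE=%d TCP_KEEPIDLE=%d TCP_KEEPINTVL=%d\n", on, idle, intvl)
}
```

This avoids attaching strace to the process, at the cost of being tied to one operating system.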
Analysis
First, locate the Go source directory on this machine:
$ echo `go env GOROOT`/src
/usr/lib/go/src
Reading the source shows that TCP keepalive is ultimately set in tcpsockopt_unix.go:
func setKeepAlivePeriod(fd *netFD, d time.Duration) error {
// The kernel expects seconds so round to next highest second.
d += (time.Second - time.Nanosecond)
secs := int(d.Seconds())
if err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, secs); err != nil {
return wrapSyscallError("setsockopt", err)
}
err := fd.pfd.SetsockoptInt(syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs)
runtime.KeepAlive(fd)
return wrapSyscallError("setsockopt", err)
}
So I set a breakpoint in setKeepAlivePeriod:
$ gdb ./s
// Some output omitted
(gdb) b tcpsockopt_unix.go:16
Breakpoint 1 at 0x55c0ae: file /usr/lib/go/src/net/tcpsockopt_unix.go, line 17.
(gdb) r
Starting program: /home/chuqq/work/work/temp/codeeveryday/golang/20200213_gorilla_websocket/s
// Some output omitted
Thread 1 "s" hit Breakpoint 1, net.setKeepAlivePeriod (fd=0xc0000ea080, d=15000000000, ~r2=...)
at /usr/lib/go/src/net/tcpsockopt_unix.go:17
17 d += (time.Second - time.Nanosecond)
(gdb) bt
#0 net.setKeepAlivePeriod (fd=0xc0000ea080, d=15000000000, ~r2=...) at /usr/lib/go/src/net/tcpsockopt_unix.go:17
#1 0x000000000055bd75 in net.(*TCPListener).accept (ln=0xc00000e260, ~r0=<optimized out>, ~r1=...)
at /usr/lib/go/src/net/tcpsock_posix.go:150
#2 0x000000000055aa57 in net.(*TCPListener).Accept (l=0xc00000e260, ~r0=..., ~r1=...) at /usr/lib/go/src/net/tcpsock.go:261
#3 0x000000000065867c in net/http.(*onceCloseListener).Accept (~r0=..., ~r1=...) at <autogenerated>:1
#4 0x0000000000637a50 in net/http.(*Server).Serve (srv=0xc0000e8000, l=..., ~r1=...) at /usr/lib/go/src/net/http/server.go:2896
#5 0x0000000000637777 in net/http.(*Server).ListenAndServe (srv=0xc0000e8000, ~r0=...) at /usr/lib/go/src/net/http/server.go:2825
#6 0x00000000006609b6 in net/http.ListenAndServe (handler=..., addr=...) at /usr/lib/go/src/net/http/server.go:3081
#7 main.main () at /home/chuqq/temp/codeeveryday/golang/20200213_gorilla_websocket/s.go:13
The backtrace shows that the period is set by accept() at /usr/lib/go/src/net/tcpsock_posix.go:150:
func (ln *TCPListener) accept() (*TCPConn, error) {
fd, err := ln.fd.accept()
if err != nil {
return nil, err
}
tc := newTCPConn(fd)
if ln.lc.KeepAlive >= 0 {
setKeepAlive(fd, true)
ka := ln.lc.KeepAlive
if ln.lc.KeepAlive == 0 {
ka = defaultTCPKeepAlive
}
setKeepAlivePeriod(fd, ka)
}
return tc, nil
}
defaultTCPKeepAlive is defined in dial.go:
// defaultTCPKeepAlive is a default constant value for TCPKeepAlive times
// See golang.org/issue/31510
const (
defaultTCPKeepAlive = 15 * time.Second
)
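I am using Go 1.13.7, which appears to differ considerably from the 1.11.x code I had read earlier. For details, see the issue referenced in the comment above: golang.org/issue/31510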
Source: oschina
Link: https://my.oschina.net/chuqq/blog/3165985