The SO_KEEPALIVE option is only an on/off switch. The default keepalive parameters on Linux are:
    $ sudo sysctl -a | grep keepalive
    net.ipv4.tcp_keepalive_time = 7200
    net.ipv4.tcp_keepalive_probes = 9
    net.ipv4.tcp_keepalive_intvl = 75
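The keepalive parameters above mean that if no data has been exchanged on a connection for 7200 seconds, the side that enabled the option sends a keepalive probe to its peer. There are three possible outcomes:
- The peer replies with an ACK. The local TCP stack considers the connection still alive and waits another 7200 seconds before sending the next keepalive probe.
- The peer replies with a RST. This indicates that the peer has restarted and no longer has any state for the connection; the local application should close the connection.
- The peer does not reply at all. The local side retries, and if the peer is still unreachable after 9 probes (sent 75 seconds apart), an error is returned to the application: ETIMEDOUT (no response at all) or EHOST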
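To make the cost of the third case concrete, here is a rough sketch of the arithmetic. The struct and function names are hypothetical and are not part of any system API; the assumption is that the worst-case detection time is approximately the idle time plus one probe interval per unanswered probe.

```c
#include <stdio.h>

/* Hypothetical container for the three keepalive parameters (all in
 * seconds, except `probes`, which is a count). */
struct keepalive_params {
    int idle_s;   /* idle time before the first probe (tcp_keepalive_time) */
    int intvl_s;  /* gap between unanswered probes (tcp_keepalive_intvl)   */
    int probes;   /* probes sent before giving up (tcp_keepalive_probes)   */
};

/* Approximate worst-case time, in seconds, between the last data exchange
 * and the moment the kernel reports the connection as dead. */
long keepalive_detection_time(struct keepalive_params p)
{
    return (long)p.idle_s + (long)p.intvl_s * (long)p.probes;
}

/* Convenience printer for the result. */
void print_detection_time(struct keepalive_params p)
{
    printf("dead peer detected after about %ld s\n",
           keepalive_detection_time(p));
}
```

With the Linux defaults shown earlier (7200, 75, 9), this works out to 7200 + 9 × 75 = 7875 seconds, a little over two hours, which is why many applications choose to tune these values per socket.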
What if an application wants to change the default keepalive behavior? The answer is to use the TCP options TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT. First, a look at how they are used:
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Enable TCP keepalive on fd and tune it (Linux-specific TCP_* options). */
    int setKeepAlive(int fd, int interval)
    {
        int val = 1;

        /* Turn keepalive on for this socket. */
        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof(val)) == -1) {
            printf("setsockopt SO_KEEPALIVE: %s\n", strerror(errno));
            return -1;
        }

        /* Send first probe after `interval' seconds. */
        val = interval;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val)) < 0) {
            printf("setsockopt TCP_KEEPIDLE: %s\n", strerror(errno));
            return -1;
        }

        /* Send next probes after the specified interval. Note that we set the
         * delay as interval / 3, as we send three probes before detecting
         * an error (see the next setsockopt call). */
        val = interval / 3;
        if (val == 0) val = 1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &val, sizeof(val)) < 0) {
            printf("setsockopt TCP_KEEPINTVL: %s\n", strerror(errno));
            return -1;
        }

        /* Consider the socket in error state after we send three ACK
         * probes without getting a reply. */
        val = 3;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &val, sizeof(val)) < 0) {
            printf("setsockopt TCP_KEEPCNT: %s\n", strerror(errno));
            return -1;
        }

        return 0;
    }
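- TCP_KEEPIDLE: how long the connection must be idle (no data sent) before the first keepalive probe is sent, in seconds
- TCP_KEEPINTVL: the interval between two consecutive probes, in seconds
- TCP_KEEPCNT: the maximum number of probes sent before an unresponsive connection is closed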
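One way to confirm that the per-socket values took effect is to read them back with getsockopt. The following is a minimal sketch assuming Linux; the struct and the function name get_keepalive are hypothetical helpers introduced only for illustration.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical container for the values read back from the socket. */
struct keepalive_settings {
    int enabled;  /* SO_KEEPALIVE: non-zero if keepalive is on   */
    int idle_s;   /* TCP_KEEPIDLE: idle seconds before 1st probe */
    int intvl_s;  /* TCP_KEEPINTVL: seconds between probes       */
    int probes;   /* TCP_KEEPCNT: probes before giving up        */
};

/* Read the effective keepalive settings of fd into *out.
 * Returns 0 on success, -1 on failure (errno is set by getsockopt). */
int get_keepalive(int fd, struct keepalive_settings *out)
{
    socklen_t len;

    /* len is a value-result argument, so reset it before every call. */
    len = sizeof(out->enabled);
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &out->enabled, &len) < 0)
        return -1;

    len = sizeof(out->idle_s);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &out->idle_s, &len) < 0)
        return -1;

    len = sizeof(out->intvl_s);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &out->intvl_s, &len) < 0)
        return -1;

    len = sizeof(out->probes);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &out->probes, &len) < 0)
        return -1;

    return 0;
}
```

On a socket where none of the TCP_KEEP* options have been set, the values read back are expected to match the system-wide sysctl defaults; after they are set, the per-socket values should be returned instead.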
Capturing packets with tcpdump shows the detailed behavior of the side on which these options are set.