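For how the TCP stack handles the TIME_WAIT and FIN_WAIT_2 states, see the earlier article "How TCP handles an active close" (主动关闭TCP如何处理).
The handling of the CLOSE_WAIT, LAST_ACK, FIN_WAIT_1 and CLOSING states is described below.
After the actively closing side has sent its FIN it enters FIN_WAIT_1. When it receives the ACK for that FIN in this state, it moves to FIN_WAIT_2: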
int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
{
    /* step 5: check the ACK field */
    acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH |
                                  FLAG_UPDATE_TS_RECENT) > 0;

    switch (sk->sk_state) {
    case TCP_FIN_WAIT1: {
        int tmo;

        /* If we enter the TCP_FIN_WAIT1 state and we are a
         * Fast Open socket and this is the first acceptable
         * ACK we have received, this would have acknowledged
         * our SYNACK so stop the SYNACK timer.
         */
        if (req) {
            /* We no longer need the request sock. */
            reqsk_fastopen_remove(sk, req, false);
            tcp_rearm_rto(sk);
        }

        /* Not everything we sent has been acknowledged yet:
         * break out and keep waiting for the ACK.
         */
        if (tp->snd_una != tp->write_seq)
            break;

        /* Switch from FIN_WAIT_1 to FIN_WAIT_2. */
        tcp_set_state(sk, TCP_FIN_WAIT2);
        sk->sk_shutdown |= SEND_SHUTDOWN;   /* shut down the send side */

        sk_dst_confirm(sk);   /* mark the cached route for pending confirmation */

        if (!sock_flag(sk, SOCK_DEAD)) {
            /* Wake up lingering close().
             * tcp_close() calls sock_orphan(sk), which ends up calling
             * sock_set_flag(sk, SOCK_DEAD) to mark the socket dead.
             */
            /* The socket is not DEAD: its state changed, so wake up
             * the waiting process.
             */
            sk->sk_state_change(sk);
            break;
        }

        if (tp->linger2 < 0 ||   /* linger2 < 0: no waiting in FIN_WAIT_2 */
            (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
             /* the segment carries data beyond the expected sequence number */
             after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt))) {
            tcp_done(sk);
            NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
            return 1;
        }

        tmo = tcp_fin_time(sk);   /* how long to wait in FIN_WAIT_2 */
        if (tmo > TCP_TIMEWAIT_LEN) {
            /* Longer than TCP_TIMEWAIT_LEN: arm the FIN_WAIT_2 timer. */
            inet_csk_reset_keepalive_timer(sk, tmo - TCP_TIMEWAIT_LEN);
        } else if (th->fin || sock_owned_by_user(sk)) {
            /* Bad case. We could lose such FIN otherwise.
             * It is not a big problem, but it looks confusing
             * and not so rare event. We still can lose it now,
             * if it spins in bh_lock_sock(), but it is really
             * marginal case.
             */
            /* A FIN is present or the socket is locked by a user
             * process: arm the FIN_WAIT_2 timer.
             */
            inet_csk_reset_keepalive_timer(sk, tmo);
        } else {
            /* Normal wait time not above TCP_TIMEWAIT_LEN: a timewait
             * sock takes over, with tw_substate set to TCP_FIN_WAIT2.
             */
            tcp_time_wait(sk, TCP_FIN_WAIT2, tmo);
            goto discard;
        }
        break;
    }
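In the ACK-processing step of tcp_rcv_state_process(), a connection in FIN_WAIT_1 whose sent data (including the FIN) has all been acknowledged moves to FIN_WAIT_2. If the socket is not yet DEAD, that is, a process still owns it, the kernel only wakes up the waiting process. Otherwise, if no waiting in FIN_WAIT_2 is wanted (linger2 < 0), or the segment carries new data beyond rcv_nxt, the connection is closed immediately with tcp_done(). If waiting is required, the wait time is compared with TCP_TIMEWAIT_LEN: if it is longer, the FIN_WAIT_2 timer is armed for tmo - TCP_TIMEWAIT_LEN; if it is not longer but the segment also carries a FIN or the socket is locked by a user process, the FIN_WAIT_2 timer is armed for the full tmo; otherwise a timewait sock takes over directly (its substate is still FIN_WAIT_2) and a TIME_WAIT timer is armed for it.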
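The decision order described above can be summarised in a small user-space sketch. It is not kernel code: the enum, the struct and the function name fw1_on_ack are hypothetical names introduced only for illustration, and the inputs that the kernel reads from the socket and the segment are passed in as plain fields.

```c
#include <stdbool.h>

/* Hypothetical action names for this sketch; not kernel identifiers. */
enum fw1_action {
    FW1_STAY,            /* our FIN is not acknowledged yet: stay in FIN_WAIT_1 */
    FW1_WAKE_OWNER,      /* now FIN_WAIT_2, a process still owns the socket     */
    FW1_ABORT,           /* tcp_done(): close immediately                       */
    FW1_KEEPALIVE_TIMER, /* FIN_WAIT_2 timer armed on the full socket           */
    FW1_TIMEWAIT_SOCK    /* timewait sock takes over, substate FIN_WAIT_2       */
};

/* Inputs the kernel derives from the socket and the incoming segment. */
struct fw1_input {
    bool all_acked;      /* tp->snd_una == tp->write_seq                */
    bool sock_dead;      /* sock_flag(sk, SOCK_DEAD)                    */
    int  linger2;        /* tp->linger2                                 */
    bool new_data;       /* segment carries data beyond rcv_nxt         */
    bool fin;            /* th->fin                                     */
    bool owned_by_user;  /* sock_owned_by_user(sk)                      */
    int  tmo;            /* result of tcp_fin_time(sk)                  */
    int  timewait_len;   /* TCP_TIMEWAIT_LEN                            */
};

/* Returns the action taken; *timer receives the timeout that is armed
 * (0 when no timer is armed by this step). */
static enum fw1_action fw1_on_ack(const struct fw1_input *in, int *timer)
{
    *timer = 0;

    if (!in->all_acked)
        return FW1_STAY;
    if (!in->sock_dead)
        return FW1_WAKE_OWNER;
    if (in->linger2 < 0 || in->new_data)
        return FW1_ABORT;

    if (in->tmo > in->timewait_len) {
        *timer = in->tmo - in->timewait_len;
        return FW1_KEEPALIVE_TIMER;
    }
    if (in->fin || in->owned_by_user) {
        *timer = in->tmo;
        return FW1_KEEPALIVE_TIMER;
    }
    *timer = in->tmo;
    return FW1_TIMEWAIT_SOCK;
}
```

With tmo = 30 and timewait_len = 60 on an orphaned socket, the sketch picks the timewait-sock path; raising tmo to 90 picks the FIN_WAIT_2 timer with a timeout of 30.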
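When tcp_close() is called and the socket is already in FIN_WAIT_2, it is handed over to TIME_WAIT in a similar way.
As for the socket's DEAD state: on the paths discussed here, sock_orphan() is called from tcp_close() and sk_common_release(), and it is sock_orphan() that marks the socket as dead.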
/* Detach socket from process context.
 * Announce socket dead, detach it from wait queue and inode.
 * Note that parent inode held reference count on this struct sock,
 * we do not release it in this function, because protocol
 * probably wants some additional cleanups or even continuing
 * to work with this socket (TCP).
 */
static inline void sock_orphan(struct sock *sk)
{
    write_lock_bh(&sk->sk_callback_lock);
    sock_set_flag(sk, SOCK_DEAD);
    sk_set_socket(sk, NULL);
    sk->sk_wq = NULL;
    write_unlock_bh(&sk->sk_callback_lock);
}
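CLOSING state:
If the ACK confirms that the peer has received everything we sent, including our FIN, the socket moves from CLOSING to TIME_WAIT and is reclaimed once the 2MSL timer expires.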
    case TCP_CLOSING:
        if (tp->snd_una == tp->write_seq) {
            tcp_time_wait(sk, TCP_TIME_WAIT, 0);
            goto discard;
        }
        break;
LAST_ACK state:
    case TCP_LAST_ACK:
        if (tp->snd_una == tp->write_seq) {
            tcp_update_metrics(sk);
            tcp_done(sk);
            goto discard;
        }
        break;
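Once the ACK confirms that the peer has received all data, including the FIN, the socket moves from LAST_ACK to CLOSE. Before that, tcp_update_metrics() saves the connection's TCP metrics, mainly congestion-control information, so that later connections can use them as guidance. For more on TCP metrics, see: https://yacanliu.gitee.io/tcp-metrics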
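The three ACK-driven transitions covered so far (FIN_WAIT_1, CLOSING, LAST_ACK) share one condition: snd_una must have caught up with write_seq. A minimal user-space sketch of that rule follows; the enum and the function next_state_on_ack are hypothetical names, and the sketch ignores the timers and the immediate teardown cases that the kernel handles in the same step.

```c
#include <stdint.h>

/* Hypothetical state names for this sketch (the kernel uses TCP_* values). */
enum sk_state_sketch {
    ST_FIN_WAIT1,
    ST_FIN_WAIT2,
    ST_CLOSING,
    ST_LAST_ACK,
    ST_TIME_WAIT,
    ST_CLOSE
};

/* Returns the state after an acceptable ACK has been processed.
 * snd_una:   first unacknowledged sequence number
 * write_seq: sequence number just past the last byte queued for sending,
 *            which includes our FIN once it has been sent
 */
static enum sk_state_sketch next_state_on_ack(enum sk_state_sketch cur,
                                              uint32_t snd_una,
                                              uint32_t write_seq)
{
    /* Something we sent (possibly the FIN) is still unacknowledged. */
    if (snd_una != write_seq)
        return cur;

    switch (cur) {
    case ST_FIN_WAIT1:
        return ST_FIN_WAIT2;   /* our FIN is acknowledged             */
    case ST_CLOSING:
        return ST_TIME_WAIT;   /* simultaneous close: wait for 2MSL   */
    case ST_LAST_ACK:
        return ST_CLOSE;       /* passive close finished              */
    default:
        return cur;            /* other states are not affected here  */
    }
}
```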
    /* step 6: check the URG bit */
    tcp_urg(sk, skb, th);

    /* step 7: process the segment text */
    switch (sk->sk_state) {
    case TCP_CLOSE_WAIT:
    case TCP_CLOSING:
    case TCP_LAST_ACK:
        if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt))
            break;
    case TCP_FIN_WAIT1:
    case TCP_FIN_WAIT2:
        /* RFC 793 says to queue data in these states,
         * RFC 1122 says we MUST send a reset.
         * BSD 4.4 also does reset.
         */
        if (sk->sk_shutdown & RCV_SHUTDOWN) {
            if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
                after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt)) {
                NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
                tcp_reset(sk);
                return 1;
            }
        }
        /* Fall through */
    case TCP_ESTABLISHED:
        tcp_data_queue(sk, skb);
        queued = 1;
        break;
    }
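In the TCP_CLOSE_WAIT, TCP_CLOSING and TCP_LAST_ACK states the peer's FIN has already been received, so a segment whose sequence number is at or beyond rcv_nxt is not queued at all (the code simply breaks out). A segment that starts before rcv_nxt, i.e. old data, falls through to the common path, where tcp_data_queue() treats it as a duplicate and drops it.
If the receive side has already been shut down (RCV_SHUTDOWN) and a segment with new data still arrives, an RST is sent to the peer.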
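The step 7 rules above rely on wraparound-safe sequence comparison. The sketch below reproduces the signed-difference trick behind the kernel's before() and after() helpers under the names seq_before and seq_after, and then classifies a segment the way the switch statement does. The enums and the function classify_segment are hypothetical names for illustration; the real code works on struct sock and struct sk_buff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe comparison: a is "before" b if the signed 32-bit
 * difference is negative. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

static bool seq_after(uint32_t a, uint32_t b)
{
    return seq_before(b, a);
}

/* Hypothetical names for this sketch. */
enum rx_state { RX_ESTABLISHED, RX_FIN_WAIT1, RX_FIN_WAIT2,
                RX_CLOSE_WAIT, RX_CLOSING, RX_LAST_ACK };
enum seg_action { SEG_IGNORE, SEG_RESET, SEG_QUEUE };

static enum seg_action classify_segment(enum rx_state st, bool rcv_shutdown,
                                        uint32_t seq, uint32_t end_seq,
                                        bool fin, uint32_t rcv_nxt)
{
    bool peer_fin_seen = (st == RX_CLOSE_WAIT || st == RX_CLOSING ||
                          st == RX_LAST_ACK);

    /* Peer already sent its FIN: anything at or beyond rcv_nxt is ignored. */
    if (peer_fin_seen && !seq_before(seq, rcv_nxt))
        return SEG_IGNORE;

    /* Receive side shut down locally and the segment carries new data
     * (the FIN flag itself occupies one sequence number and is excluded). */
    if (st != RX_ESTABLISHED && rcv_shutdown && end_seq != seq &&
        seq_after(end_seq - (fin ? 1u : 0u), rcv_nxt))
        return SEG_RESET;

    /* Everything else goes to the normal data queueing path. */
    return SEG_QUEUE;
}
```

Note that a bare FIN arriving in FIN_WAIT_2 after the receive side was shut down is still queued rather than reset, because subtracting the FIN from end_seq leaves no data beyond rcv_nxt.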
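TIME_WAIT 2MSL timeout handling
When the TIME_WAIT timer expires, the tw control block is removed from ehash and bhash; a segment that arrives for this connection afterwards is answered with a reset.
inet_twsk_kill() removes the tw control block from ehash and bhash and releases it.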
static void tw_timer_handler(unsigned long data)
{
    struct inet_timewait_sock *tw = (struct inet_timewait_sock *)data;

    if (tw->tw_kill)
        NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITKILLED);
    else
        NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITED);
    inet_twsk_kill(tw);
}
/**
 * tcp_rcv_state_process
 *   |-->tcp_time_wait
 *         |-->inet_twsk_alloc
 *         |     |-->setup_pinned_timer(&tw->tw_timer, tw_timer_handler,(unsigned long)tw);
 *         |-->__inet_twsk_schedule(tw, timeo, false);
 *               |-->mod_timer(&tw->tw_timer, jiffies + timeo);
 */

/* Must be called with locally disabled BHs. */
static void inet_twsk_kill(struct inet_timewait_sock *tw)
{
    struct inet_hashinfo *hashinfo = tw->tw_dr->hashinfo;
    spinlock_t *lock = inet_ehash_lockp(hashinfo, tw->tw_hash);
    struct inet_bind_hashbucket *bhead;

    spin_lock(lock);
    sk_nulls_del_node_init_rcu((struct sock *)tw);
    spin_unlock(lock);

    /* Disassociate with bind bucket. */
    bhead = &hashinfo->bhash[inet_bhashfn(twsk_net(tw), tw->tw_num,
                                          hashinfo->bhash_size)];

    spin_lock(&bhead->lock);
    inet_twsk_bind_unhash(tw, hashinfo);
    spin_unlock(&bhead->lock);

    atomic_dec(&tw->tw_dr->tw_count);
    inet_twsk_put(tw);
}
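/* Add the control block to the list of users of this port. */
inet_bind_hash(sk, tb, port);
Every port in use is added to the bind hash, so the entry has to be removed again when the socket is released.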
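To make the two-table bookkeeping concrete, here is a toy user-space model of what the timer handler undoes: an entry is linked into a connection table (standing in for ehash) and a port table (standing in for bhash), and killing it unlinks it from both. All names (toy_tw, toy_tables, toy_hash, toy_kill and so on) are hypothetical, and the model is heavily simplified: there are no locks, no RCU, no reference counts, and the port table chains entries directly instead of going through bind buckets with owner lists.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 8

/* Toy stand-in for a timewait control block. */
struct toy_tw {
    uint16_t port;          /* local port (tw_num in the kernel)   */
    uint32_t hash;          /* connection hash (tw_hash)           */
    struct toy_tw *e_next;  /* chain in the connection table       */
    struct toy_tw *b_next;  /* chain in the port table             */
};

struct toy_tables {
    struct toy_tw *ehash[TOY_BUCKETS];  /* lookup by connection hash */
    struct toy_tw *bhash[TOY_BUCKETS];  /* lookup by local port      */
};

/* Link the entry into both tables. */
static void toy_hash(struct toy_tables *t, struct toy_tw *tw)
{
    struct toy_tw **e = &t->ehash[tw->hash % TOY_BUCKETS];
    struct toy_tw **b = &t->bhash[tw->port % TOY_BUCKETS];

    tw->e_next = *e;
    *e = tw;
    tw->b_next = *b;
    *b = tw;
}

/* Unlink the entry from both tables, as the timer handler would. */
static void toy_kill(struct toy_tables *t, struct toy_tw *tw)
{
    struct toy_tw **pp;

    for (pp = &t->ehash[tw->hash % TOY_BUCKETS]; *pp; pp = &(*pp)->e_next)
        if (*pp == tw) { *pp = tw->e_next; break; }

    for (pp = &t->bhash[tw->port % TOY_BUCKETS]; *pp; pp = &(*pp)->b_next)
        if (*pp == tw) { *pp = tw->b_next; break; }

    tw->e_next = tw->b_next = NULL;
}

/* A miss here corresponds to "no socket found", the case in which an
 * arriving segment would be answered with a reset. */
static struct toy_tw *toy_lookup(const struct toy_tables *t, uint32_t hash)
{
    for (struct toy_tw *tw = t->ehash[hash % TOY_BUCKETS]; tw; tw = tw->e_next)
        if (tw->hash == hash)
            return tw;
    return NULL;
}

/* True while at least one entry still uses the port. */
static bool toy_port_in_use(const struct toy_tables *t, uint16_t port)
{
    for (struct toy_tw *tw = t->bhash[port % TOY_BUCKETS]; tw; tw = tw->b_next)
        if (tw->port == port)
            return true;
    return false;
}
```

In this model a port stays in use until the last entry that references it has been killed, which mirrors why the release path has to undo the bind-hash registration.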