When initialized, TCP starts in the CLOSED state.
Usually an immediate transition takes it to either the SYN_SENT or LISTEN state, depending on whether the TCP is asked to perform an active or passive open, respectively.
The ESTABLISHED state is where data transfer can occur between the two ends in both directions.
Simultaneous close, which is a form of double active close, uses the CLOSING state.
The transition from SYN_RCVD back to LISTEN is valid only if the SYN_RCVD state was entered from the LISTEN state (the normal scenario),
not from the SYN_SENT state (a simultaneous open).
This means that if we perform a passive open (enter LISTEN), receive a SYN, send a SYN with an ACK (enter SYN_RCVD), and then receive a reset instead of an ACK, the endpoint returns to the LISTEN state and waits for another connection request to arrive.
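The setup transitions described above can be sketched as a toy state machine. This is only an illustrative model, not a TCP implementation: the class name Endpoint, its method names, and the passive flag are hypothetical. The flag records how SYN_RCVD was entered, so a reset returns the endpoint to LISTEN only in the passive-open case.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    LISTEN = auto()
    SYN_SENT = auto()
    SYN_RCVD = auto()
    ESTABLISHED = auto()


class Endpoint:
    """Toy model of the connection-setup part of the TCP state machine."""

    def __init__(self):
        self.state = State.CLOSED
        self.passive = False  # True if the endpoint went through LISTEN

    def passive_open(self):
        self.state, self.passive = State.LISTEN, True

    def active_open(self):
        # A SYN would be sent here.
        self.state, self.passive = State.SYN_SENT, False

    def recv_syn(self):
        # Reached from LISTEN (normal case) or SYN_SENT (simultaneous open).
        if self.state in (State.LISTEN, State.SYN_SENT):
            self.state = State.SYN_RCVD

    def recv_ack(self):
        if self.state is State.SYN_RCVD:
            self.state = State.ESTABLISHED

    def recv_rst(self):
        if self.state is State.SYN_RCVD:
            # Back to LISTEN only if SYN_RCVD was entered from LISTEN.
            self.state = State.LISTEN if self.passive else State.CLOSED
```

A passive opener that receives a SYN and then a reset ends up in LISTEN again, while an endpoint that reached SYN_RCVD through a simultaneous open is closed instead.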
TIME_WAIT (2MSL Wait) State
The TIME_WAIT state is also called the 2MSL wait state, and sometimes timed wait. It is a state in which TCP waits for a time equal to twice the Maximum Segment Lifetime (MSL).
The MSL is the maximum amount of time any segment can exist in the network before being discarded.
We know that this time limit is bounded, because TCP segments are transmitted as IP datagrams,
and the IP datagram has the TTL field or Hop Limit field that limits its effective lifetime.
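On Linux, the value net.ipv4.tcp_fin_timeout is often described as holding the 2MSL wait timeout value (in seconds). In current kernels, however, the TIME_WAIT duration is a compile-time constant (TCP_TIMEWAIT_LEN, 60 seconds), and net.ipv4.tcp_fin_timeout governs the FIN_WAIT_2 timer instead.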
...
This lets TCP resend the final ACK in case it is lost.
The final ACK is resent not because the TCP retransmits ACKs (they do not consume sequence numbers and are not retransmitted by TCP),
but because the other side will retransmit its FIN (which does consume a sequence number).
Indeed, TCP will always retransmit FINs until it receives a final ACK.
Another effect of this 2MSL wait state is that while the TCP implementation waits, the endpoints defining that connection
(client IP address, client port number, server IP address, and server port number) cannot be reused.
That connection can be reused only when the 2MSL wait is over, or when a new connection uses an ISN that exceeds the highest sequence number used on the previous instantiation of the connection [RFC1122],
or when the Timestamps option allows segments from a previous connection instantiation to be told apart from those of the new one [RFC6191].
Unfortunately, some implementations impose a more stringent constraint. In these systems, a local port number cannot be reused while that port number is the local port number of any endpoint that is in the 2MSL wait state on the system.
With the Berkeley sockets API, the SO_REUSEADDR socket option enables the bypass operation.
It lets the caller assign itself a local port number even if that port number is part of some connection in the 2MSL wait state.
We will see, however, that even with this bypass mechanism for one socket (address, port number pair),
the rules of TCP still (should) prevent this port number from being reused by another instantiation of the same connection that is in the 2MSL wait state.
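As a minimal sketch of how SO_REUSEADDR is set through the sockets API, the snippet below uses Python's standard socket module. The helper name make_listener and the loopback address are assumptions chosen for illustration. The option has to be set before bind() for it to affect the port assignment.

```python
import socket


def make_listener(port=0, host="127.0.0.1"):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow binding even if the port is part of a connection in 2MSL wait.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))  # port 0 asks the OS to pick a free port
    srv.listen()
    return srv
```

A server that is restarted often would typically create its listening socket this way, so that a lingering 2MSL wait connection does not block the bind.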
Here is the reason:
Any delayed segments that arrive for a connection while it is in the 2MSL wait state are discarded.
Because the connection defined by the address/port 4-tuple in the 2MSL wait state cannot be reused during this time period,
when a valid connection is finally established, we know that delayed segments from an earlier instantiation of this connection cannot be misinterpreted as being part of the new connection.
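The bookkeeping behind this rule can be modeled in a few lines. The sketch below is a toy model, not how any real stack stores its state: the class TimeWaitTable, its methods, and the 30-second MSL default are assumptions (the 30 seconds matches the 1-minute 2MSL wait mentioned for the Linux machine in these notes).

```python
class TimeWaitTable:
    """Toy model: 4-tuples held in 2MSL wait until their timer expires."""

    def __init__(self, msl_seconds=30.0):
        self.wait = 2 * msl_seconds  # length of the 2MSL wait
        self.expires = {}            # 4-tuple -> time at which the wait ends

    def enter(self, four_tuple, now):
        """Record that a connection entered the 2MSL wait state at time now."""
        self.expires[four_tuple] = now + self.wait

    def in_wait(self, four_tuple, now):
        """True while the 4-tuple is still in the 2MSL wait state."""
        deadline = self.expires.get(four_tuple)
        if deadline is not None and now >= deadline:
            del self.expires[four_tuple]  # timer expired, forget the entry
            return False
        return deadline is not None

    def should_discard(self, four_tuple, now):
        """Delayed segments for a 4-tuple in 2MSL wait are dropped."""
        return self.in_wait(four_tuple, now)
```

While an entry is present, the 4-tuple is unavailable for a new connection and any segment matching it is dropped, which is what keeps old segments out of a later instantiation.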
The implication is that if we terminate a client, and restart the same client immediately, that new client cannot reuse the same local port number.
This is not ordinarily a problem, because clients normally use ephemeral ports assigned by the operating system and do not care what the assigned port number is.
This is important to know because a client that makes a large number of connections quickly (especially to the same server) could conceivably have to delay while other connections terminate if ephemeral ports are in short supply.
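To see an ephemeral port being assigned, the following sketch connects a client to a throwaway listener on the loopback interface and reports the local port the operating system chose. The function name client_ephemeral_port is hypothetical, and the range the port comes from depends on the system.

```python
import socket


def client_ephemeral_port():
    """Connect to a temporary local listener and return the client's port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # let the OS pick the server port too
        srv.listen()
        # The client never calls bind(), so the OS assigns an ephemeral port.
        with socket.create_connection(srv.getsockname()) as cli:
            return cli.getsockname()[1]
```

Calling the function repeatedly normally yields different port numbers, since each closed client connection leaves its old port in the 2MSL wait state.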
With servers, however, the situation is different.
They almost always use well-known ports. If we terminate a server process that has a connection established and immediately try to restart it, the server cannot assign its assigned port number to its endpoint
(it gets an “Address already in use” binding error), because that port number is part of a connection that is in a 2MSL wait state.
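A restarted server can detect this condition by checking for EADDRINUSE when it binds. The sketch below is a minimal illustration with a hypothetical helper try_bind. Because a 2MSL wait is awkward to stage on demand, the usage example provokes the same error with a port that is held by a live listener rather than by a connection in the 2MSL wait state.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Return a listening socket, or None if the address is already in use."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))  # no SO_REUSEADDR: the strict default behavior
        s.listen()
        return s
    except OSError as exc:
        s.close()
        if exc.errno == errno.EADDRINUSE:
            return None  # "Address already in use"
        raise


first = try_bind(0)                # port 0: the OS picks a free port
busy_port = first.getsockname()[1]
second = try_bind(busy_port)       # same port again, so the bind fails
first.close()
```

Here second is None, because the port is still held by first.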
client:
...
As we expect, the client cannot do this, because port 2091 is part of a connection that is in a 2MSL wait state.
Once the wait is over (1 minute on this Linux machine), the client attempts to connect again, but the server exited when the connection was interrupted the first time,
so it is refused. We shall see how TCP reset segments are used to signal this connection refused condition in Section 13.6.
Here we see that even though the same connection (4-tuple) is being used again before the 2MSL wait state expires,
the use of the -A option has forced the connection to be allowed.
Of course, this is all taking place on the same computer.
What if we try the same thing again but establish the connection from another host?
We observe that irrespective of the -A flag on the client, the 2MSL wait time is induced.
After that, the client attempts to contact the server, which has already exited.
One interesting thing happens if we switch the client and server machines.
We will now use Windows as the server and Linux as the client and repeat the experiment.
connect() error: Connection refused -> means the client does not work
bind() error: Connection refused -> means the client works, but the server does not
At this point we would expect local port 32843 to be unavailable, but because of the way -A works on Linux, we are allowed to make use of it.
This is a violation of the original TCP specification, but it is allowed by [RFC1122] and [RFC6191], as mentioned before (the sequence number and Timestamps conditions).
These specifications allow a new connection request to arrive and be accepted for a connection that is in the TIME_WAIT state,
if there is a strong reason to believe that segments on the new connection will not be confused with segments on the previous instantiation of the connection
based on a combination of the sequence numbers and timestamps.
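The decision can be sketched as a small predicate. This is a simplified reading of the checks in [RFC1122] and [RFC6191], not a complete implementation: the function names are hypothetical, and the case where only one side of the comparison carries a timestamp is reduced to the plain sequence number test. Both sequence numbers and timestamps are 32-bit values, so comparisons wrap around.

```python
MOD = 1 << 32  # sequence numbers and timestamps are 32-bit values


def seq_gt(a, b):
    """True if a comes after b in 32-bit wraparound arithmetic."""
    return a != b and ((a - b) % MOD) < (1 << 31)


def accept_syn_in_time_wait(syn_seq, last_seq, syn_ts=None, last_ts=None):
    """Decide whether a SYN for a 4-tuple in TIME_WAIT may be accepted.

    syn_seq / syn_ts: sequence number and timestamp of the incoming SYN.
    last_seq / last_ts: last values seen on the previous instantiation.
    """
    if syn_ts is not None and last_ts is not None:
        if seq_gt(syn_ts, last_ts):
            return True                       # newer timestamp: accept
        if syn_ts == last_ts:
            return seq_gt(syn_seq, last_seq)  # tie: fall back to the ISN
        return False                          # older timestamp: reject
    # No timestamps: accept only if the ISN exceeds the old sequence space.
    return seq_gt(syn_seq, last_seq)
```

For example, a SYN with a lower ISN than the old connection is still accepted when its timestamp is newer, which is the case that timestamps add on top of the sequence number rule.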
Quiet Time Concept
But this (the 2MSL protection mechanism) works only if a host with connections in the 2MSL wait does not crash.
What if a host with connections in the TIME_WAIT state crashes, reboots within the MSL, and immediately establishes new connections using the same local and foreign IP addresses and port numbers corresponding to the local connections
that were in the TIME_WAIT state before the crash?
In this scenario, delayed segments from the connections that existed before the crash can be misinterpreted as belonging to the new connections created after the reboot.
This can happen regardless of how the initial sequence number is chosen after the reboot.
To protect against this scenario, [RFC0793] states that TCP should wait an amount of time equal to the MSL before creating any new connections after a reboot or crash.
This is called the quiet time.
Few implementations abide by this because most hosts take longer than the MSL to reboot after a crash.
Also, if applications use their own checksums or encryption, errors such as these are easily detected.
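The quiet time rule amounts to a simple time comparison. The sketch below uses a hypothetical helper and an assumed MSL of 30 seconds. It shows why the rule rarely matters in practice: once the reboot itself takes longer than the MSL, no waiting is left.

```python
def quiet_time_remaining(crash_time, now, msl_seconds=30.0):
    """Seconds TCP should still wait before creating new connections.

    crash_time and now are timestamps in seconds on the same clock.
    """
    return max(0.0, crash_time + msl_seconds - now)


# A host that is back up 10 seconds after the crash should wait 20 more.
fast_reboot = quiet_time_remaining(crash_time=0.0, now=10.0)
# A host that needs 90 seconds to reboot has nothing left to wait for.
slow_reboot = quiet_time_remaining(crash_time=0.0, now=90.0)
```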
FIN_WAIT_2 State
In the FIN_WAIT_2 state, TCP has sent a FIN and the other end has acknowledged it.
Only when the application at the other end (the end performing the passive close) performs its close (and its FIN is received) does the active closing TCP move from the FIN_WAIT_2 to the TIME_WAIT state.
This means that one end of the connection can remain in this state forever.
The other end is still in the CLOSE_WAIT state and can remain there forever, until the application decides to issue its close.
Many implementations prevent this infinite wait in the FIN_WAIT_2 state as follows:
If the application that does the active close does a complete close, not a half-close indicating that it expects to receive data, a timer is set.
If the connection is idle when the timer expires, TCP moves the connection into the CLOSED state.
In Linux, the variable net.ipv4.tcp_fin_timeout can be adjusted to control the number of seconds to which the timer is set. Its default value is 60s.
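On a Linux host the current setting can be read from the proc filesystem, where sysctl variables are exposed as files. The helper below is a small sketch with a hypothetical function name. It returns None on systems where the file does not exist, instead of guessing a value.

```python
from pathlib import Path


def read_tcp_fin_timeout(path="/proc/sys/net/ipv4/tcp_fin_timeout"):
    """Return net.ipv4.tcp_fin_timeout in seconds, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no permission, or unexpected file content.
        return None
```

The same value is shown by running sysctl net.ipv4.tcp_fin_timeout on the command line.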
Simultaneous Open and Close Transitions
TODO