  • Linux kernel TCP smoothed-RTT estimation

    https://strugglingcoder.info/index.php/linux-kernel-tcp-smoothed-rtt-estimation/

    Posted: February 18th, 2018 | Author: hiren | Filed under: Linux, networking, tcp | Tags: linux, networking, rtt, srtt, tcp
    Recently I decided to look under the hood to see exactly how srtt is calculated in Linux. The actual srtt calculation (an Exponentially Weighted Moving Average) is the straightforward part. What goes into that calculation as input under various scenarios is more interesting, and it is very important for getting a correct RTT estimate.

    It is also useful to note the difference between Linux and FreeBSD in this regard. Whenever possible, Linux avoids trusting the value carried in the TCP Timestamps option, because middle-boxes can meddle with it.

    The basic algorithm is:
    For non-retransmitted packets, use the saved packet send timestamp and the ack arrival time.
    For retransmitted packets, use the Timestamps option; if that is not enabled, no RTT is calculated for such packets.
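
    The two rules above can be sketched in user-space C. This is only an illustration of the decision, not kernel code: struct pkt_info, pick_rtt_sample() and its parameters are hypothetical names I made up for the example, and the timestamp-derived RTT is simply passed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-packet record; not a kernel structure. */
struct pkt_info {
    uint64_t sent_us;     /* time the packet was sent, in microseconds */
    bool retransmitted;   /* true if the packet was ever retransmitted */
};

/*
 * Returns an RTT sample in microseconds, or -1 when no sample can be taken.
 * ts_enabled / ts_rtt_us model an RTT derived from the Timestamps option.
 */
int64_t pick_rtt_sample(const struct pkt_info *pkt, uint64_t ack_arrival_us,
                        bool ts_enabled, int64_t ts_rtt_us)
{
    assert(pkt);

    /* Never retransmitted: send time and ack arrival are unambiguous. */
    if (!pkt->retransmitted)
        return (int64_t)(ack_arrival_us - pkt->sent_us);

    /* Retransmitted: only the Timestamps option can disambiguate. */
    if (ts_enabled)
        return ts_rtt_us;

    return -1;  /* retransmitted and no timestamps: skip this sample */
}
```

    For example, a packet sent at 1000 us and acked at 1500 us yields a 500 us sample, while the same packet, if retransmitted on a connection without timestamps, yields -1.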

    Let’s look at the code. I am using net-next.
    When a TCP sender sends packets, it has to wait for the acks for those packets before throwing them away. It stores them in a queue called the ‘retransmission queue’.
    When sent packets get acked, tcp_clean_rtx_queue() is called to clear those packets from the retransmission queue.

    A few useful variables in that function are:
    seq_rtt_us – RTT measured from the first packet in the acked range
    ca_rtt_us – RTT measured from the last packet in the acked range (mainly used for congestion control)
    sack_rtt_us – RTT measured from SACKed data
    tcp_mstamp is a tcp_sock member that holds the timestamp of the most recent packet received or sent. It gets updated by tcp_mstamp_refresh().
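
    To make the first-packet versus last-packet distinction concrete, here is a small user-space sketch. The names struct ack_rtts and rtts_for_ack() are hypothetical, and the array of send times stands in for the packets covered by one cumulative ack, oldest first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result pair; -1 means "no sample". */
struct ack_rtts {
    int64_t seq_rtt_us;  /* measured from the first (oldest) acked packet */
    int64_t ca_rtt_us;   /* measured from the last (newest) acked packet */
};

/* sent_us[] holds the send times of the n packets one ack covers. */
static struct ack_rtts rtts_for_ack(const uint64_t *sent_us, size_t n,
                                    uint64_t now_us)
{
    struct ack_rtts r = { -1, -1 };

    assert(sent_us != NULL || n == 0);
    if (n == 0)
        return r;

    r.seq_rtt_us = (int64_t)(now_us - sent_us[0]);
    r.ca_rtt_us = (int64_t)(now_us - sent_us[n - 1]);
    return r;
}
```

    With send times {1000, 1200, 1400} and an ack arriving at 2000, this gives seq_rtt_us = 1000 and ca_rtt_us = 600. When the ack covers a single packet, both values are the same.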

    For a clean ack (not a sack), seq_rtt_us = ca_rtt_us (as there is no range).

    If such a clean ack is also for a non-retransmitted packet,
    [sourcecode language="c"]seq_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, first_ackt);[/sourcecode]

    and for a sack that is also for a non-retransmitted packet,
    [sourcecode language="c"]sack_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, sack->first_sackt);[/sourcecode]

    The code that updates sack->first_sackt is in tcp_sacktag_one(), where it is populated when the sack is for a non-retransmitted packet.

    tcp_stamp_us_delta() returns the difference, in microseconds, between two of the timestamps that the stack maintains.
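
    A user-space approximation of such a helper might look like the following. The name stamp_us_delta() is mine, and clamping a negative difference to zero is an assumption of this sketch rather than something taken from the code shown here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Difference between two microsecond timestamps, t1 - t0.
 * Assumption of this sketch: a negative difference is clamped to 0,
 * so the caller never sees a bogus, huge unsigned value.
 */
static inline uint32_t stamp_us_delta(uint64_t t1, uint64_t t0)
{
    int64_t delta = (int64_t)(t1 - t0);

    if (delta <= 0)
        return 0;

    assert(delta <= UINT32_MAX);  /* the result must fit in 32 bits */
    return (uint32_t)delta;
}
```

    So stamp_us_delta(1500, 1000) is 500, and swapping the arguments gives 0 instead of a wrapped-around value.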

    Now tcp_ack_update_rtt() gets called which starts out with:
    [sourcecode language="c"]
    /* Prefer RTT measured from ACK's timing to TS-ECR. This is because
     * broken middle-boxes or peers may corrupt TS-ECR fields. But
     * Karn's algorithm forbids taking RTT if some retransmitted data
     * is acked (RFC6298).
     */
    if (seq_rtt_us < 0)
        seq_rtt_us = sack_rtt_us;
    [/sourcecode]

    For acks that ack retransmitted packets, seq_rtt_us is negative.
    But if there is a SACK timing sample from a non-retransmitted packet, that sample is used instead, as it is valid and useful.

    Then it falls back to the timestamps provided by the Timestamps option, but only if seq_rtt_us is still negative.
    [sourcecode language="c"]
    if (seq_rtt_us < 0 && tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
        flag & FLAG_ACKED) {
        u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
        u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ);

        seq_rtt_us = ca_rtt_us = delta_us;
    }
    [/sourcecode]
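
    Putting the two snippets together, the whole fallback chain can be sketched in user space as below. ts_rtt_us() and choose_rtt_us() are hypothetical names, TS_HZ = 1000 (one tick per millisecond) is an assumption of the example, and the FLAG_ACKED check is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000U
#define TS_HZ        1000U   /* assumed timestamp clock: 1 tick = 1 ms */

/*
 * Convert a Timestamps-option echo into an RTT in microseconds.
 * now_ts and tsecr are 32-bit tick counters that may wrap around.
 */
static uint32_t ts_rtt_us(uint32_t now_ts, uint32_t tsecr)
{
    uint32_t delta = now_ts - tsecr;  /* modulo 2^32, survives wraparound */

    assert(USEC_PER_SEC % TS_HZ == 0);  /* whole microseconds per tick */
    return delta * (USEC_PER_SEC / TS_HZ);
}

/* Fallback chain: ack timing first, then sack timing, then timestamps. */
static long choose_rtt_us(long seq_rtt_us, long sack_rtt_us,
                          int saw_tstamp, uint32_t now_ts, uint32_t tsecr)
{
    if (seq_rtt_us < 0)
        seq_rtt_us = sack_rtt_us;

    if (seq_rtt_us < 0 && saw_tstamp && tsecr)
        seq_rtt_us = (long)ts_rtt_us(now_ts, tsecr);

    return seq_rtt_us;  /* still negative: no usable sample */
}
```

    The unsigned subtraction is what keeps the timestamp delta correct across counter wraparound: a current tick of 5 against an echoed 0xFFFFFFFB is a delta of 10 ticks, not a huge negative number.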

    By this point, seq_rtt_us holds a value that can be fed into tcp_rtt_estimator(), which generates the smoothed RTT (more or less based on Van Jacobson’s SIGCOMM ’88 paper).
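
    For reference, the EWMA step itself can be sketched as follows. This follows the textbook RFC 6298 formulation (gains of 1/8 for srtt and 1/4 for the variation), not the kernel's fixed-point implementation in tcp_rtt_estimator(); struct rtt_state and rtt_update() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical estimator state; values are in microseconds. */
struct rtt_state {
    bool has_sample;     /* false until the first valid sample arrives */
    int64_t srtt_us;     /* smoothed RTT */
    int64_t rttvar_us;   /* smoothed RTT variation */
};

/* Feed one RTT sample; negative samples mean "no measurement". */
static void rtt_update(struct rtt_state *s, int64_t sample_us)
{
    assert(s);

    if (sample_us < 0)
        return;

    if (!s->has_sample) {
        /* First measurement: srtt = R, rttvar = R / 2. */
        s->has_sample = true;
        s->srtt_us = sample_us;
        s->rttvar_us = sample_us / 2;
        return;
    }

    /* Error against the old srtt; rttvar is updated before srtt. */
    int64_t err = sample_us - s->srtt_us;
    int64_t abs_err = err < 0 ? -err : err;

    s->rttvar_us += (abs_err - s->rttvar_us) / 4;  /* 3/4 old + 1/4 |err| */
    s->srtt_us += err / 8;                         /* 7/8 old + 1/8 sample */
}
```

    Starting from an empty state, a first sample of 100000 us sets srtt to 100000 and rttvar to 50000. A second sample of 180000 us then moves srtt to 110000 and rttvar to 57500.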


  • Original address: https://www.cnblogs.com/ztguang/p/15646961.html