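  • Troubleshooting "Detected Tx Unit Hang"

    Goal:

    Make skb->data point at memory that I have already allocated myself, instead of memory from the kernel's usual allocator (the post calls it alloc_malloc(); alloc_skb() is presumably meant).

    Part of the code: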

     1             /*
     2              * build a new sk_buff
     3              */
     4             //struct sk_buff *send_skb = kmem_cache_alloc_node(skbuff_head_cache, GFP_ATOMIC & ~__GFP_DMA, NUMA_NO_NODE);
     5             struct sk_buff *send_skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC & ~__GFP_DMA);
     6 
     7             if (!send_skb) {
     8                 //spin_unlock(&lock);
     9                 return NF_DROP;
    10             }
    11             
    12             //printk("what2\n");
    13             memset(send_skb, 0, offsetof(struct sk_buff, tail));
    14             atomic_set(&send_skb->users, 2);
    15             send_skb->cloned = 0;
    16             
    17             send_skb->head = mmap_buf + 1024;
    18             send_skb->data = mmap_buf + 1024;
    19             

    On line 18, mmap_buf is memory that was allocated in advance.
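    To make the pointer setup on lines 17 and 18 easier to follow, here is a minimal userspace sketch of the buffer pointers involved. It is not kernel code: struct toy_skb and the toy_skb_* helpers are hypothetical stand-ins, and the real struct sk_buff may store tail and end as offsets rather than pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the buffer pointers of struct sk_buff. */
struct toy_skb {
    unsigned char *head;  /* start of the usable buffer */
    unsigned char *data;  /* start of the packet data */
    unsigned char *tail;  /* end of the packet data */
    unsigned char *end;   /* end of the usable buffer */
};

/*
 * Attach caller-owned memory to the toy skb, mirroring the snippet above:
 * head and data both start `offset` bytes into `buf`, and the packet
 * occupies the next `len` bytes.
 */
void toy_skb_attach(struct toy_skb *skb, unsigned char *buf,
                    size_t buf_size, size_t offset, size_t len)
{
    assert(offset <= buf_size && len <= buf_size - offset);
    skb->head = buf + offset;
    skb->data = buf + offset;
    skb->tail = skb->data + len;
    skb->end  = buf + buf_size;
}

/* Bytes available in front of the packet data. */
size_t toy_skb_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/* Bytes of packet data in the linear buffer. */
size_t toy_skb_len(const struct toy_skb *skb)
{
    return (size_t)(skb->tail - skb->data);
}

/* Bytes still free behind the packet data. */
size_t toy_skb_tailroom(const struct toy_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}
```

    Because head and data are set to the same address, as in the snippet, the headroom in this model is zero.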

    The NIC driver then writes error messages like the following to /var/log/messages:

    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <13>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <0>, <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <0>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <15>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <1>, <1eb>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1eb>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <1>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <14>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <0>, <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <0>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <4>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <0>, <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <0>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <12>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <5>, <1ef>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ef>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <5>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <2>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <2>, <1ec>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ec>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <2>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
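
    A note on reading this output: TDH and TDT are the hardware Transmit Descriptor Head and Tail registers, and next_to_clean / next_to_use are the driver's own ring indices. The head has barely moved while the tail is far ahead, so queued descriptors are not being completed. The small C helper below is a userspace sketch, not driver code, that computes how many descriptors are outstanding on a circular ring. The name tx_pending and the ring size of 512 used in the example are assumptions, not values taken from the log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of descriptors that software has queued (up to `tail`) but the
 * hardware has not yet consumed (from `head`), on a circular ring with
 * `ring_size` entries. Indices wrap around at ring_size.
 */
uint32_t tx_pending(uint32_t head, uint32_t tail, uint32_t ring_size)
{
    assert(ring_size > 0 && head < ring_size && tail < ring_size);
    return (tail + ring_size - head) % ring_size;
}
```

    With the logged values, tx_pending(0x0, 0x1ea, 512) and tx_pending(0x5, 0x1ef, 512) both give 490, and the same holds for the other queues shown: each hung queue has 490 descriptors outstanding.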

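    After ruling out other causes, I traced the problem to how mmap_buf was allocated. Allocating it with vmalloc() does not work here; after switching to kmalloc(), transmission is normal.

    Section 12.5 of Linux Kernel Development (《Linux内核设计与实现》) explains why. The likely reason is that the NIC needs the buffer to be physically contiguous, whereas vmalloc() only guarantees that the virtual addresses are contiguous.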
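
    The difference can be illustrated with a toy userspace model (again, not kernel code). Each virtual page of a buffer is backed by some physical frame. A device that is handed a single start address and a length walks consecutive physical frames, so it only sees the whole buffer when the backing frames are consecutive. The frame numbers and the function name is_physically_contiguous are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * frames[i] is the (hypothetical) physical frame number that backs the
 * i-th virtual page of a buffer. Returns true when every page is backed
 * by the frame directly after the previous one, i.e. the buffer is
 * contiguous in physical memory and not only in virtual memory.
 */
bool is_physically_contiguous(const unsigned long *frames, size_t npages)
{
    assert(frames != NULL || npages == 0);
    for (size_t i = 1; i < npages; i++) {
        if (frames[i] != frames[i - 1] + 1)
            return false;
    }
    return true;
}
```

    A kmalloc()-style buffer corresponds to frames such as {100, 101, 102, 103}, for which the check succeeds. A vmalloc()-style buffer can map to scattered frames such as {100, 57, 913, 4}, for which it fails, even though both buffers look contiguous through their virtual addresses.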

  • Original article: https://www.cnblogs.com/lxgeek/p/4042683.html