After stepping through the OVS code with gdb, I found that the drops were showing up in DPDK's imissed counter, which kept increasing.
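For reference, the same counters can also be watched without gdb: the DPDK drop counters are exposed in the Interface statistics map (here using the dpdk-p0 port configured later in these notes; the exact key name, e.g. rx_dropped or rx_missed_errors, depends on the OVS version):
ovs-vsctl get Interface dpdk-p0 statistics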
Checking the ovs-vswitchd log, the following lines appeared around a restart:
2018-05-21T11:57:03.427Z|00033|timeval|WARN|Unreasonably long 22418ms poll interval (474ms user, 21612ms system)
2018-05-21T11:57:03.427Z|00034|timeval|WARN|faults: 141393 minor, 0 major
2018-05-21T11:57:03.427Z|00035|timeval|WARN|disk: 0 reads, 16 writes
2018-05-21T11:57:03.427Z|00036|timeval|WARN|context switches: 14 voluntary, 120 involuntary
Enable debug logging:
[root@vrouter1 ~]# ovs-appctl vlog/set file:dbg
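To see the current levels, narrow the debug output to a single module, or restore the file level afterwards (module and level names are the standard vlog ones):
ovs-appctl vlog/list
ovs-appctl vlog/set dpif_netdev:file:dbg   # debug one module only
ovs-appctl vlog/set file:info              # back to the usual default when done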
Reinstall a newer DPDK:
[root@vrouter1 ovs-dpdk]# ls
dpdk-17.11.2.tar.xz  dpdk-stable-17.11.2  openvswitch-2.9.1  openvswitch-2.9.1.tar.gz
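The build and configure steps below use $RTE_SDK and $RTE_TARGET; a minimal sketch of the assumed environment (the SDK path is a guess based on the prompts in this log):
export RTE_SDK=/root/ovs-dpdk/dpdk-stable-17.11.2   # assumed location of the unpacked DPDK tree
export RTE_TARGET=x86_64-native-linuxapp-gcc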
1. Build DPDK
[root@vrouter1 dpdk-stable-17.11.2]# make config T=$RTE_TARGET O=$RTE_TARGET
Configuration done using x86_64-native-linuxapp-gcc
[root@vrouter1 dpdk-stable-17.11.2]# cd x86_64-native-linuxapp-gcc/
[root@vrouter1 x86_64-native-linuxapp-gcc]# make
2. Build OVS
[root@vrouter1 openvswitch-2.9.1]# ./boot.sh
[root@vrouter1 openvswitch-2.9.1]# ./configure --with-dpdk=$RTE_SDK/$RTE_TARGET
[root@vrouter1 openvswitch-2.9.1]# make
[root@vrouter1 openvswitch-2.9.1]# make install
3. Run
[root@vrouter1 ovs-dpdk]# cat ovs.sh
export PATH=$PATH:/usr/local/share/openvswitch/scripts
export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
ovs-ctl --no-ovs-vswitchd start
#ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
#ovs-ctl --no-ovsdb-server --db-sock="$DB_SOCK" start
ovs-ctl --no-ovsdb-server start
[root@vrouter1 ovs-dpdk]#
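ovs.sh leaves dpdk-init commented out, so it is assumed to have been set in the database earlier. A hedged sketch of that one-time DPDK initialization (hugepage count and memory size are assumptions, not values taken from this log):
echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages   # reserve 2 MB hugepages
mount -t hugetlbfs none /dev/hugepages
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024"   # MB on NUMA node 0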
4. Configure
[root@vrouter1 Datapath]# dpdk-devbind -b vfio-pci 0000:01:00.0 [root@vrouter1 ovs-dpdk]# ovs-vsctl add-br br-phy -- set bridge br-phy datapath_type=netdev [root@vrouter1 ovs-dpdk]# ovs-vsctl add-port br-phy dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:01:00.0
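To confirm the NIC really is bound to vfio-pci before (or after) adding the dpdk port, the binding can be listed with:
dpdk-devbind --status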
[root@vrouter1 ovs-dpdk]# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
[root@vrouter1 ovs-dpdk]# ovs-vsctl add-port br0 vxlan0 -- set Interface vxlan0 type=vxlan options:remote_ip=10.0.0.163 options:local_ip=10.0.0.161 options:in_key=flow options:out_key=flow
#>ovs-appctl ovs/route/add 10.0.0.163/24 br-phy
[root@vrouter1 ~]# ip a add 10.0.0.161/24 dev br-phy
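Either variant works because the userspace datapath needs a route to the remote VTEP; assigning 10.0.0.161/24 on br-phy gives OVS a connected route it can pick up from the kernel. The tunnel route and the VXLAN receive port can be verified with:
ovs-appctl ovs/route/show
ovs-appctl tnl/ports/show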
5. Pin DPDK cores
[root@vrouter1 ~]# ovs-vsctl set Interface dpdk-p0 options:n_rxq=4
[root@vrouter1 ~]# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x154
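pmd-cpu-mask is a hex bitmask of logical CPUs:
# 0x154  = 0b1 0101 0100        -> PMDs on cores 2, 4, 6, 8
# 0x1554 = 0b1 0101 0101 0100   -> cores 2, 4, 6, 8, 10, 12
The pmd-rxq-show output in step 10 lists six PMD cores, so the mask was presumably widened later to something like 0x1554; that value is an inference, not recorded in this log.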
6. No more packet drops.
7. vhost user client
7.1 Enable vhost IOMMU support
[root@vrouter1 ~]# ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
8. Add vhostuserclient ports
[root@vrouter1 ~]# ovs-vsctl add-port br0 vhost0 -- set Interface vhost0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/nlb_vm0.sock
[root@vrouter1 ~]# ovs-vsctl add-port br0 vhost1 -- set Interface vhost1 type=dpdkvhostuserclient options:vhost-server-path=/tmp/nlb_vm1.sock
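With dpdkvhostuserclient, OVS is the vhost-user client, so the VM side must create the socket in server mode and back guest memory with shared hugepages. A minimal sketch of the matching QEMU arguments for the first VM (memory size, IDs, and the assumption that this VM owns MAC 00:00:00:11:22:41 from the flows below are mine; other VM options omitted):
qemu-system-x86_64 ... \
  -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -chardev socket,id=char0,path=/tmp/nlb_vm0.sock,server \
  -netdev type=vhost-user,id=net0,chardev=char0,vhostforce \
  -device virtio-net-pci,mac=00:00:00:11:22:41,netdev=net0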
9. Add flows
[root@vrouter1 ~]# ovs-appctl dpif/show
netdev@ovs-netdev: hit:31940814928 missed:234
  br-phy:
    br-phy 65534/1: (tap)
    dpdk-p0 1/2: (dpdk: configured_rx_queues=4, configured_rxq_descriptors=2048, configured_tx_queues=7, configured_txq_descriptors=2048, lsc_interrupt_mode=false, mtu=1500, requested_rx_queues=4, requested_rxq_descriptors=2048, requested_tx_queues=7, requested_txq_descriptors=2048, rx_csum_offload=true)
  br0:
    br0 65534/3: (tap)
    vhost0 2/5: (dpdkvhostuserclient: configured_rx_queues=1, configured_tx_queues=1, mtu=1500, requested_rx_queues=1, requested_tx_queues=1)
    vhost1 3/6: (dpdkvhostuserclient: configured_rx_queues=1, configured_tx_queues=1, mtu=1500, requested_rx_queues=1, requested_tx_queues=1)
    vxlan0 1/4: (vxlan: key=flow, local_ip=10.0.0.161, remote_ip=10.0.0.163)
[root@vrouter1 ~]# ovs-ofctl add-flow br0 "cookie=0x1111,table=0, priority=100, tun_id=200,dl_dst=00:00:00:11:22:41,nw_dst=192.168.77.161,actions=move:NXM_NX_TUN_ID[0..23]->NXM_NX_REG0[0..23],resubmit(,1)"
[root@vrouter1 ~]# ovs-ofctl add-flow br0 "cookie=0x1111,table=0, priority=100, tun_id=200,dl_dst=00:00:00:11:22:41,nw_dst=192.168.77.161,actions=move:NXM_NX_TUN_ID[0..23]->NXM_NX_REG0[0..23],resubmit(,1)"
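The two add-flow commands above are identical, so the second simply replaces the first; the installed rules can be checked with:
ovs-ofctl dump-flows br0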
10. Check the queue-to-core mapping
[root@vrouter1 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 2:
  isolated : false
  port: dpdk-p0  queue-id: 0  pmd usage: 77 %
  port: vhost1   queue-id: 2  pmd usage: 0 %
  port: vhost1   queue-id: 3  pmd usage: 0 %
pmd thread numa_id 0 core_id 4:
  isolated : false
  port: dpdk-p0  queue-id: 3  pmd usage: 77 %
  port: vhost1   queue-id: 0  pmd usage: 0 %
pmd thread numa_id 0 core_id 6:
  isolated : false
  port: dpdk-p0  queue-id: 1  pmd usage: 78 %
  port: vhost0   queue-id: 4  pmd usage: 0 %
pmd thread numa_id 0 core_id 8:
  isolated : false
  port: vhost0   queue-id: 0  pmd usage: 0 %
  port: vhost0   queue-id: 3  pmd usage: 0 %
pmd thread numa_id 0 core_id 10:
  isolated : false
  port: dpdk-p0  queue-id: 2  pmd usage: 77 %
  port: vhost1   queue-id: 1  pmd usage: 0 %
  port: vhost1   queue-id: 4  pmd usage: 0 %
pmd thread numa_id 0 core_id 12:
  isolated : false
  port: vhost0   queue-id: 1  pmd usage: 0 %
  port: vhost0   queue-id: 2  pmd usage: 0 %
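Per-PMD cycle statistics give another view of how busy each polling core is (clear first, then sample again after some traffic):
ovs-appctl dpif-netdev/pmd-stats-clear
ovs-appctl dpif-netdev/pmd-stats-show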
Summary:
The drops were happening mainly in the kernel: watching the CPU usage of the pinned cores in top, roughly 80% of the time was spent in sys and only 20% in user.
In the normal case packets stay entirely on the DPDK userspace path, so the PMD cores should be close to 100% user.
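A rough per-thread check (assuming the sysstat package is installed): the DPDK polling threads show up with names starting with pmd, and their %usr/%system split should be almost entirely %usr when forwarding stays in userspace:
pidstat -u -t -p $(pidof ovs-vswitchd) 1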
Once you understand how the routes, the flow tables and VXLAN interact, you can go through them one by one and make sure packets are never diverted into the kernel, which eliminates the drops.
In short, the root cause was a flow-table/routing configuration problem that caused packets to be forwarded into the kernel.