  OpenStack SDN

    Since I have all my OpenStack environment running inside UNetLab, it makes it really easy for me to extend my L3 fabric with a switch from another vendor. In my previous posts I’ve used Cisco and Arista switches to build a 4-leaf 2-spine CLOS fabric. For this task I’ve decided to use a Cumulus VX switch which I’ve downloaded and imported into my lab.

    To simulate the baremetal server (10.0.0.100) I’ve VRF’d an interface on Arista “L4” switch and connected it directly to a “swp3” interface of the Cumulus VX. This is not shown on the diagram.

    Solution overview

    L2 Gateway is a relatively new service plugin for OpenStack Neutron. It provides the ability to interconnect a given tenant network with a VLAN on a physical switch. There are three main components that compose this solution:

    • Hardware switch implementing the OVSDB hardware vtep schema. This is a special “flavour” of OVSDB designed specifically to enable connectivity between logical (VXLAN VTEP) and physical (switchport) interfaces.
    • L2GW agent running on a network node. This is the process responsible for connecting to the OVSDB server running on a hardware switch and updating that database based on instructions received from the L2GW service plugin.
    • L2GW Service Plugin residing on a control node. The task of this plugin is to notify the L2GW agent and normal L2 OVS agents running on compute hosts about network events and distribute VTEP IP address information between them.

    Note that in our case both network and control nodes are running on the same VM.

    Cumulus VX configuration

    Cumulus Linux is a Debian-based distribution, so most of the basic networking configuration is similar to how things are done in Ubuntu. Let's start by configuring basic IP addressing on the loopback (VTEP IP), eth0 (OOB management), swp1 and swp2 (fabric) interfaces in /etc/network/interfaces.

    iface lo inet loopback
            address 10.0.0.5/32
    
    auto eth0
    iface eth0 inet static
            address 192.168.91.21/24
    
    auto swp1
    iface swp1 inet static
            address 169.254.51.5/24
    
    auto swp2
    iface swp2 inet static
            address 169.254.52.5/24
    
    auto swp3
    iface swp3
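
    To apply these changes without a reboot we can reload the interface configuration. A minimal sketch, assuming a Cumulus Linux release that ships ifupdown2 (older images can restart the networking service instead):

    sudo ifreload -a
    # on older ifupdown-based images:
    # sudo service networking restart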
    

    Next, let’s enable OSPF

    sudo sed -i s/zebra=no/zebra=yes/ /etc/quagga/daemons
    sudo sed -i s/ospfd=no/ospfd=yes/ /etc/quagga/daemons
    sudo service quagga restart
    

    Once ospfd is running, we can use sudo vtysh to connect to the local Quagga shell and finalise the configuration.

    interface lo
     ip ospf area 0.0.0.0
     link-detect
    !
    interface swp1
     ip ospf area 0.0.0.0
     ip ospf network point-to-point
     link-detect
    !
    interface swp2
     ip ospf area 0.0.0.0
     ip ospf network point-to-point
     link-detect
    !
    router ospf
     ospf router-id 10.0.0.5
     passive-interface default
     no passive-interface swp1
     no passive-interface swp2
    

    At this stage our Cumulus VX switch should be fully adjacent to both spines and its loopback IP (10.0.0.5) should be reachable from all OpenStack nodes.
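
    Adjacency and route propagation can be quickly verified by querying Quagga non-interactively (neighbour IDs and prefixes will of course differ in your fabric):

    sudo vtysh -c "show ip ospf neighbor"
    sudo vtysh -c "show ip route ospf"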

    The final step is to enable the hardware VTEP functionality. The process is fairly simple and involves only a few commands.

    $ sudo sed -i s/START=no/START=yes/g /etc/default/openvswitch-vtep
    $ sudo service openvswitch-vtep start
    $ sudo vtep-bootstrap L5 10.0.0.5 192.168.91.21 --no_encryption
    

    The last command runs a bootstrap script that does the following things:

    • Creates a hardware VTEP OVSDB schema
    • Inside that schema creates a new physical switch called “L5”
    • Sets the VTEP IP to 10.0.0.5
    • Starts listening to incoming OVSDB connections on 192.168.91.21
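
    To confirm that the bootstrap succeeded, we can query the newly created database with the vtep-ctl utility; the physical switch "L5" created above should show up in the output:

    sudo vtep-ctl show
    sudo vtep-ctl list-ps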

    Hardware VTEP vs OpenvSwitch OVSDB schemas (Optional)

    By now you're probably wondering what the hardware VTEP OVSDB schema is and how it differs from the normal OVS schema. First of all, remember that OVSDB is just a database and the OVSDB protocol is just a set of JSON-RPC calls to work with that database. The information that can be stored in the database is defined by a schema - a structure that describes the tables and their relations. Therefore, OVSDB can be used to store and manage ANY type of data, which makes it very flexible. Specifically, the OVS project defines two OVSDB schemas:

    • Open_vSwitch schema - used to manage bridges, ports and controllers of OpenvSwitch. This schema is used by OVS inside every compute host we have in our OpenStack environment.
    • Hardware_vtep schema - designed to be used by physical switches. The goal of this schema is to extend the virtual L2 switch into the physical realm by providing the ability to map physical ports to logical networks. For each logical network the hardware VTEP database holds mappings of MAC addresses to VTEPs and physical switchports.

    The information from these databases is later consumed by another process that sets up the actual bridges and ports. The first schema is used by the ovs-vswitchd process running on every compute host to configure the ports and flows of the integration and tunnel bridges. In the case of a Cumulus switch, the information from the hardware_vtep OVSDB is consumed by a process called ovs-vtepd, which is responsible for setting up VXLAN VTEP interfaces, provisioning VLANs on physical switchports and interconnecting them with a Linux bridge.
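
    If you're curious, both databases can be inspected with the standard tooling. A quick sketch, assuming the OVSDB server on the switch listens on the management IP and port 6632 (the same endpoint the L2GW agent connects to later):

    # On a compute host - Open_vSwitch schema
    sudo ovs-vsctl show
    # From any host that can reach the switch - hardware_vtep schema
    ovsdb-client list-dbs tcp:192.168.91.21:6632
    ovsdb-client dump --pretty tcp:192.168.91.21:6632 hardware_vtep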

    If you want to learn more, check out this awesome post about hardware VTEP and OVS.

    OpenStack Control node configuration

    Most of the following procedure has been borrowed from another blog. It's included in this post because I had to make some modifications and also for the sake of completeness.

    1. Clone the L2GW repository

      git clone -b stable/mitaka https://github.com/openstack/networking-l2gw.git
      
    2. Use pip to install the plugin

      pip install ./networking-l2gw/
      
    3. Enable the L2GW service plugin

      sudo sed -ri 's/^(service_plugins.*)/\1,networking_l2gw.services.l2gateway.plugin.L2GatewayPlugin/' \
          /etc/neutron/neutron.conf
      
    4. Copy L2GW configuration files into the neutron configuration directory

      cp  /usr/etc/neutron/l2g* /etc/neutron/
      
    5. Point the L2GW plugin to our Cumulus VX switch

      sudo sed -ri "s/^#\s+(ovsdb_hosts).*/\1 = 'ovsdb1:192.168.91.21:6632'/" /etc/neutron/l2gateway_agent.ini
      
    6. Update Neutron database with the new schema required by L2GW plugin

      systemctl stop neutron-server
      neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l2gw_plugin.ini  upgrade head
      systemctl start neutron-server
      
    7. Update Neutron startup script to load the L2GW plugin configuration file

      sed -ri "s|(ExecStart=.*)|\1 --config-file /etc/neutron/l2gw_plugin.ini|" /usr/lib/systemd/system/neutron-server.service
      
    8. Create a L2GW systemd unit file

      cat >> /usr/lib/systemd/system/neutron-l2gateway-agent.service << EOF
      [Unit]
      Description=OpenStack Neutron L2 Gateway Agent
      After=neutron-server.service
          
      [Service]
      Type=simple
      User=neutron
      ExecStart=/usr/bin/neutron-l2gateway-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/l2gateway_agent.ini
      KillMode=process
          
      [Install]
      WantedBy=multi-user.target
      EOF
      
    9. Reload systemd, restart the Neutron server and start the L2GW agent

      systemctl daemon-reload
      systemctl restart neutron-server.service
      systemctl start neutron-l2gateway-agent.service  
      
    10. Enter the “neutron configuration mode”

      source ~/keystone_admin
      neutron
      
    11. Create a new L2 gateway device

      l2-gateway-create --device name="L5",interface_names="swp3" CUMULUS-L2GW
      
    12. Create a connection between the "private_network" and the native VLAN (dot1q 0) of the swp3 interface

      l2-gateway-connection-create --default-segmentation-id 0 CUMULUS-L2GW private_network
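
    If everything went well, the new objects should be visible from the Neutron CLI; the list commands below should be available from the same networking-l2gw client extension that provided the create commands:

    neutron l2-gateway-list
    neutron l2-gateway-connection-list

    On the Cumulus side the connection should also appear in the hardware_vtep database as a new logical switch (its name should match the tenant network UUID), which can be listed with sudo vtep-ctl list-ls.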
      

    Verification and Traffic Flows

    At this stage everything should be ready for testing. We’ll start by examining the following traffic flow:

    • From VM-2 10.0.0.4 / fa:16:3e:d7:0e:14
    • To baremetal server 10.0.0.100 / 50:00:00:6b:2e:70

    The communication starts with VM-2 sending an ARP request for the MAC address of the baremetal server. The packet flow inside the compute host will be exactly the same as before, with the packet being flooded from the VM to the integration and tunnel bridges. Inside the tunnel bridge the packet gets resubmitted to table 22, where head-end replication of the ARP request takes place.

    The only exception is that this time the frame will get replicated to a new VXLAN port pointing towards the Cumulus VTEP IP. We'll use the ovs-appctl ofproto/trace command to see the full path a packet takes inside OVS, which is similar to the packet-tracer command on a Cisco ASA. To simulate an ARP packet we need to specify the incoming port (in_port), the EtherType (arp), the internal VLAN number of our tenant (dl_vlan) and the ARP request target IP address (arp_tpa). You can find the full list of fields that can be matched in this document.

    $ ovs-appctl ofproto/trace br-tun in_port=1,arp,dl_vlan=1,arp_tpa=10.0.0.100 | grep -E "Rule|actions="
    Rule: table=0 cookie=0xb3c018296c2aa8a3 priority=1,in_port=1
    OpenFlow actions=resubmit(,2)
            Rule: table=2 cookie=0xb3c018296c2aa8a3 priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00
            OpenFlow actions=resubmit(,20)
                    Rule: table=20 cookie=0xb3c018296c2aa8a3 priority=0
                    OpenFlow actions=resubmit(,22)
                            Rule: table=22 cookie=0xb3c018296c2aa8a3 dl_vlan=1
                            OpenFlow actions=strip_vlan,set_tunnel:0x45,output:9,output:4,output:6
    

    The packet leaving port 9 gets encapsulated into a VXLAN header with a destination IP of 10.0.0.5 and forwarded out of the fabric-facing interface eth1.100. When the VXLAN packet reaches the vxln69 interface (10.0.0.5) of the Cumulus switch, the br-vxln69 Linux bridge floods the frame out of the second connected interface - swp3.

    $ brctl show br-vxln69
    bridge name        bridge id          STP enabled     interfaces
    br-vxln69          8000.500000070003  no              swp3
                                                          vxln69
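
    The VNI and the local tunnel address of the VXLAN interface created by ovs-vtepd can be double-checked with iproute2:

    ip -d link show vxln69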
    

    The rest of the story is very simple. When the ARP request reaches the baremetal server, it populates the server's ARP cache. The unicast response travels all the way back to the Cumulus switch, where it gets matched by the static MAC entry (ending in 0e:14) created from the information provided by the L2GW plugin. This entry points to the VTEP IP of compute host 2 (10.0.2.10), which is where the frame gets forwarded next.

    $ bridge fdb show
    50:00:00:09:00:04 dev swp3 vlan 0 master br-vxln69
    50:00:00:07:00:03 dev swp3 vlan 0 master br-vxln69 permanent
    50:00:00:6b:2e:70 dev swp3 vlan 0 master br-vxln69
    26:21:90:a8:8a:cc dev vxln69 vlan 0 master br-vxln69 permanent
    fa:16:3e:57:1c:6c dev vxln69 dst 10.0.3.10 vlan 65535 self permanent
    fa:16:3e:a4:12:e6 dev vxln69 dst 10.0.3.10 vlan 65535 self permanent
    fa:16:3e:d7:0e:14 dev vxln69 dst 10.0.2.10 vlan 65535 self permanent
    fa:16:3e:3c:51:d7 dev vxln69 dst 10.0.1.10 vlan 65535 self permanent
    

    The packet travels through compute host 2, populating the flow entries of all OVS bridges along the way. These entries are then used by subsequent unicast packets travelling from VM-2.

    $ ovs-appctl ofproto/trace br-tun in_port=1,dl_vlan=1,dl_dst=50:00:00:6b:2e:70 | grep -E "Rule|actions="
    Rule: table=0 cookie=0xb5625033061a8ae5 priority=1,in_port=1
    OpenFlow actions=resubmit(,2)
            Rule: table=2 cookie=0xb5625033061a8ae5 priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00
            OpenFlow actions=resubmit(,20)
                    Rule: table=20 cookie=0xb5625033061a8ae5 priority=1,vlan_tci=0x0001/0x0fff,dl_dst=50:00:00:6b:2e:70
                    OpenFlow actions=load:0->NXM_OF_VLAN_TCI[],load:0x45->NXM_NX_TUN_ID[],output:9
    

    It all looks fine until the ARP cache of the baremetal server expires and an ARP request comes from the physical into the virtual world. There is a known issue with BUM forwarding, which requires a special service node to perform the head-end replication. The idea is that a switch that needs to flood a multicast packet sends it to a service node, which keeps track of all active VTEPs in the network and performs packet replication on behalf of the sender. OpenStack doesn't have a dedicated service node, however it is possible to trick the network node into performing similar functionality, which is what I'm going to demonstrate next.

    Programming the Network Node as a BUM replication service node

    First of all, we need to tell our Cumulus switch to send all multicast packets to the network node. To do that we need to modify the OVSDB table called "Mcast_Macs_Remote". You can view the contents of the database with the ovsdb-client dump --pretty tcp:192.168.91.21:6632 command to make sure that this table is empty. Using the vtep-ctl command, we force all unknown-dst (BUM) traffic to go to the network node (10.0.3.10). The UUID of the logical switch can be found with the sudo vtep-ctl list-ls command.

    sudo vtep-ctl add-mcast-remote 818b4779-645c-49bb-ae4a-aa9340604019 unknown-dst 10.0.3.10
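
    The new entry can be verified with vtep-ctl as well, using the same logical switch identifier; the output should now include an unknown-dst record pointing at 10.0.3.10:

    sudo vtep-ctl list-remote-macs 818b4779-645c-49bb-ae4a-aa9340604019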
    

    At this stage all BUM traffic hits the network node and gets flooded to the DHCP and the virtual router namespaces. In order to force this traffic to also be replicated to all compute nodes, we can reuse some of the existing tables of the tunnel bridge. Before we do anything, let's have a look at the tables our ARP request has to traverse inside the tunnel bridge.

    table=0, priority=1,in_port=2 actions=resubmit(,4)
    table=4, priority=1,tun_id=0x45 actions=mod_vlan_vid:1,resubmit(,10)
    table=10,priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0x9f3e746b7ee48bbf,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
    

    We also have a default head-end replication table 22 which floods all BUM traffic received from the integration bridge to all VTEPs:

    table=22, dl_vlan=1 actions=strip_vlan,set_tunnel:0x45,output:2,output:4,output:6
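
    All of the entries above can be listed on the network node with the standard OpenFlow dump command (cookie values and port numbers will differ in your environment):

    sudo ovs-ofctl dump-flows br-tun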
    

    So what we can do is create a new flow entry that intercepts all ARP packets inside table 4 and resubmits them to tables 10 and 22. Table 10 will take our packet up to the integration bridge of the network node, since we still need to be able to talk to the virtual router and the DHCP server. Table 22 will receive a copy of the packet and flood it to all known VXLAN endpoints.

    ovs-ofctl add-flow br-tun "table=4,arp,tun_id=0x45,priority=2,actions=mod_vlan_vid:1,resubmit(,10),resubmit(,22)"
    

    We can once again use the trace command to see the ARP request flow inside the tunnel bridge.

    $ ovs-appctl ofproto/trace br-tun in_port=2,arp,tun_id=0x45 | grep -E "Rule|actions="
    Rule: table=0 cookie=0x9f3e746b7ee48bbf priority=1,in_port=2
    OpenFlow actions=resubmit(,4)
            Rule: table=4 cookie=0 priority=2,arp,tun_id=0x45
            OpenFlow actions=mod_vlan_vid:1,resubmit(,10),resubmit(,22)
                    Rule: table=10 cookie=0x9f3e746b7ee48bbf priority=1
                    OpenFlow actions=learn(table=20,hard_timeout=300,priority=1,cookie=0x9f3e746b7ee48bbf,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
                            Rule: table=0 cookie=0x91b1a9a9b6e8d608 priority=0
                            OpenFlow actions=NORMAL
                                    Rule: table=0 cookie=0xb36f6e358a37bea6 priority=2,in_port=2
                                    OpenFlow actions=drop
                    Rule: table=22 cookie=0x9f3e746b7ee48bbf dl_vlan=1
                    OpenFlow actions=strip_vlan,set_tunnel:0x45,output:2,output:4,output:6
    

    Now we should be able to clear the ARP cache on the baremetal device and successfully ping VM-1, VM-2 and the virtual router.

    Conclusion

    The workaround presented above is just a temporary solution. To fix the problem properly, the OVS vtep schema needs to be updated to support source node replication. Luckily, the patch implementing this functionality was merged into the OVS master branch only a few days ago, so hopefully this update will trickle down to the Cumulus package repositories soon.

    Despite all the issues, the Neutron L2 gateway plugin is a cool project that provides a very important piece of functionality without having to rely on third-party SDN controllers. Let's hope it will continue to be supported and developed by the community.

    Coming up

    In the next post I was planning to examine another "must-have" feature of any SDN solution - Distributed Virtual Routing. However, due to my current circumstances, I may need to take a few weeks' break before carrying on. Be back soon!
