zoukankan      html  css  js  c++  java
  • 记一次网络故障——pod间无法通信

    一、背景

    1. 集群是二进制部署
    2. 部署完成后一起正常,各种资源对象均可正常创建、
    3. 部署应用后发现无法跨节点通信,且pod的ip都是172.17.0.0段的

    二、排查过程层

    1. 查看节点路由,发现docker0网卡居然是172.17.0.0段(what?)
    2. 查找如下资料:基于docker的CNM部署flanel时,需要将/run/flannel/subnet.env作为docker的环境变量,且启动时指定flannel的网段信息

    三、解决方案(修改配置文件:/usr/lib/systemd/system/docker.service)

    [Unit]
    Description=Docker Application Container Engine
    Documentation=https://docs.docker.com
    BindsTo=containerd.service
    After=network-online.target firewalld.service containerd.service
    Wants=network-online.target
    Requires=docker.socket
    
    [Service]
    Type=notify
    # the default is not to use systemd for cgroups because the delegate issues still
    # exists and systemd currently does not support the cgroup feature set required
    # for containers run by docker
    EnvironmentFile=/run/flannel/subnet.env
    ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS  -H fd:// --containerd=/run/containerd/containerd.sock
    ExecReload=/bin/kill -s HUP $MAINPID
    TimeoutSec=0
    RestartSec=2
    Restart=always
    
    # Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
    # Both the old, and new location are accepted by systemd 229 and up, so using the old location
    # to make them work for either version of systemd.
    StartLimitBurst=3
    
    # Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
    # Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
    # this option work for either version of systemd.
    StartLimitInterval=60s
    
    # Having non-zero Limit*s causes performance problems due to accounting overhead
    # in the kernel. We recommend using cgroups to do container-local accounting.
    LimitNOFILE=infinity
    LimitNPROC=infinity
    LimitCORE=infinity
    
    # Comment TasksMax if your systemd version does not supports it.
    # Only systemd 226 and above support this option.
    TasksMax=infinity
    
    # set delegate yes so that systemd does not reset the cgroups of docker containers
    Delegate=yes
    
    # kill only the docker process, not all processes in the cgroup
    KillMode=process
    
    [Install]
    WantedBy=multi-user.target

    调用/run/flannel/subnet.env中的DOCKER_NETWORK_OPTIONS指定pod的网段信息

    四、补充

    1. CNI中,docker0的ip与Pod无关,Pod总是生成的时候才去动态的申请自己的IP
    2. CNM模式下,Pod的网段在docker engine启动时就已经决定
    3. 推荐使用CNI模式

    参考地址:https://jiayi.space/post/kubernetescong-ru-men-dao-fang-qi-3-wang-luo-yuan-li

  • 相关阅读:
    Day 20 初识面向对象
    Day 16 常用模块
    Day 15 正则表达式 re模块
    D14 模块 导入模块 开发目录规范
    Day 13 迭代器,生成器,内置函数
    Day 12 递归,二分算法,推导式,匿名函数
    Day 11 闭包函数.装饰器
    D10 函数(二) 嵌套,命名空间作用域
    D09 函数(一) 返回值,参数
    Day 07 Day08 字符编码与文件处理
  • 原文地址:https://www.cnblogs.com/jayce9102/p/12075362.html
Copyright © 2011-2022 走看看