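Background
In an internal test environment I had installed calico as the Kubernetes network plugin because I wanted to try it out. Recently I noticed the kubelet log repeatedly reporting errors of roughly this form: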
StopPodSandbox $SHA from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod <pod-name> network: CNI failed to retrieve network namespace path: Error: No such container: $SHA
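This is only a test environment and the errors did not affect usage, but I wanted to get to the bottom of it, so I searched Google and found that similar problems had indeed been reported: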
Network namespace path is nil when using hostnetwork. Consequence: The teardown in such a case gives out error messages. Fix: Upstream fix - do not let CNI manage containers with hostnetwork=true.
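Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1507257
In short, the cause is that CNI should not manage containers with hostnetwork=true. Such a pod has no network namespace path of its own, so the CNI teardown fails.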
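The YAML I used to start calico followed the official "hosted" installation method (official link), so fixing this would mean switching to a different installation method, which I did not try. Instead I simply replaced calico with flannel; calico is fairly complicated to work with anyway.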
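To gauge how many pods are affected before changing anything, the kubelet log can be summarized per pod. The following is a minimal sketch, not part of the original procedure: the function name count_teardown_failures is hypothetical, and it assumes each failure appears on a single log line in the format quoted above.

```shell
#!/bin/bash
# Hypothetical helper: read kubelet log lines on stdin and print
# "<count> <pod>" for every pod named in a CNI teardown failure.
count_teardown_failures() {
  # Keep only the pod name that sits between "teardown pod " and " network:"
  sed -n 's/.*failed to teardown pod \([^ ]*\) network:.*/\1/p' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'
}
```

For example, if kubelet runs under systemd (an assumption about the host), `journalctl -u kubelet | count_teardown_failures` prints one line per pod with its number of failures.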
Replacement procedure
1. Uninstall calico
#!/bin/bash
kubectl --namespace=kube-system delete ds calico-node
kubectl --namespace=kube-system delete deploy calico-policy-controller
kubectl --namespace=kube-system delete sa calico-node
kubectl --namespace=kube-system delete sa calico-policy-controller
kubectl --namespace=kube-system delete cm calico-config
kubectl --namespace=kube-system delete secret calico-etcd-secrets
2. Remove the calico-related options below from the kubelet service configuration, then restart kubelet.
--network-plugin=cni --cni-conf-dir=$cni_conf --cni-bin-dir=$cni_bin_dir
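Removing these options by hand is easy to get wrong when the kubelet command line is long. As a rough sketch (strip_cni_flags is a hypothetical name, and it assumes the options are written in the `--flag=value` form shown above), a small filter can print the command line without them so the result can be reviewed before the real file is edited:

```shell
#!/bin/bash
# Hypothetical helper: copy stdin to stdout, dropping the three CNI-related
# kubelet options together with the whitespace in front of them.
# It only prints the result; it does not modify any file.
strip_cni_flags() {
  sed -E 's/[[:space:]]*--(network-plugin|cni-conf-dir|cni-bin-dir)=[^[:space:]]*//g'
}
```

Feed it the file that holds your kubelet options on stdin, compare the output with the original, and only then apply the change and restart kubelet.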
3. Bring down the tunl0 tunnel interface that calico was using: ifconfig tunl0 down
4. Install flannel
Project page: https://github.com/coreos/flannel
# Download v0.7.1
wget https://github.com/coreos/flannel/releases/download/v0.7.1/flannel-v0.7.1-linux-amd64.tar.gz
# Extract the archive
tar xf flannel-v0.7.1-linux-amd64.tar.gz
5. Start flanneld
#!/bin/bash
flanneld --etcd-endpoints https://172.17.58.1:2379 \
  --etcd-cafile=/etc/kubernetes-cluster/ssl/ca.pem \
  --etcd-certfile=/etc/kubernetes-cluster/ssl/kubernetes.pem \
  --etcd-keyfile=/etc/kubernetes-cluster/ssl/kubernetes-key.pem >> /tmp/flanneld.log 2>&1 &
6. Configure the flannel network in etcd
# Create a wrapper script to make etcdctl easier to call
echo 'etcdctl --ca-file=/etc/kubernetes-cluster/ssl/ca.pem --cert-file=/etc/kubernetes-cluster/ssl/kubernetes.pem --key-file=/etc/kubernetes-cluster/ssl/kubernetes-key.pem --endpoints=https://127.0.0.1:2379 "$@"' > /usr/local/bin/etcdctl.sh
# Make the wrapper executable so it can be run by name
chmod +x /usr/local/bin/etcdctl.sh
# Create the etcd directory that holds the subnet configuration
etcdctl.sh mkdir /coreos.com/network/
# Set the network range
etcdctl.sh set /coreos.com/network/config '{"Network":"10.10.0.0/16"}'
7. After the steps above, flannel generates the subnet.env file:
cat /var/run/flannel/subnet.env
FLANNEL_NETWORK=10.10.0.0/16
FLANNEL_SUBNET=10.10.72.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
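A quick sanity check at this point is that the subnet leased to this host really lies inside the network configured in etcd. The sketch below is only an illustration under stated assumptions: ip_to_int and cidr_contains are hypothetical helper names, only IPv4 is handled, and bash arithmetic is assumed.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  set -- $1   # split the address on dots into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if the address part of <addr-cidr> falls inside <network-cidr>.
# usage: cidr_contains <network-cidr> <addr-cidr>
cidr_contains() {
  local net=${1%/*} bits=${1#*/} addr=${2%/*}
  # Build the netmask for the network's prefix length
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

With the values shown above, `cidr_contains 10.10.0.0/16 10.10.72.1/24` succeeds; a non-zero exit status would suggest that flanneld read a different configuration than the one written to etcd.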
8. Set docker's bridge (gateway) address to 10.10.72.1, the FLANNEL_SUBNET value:
cat > /etc/default/docker << EOF
DOCKER_OPTS="-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=10.10.72.1/24"
EOF
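Because the leased subnet differs from host to host, hard-coding the --bip value does not carry over to other nodes. As a hedged sketch, the value can be derived from subnet.env instead. The function name docker_bridge_opts is hypothetical, and the sketch also passes FLANNEL_MTU as docker's --mtu option, which the step above does not do.

```shell
#!/bin/bash
# Hypothetical helper: print the docker bridge options implied by a flannel
# subnet.env file (defaults to the path shown in step 7).
docker_bridge_opts() {
  local env_file="${1:-/var/run/flannel/subnet.env}"
  (
    # Load the FLANNEL_* variables inside a subshell so they do not leak
    # into the caller's environment.
    . "$env_file" || exit 1
    printf '%s\n' "--bip=$FLANNEL_SUBNET --mtu=$FLANNEL_MTU"
  )
}
```

The printed options can then be appended to DOCKER_OPTS in /etc/default/docker in place of the fixed --bip=10.10.72.1/24.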
9. Restart docker, and you're done.