Deploy the coredns add-on (run on the master node)
-
Download and configure coredns
cd /opt/k8s/work
git clone https://github.com/coredns/deployment.git
mv deployment coredns
-
Start coredns
cd /opt/k8s/work/coredns/kubernetes
export CLUSTER_DNS_SVC_IP="10.254.0.2"
export CLUSTER_DNS_DOMAIN="cluster.local"
./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
-
Problems encountered
After coredns is started, the pod status is CrashLoopBackOff
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -n kube-system -l k8s-app=kube-dns
NAME                      READY   STATUS             RESTARTS   AGE
coredns-76b74f549-99bxd   0/1     CrashLoopBackOff   5          4m45s
The logs of the coredns pod show the following error
root@master:/opt/k8s/work/coredns/kubernetes# kubectl -n kube-system logs coredns-76b74f549-99bxd
.:53
[INFO] plugin/reload: Running configuration MD5 = 8b19e11d5b2a72fb8e63383b064116a1
CoreDNS-1.6.6
linux/amd64, go1.13.5, 6a7a75e
[FATAL] plugin/loop: Loop (127.0.0.1:60429 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 6292641803451309721.7599235642583168995."
Following the hint in the log, the page https://coredns.io/plugins/loop#troubleshooting gives the following explanation
When a CoreDNS Pod deployed in Kubernetes detects a loop, the CoreDNS Pod will start to “CrashLoopBackOff”. This is because Kubernetes will try to restart the Pod every time CoreDNS detects the loop and exits.
A common cause of forwarding loops in Kubernetes clusters is an interaction with a local DNS cache on the host node (e.g. systemd-resolved). For example, in certain configurations systemd-resolved will put the loopback address 127.0.0.53 as a nameserver into /etc/resolv.conf. Kubernetes (via kubelet) by default will pass this /etc/resolv.conf file to all Pods using the default dnsPolicy rendering them unable to make DNS lookups (this includes CoreDNS Pods). CoreDNS uses this /etc/resolv.conf as a list of upstreams to forward requests to. Since it contains a loopback address, CoreDNS ends up forwarding requests to itself.
There are many ways to work around this issue, some are listed here:
- Add the following to your kubelet config yaml: resolvConf: <path-to-your-real-resolv-conf-file> (or via command line flag --resolv-conf deprecated in 1.10). Your “real” resolv.conf is the one that contains the actual IPs of your upstream servers, and no local/loopback address. This flag tells kubelet to pass an alternate resolv.conf to Pods. For systems using systemd-resolved, /run/systemd/resolve/resolv.conf is typically the location of the “real” resolv.conf, although this can be different depending on your distribution.
- Disable the local DNS cache on host nodes, and restore /etc/resolv.conf to the original.
- A quick and dirty fix is to edit your Corefile, replacing forward . /etc/resolv.conf with the IP address of your upstream DNS, for example forward . 8.8.8.8. But this only fixes the issue for CoreDNS, kubelet will continue to forward the invalid resolv.conf to all default dnsPolicy Pods, leaving them unable to resolve DNS.
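Before picking a workaround, it can help to confirm that the node's resolv.conf really does list a loopback nameserver. The following is only a minimal sketch: the function name loopback_nameservers is hypothetical (it is not part of CoreDNS or kubelet), and it assumes that only IPv4 127.x.x.x and IPv6 ::1 count as loopback.

```shell
#!/bin/sh
# Hypothetical helper: print every nameserver in a resolv.conf-style file
# whose address is loopback (127.x.x.x or ::1). Prints nothing otherwise.
loopback_nameservers() {
    awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2 }' "$1"
}

# Demo on a sample file that mimics what systemd-resolved typically writes.
sample=$(mktemp)
printf 'nameserver 127.0.0.53\noptions edns0\n' > "$sample"
loopback_nameservers "$sample"
rm -f "$sample"
```

On a real node the same function could be pointed at /etc/resolv.conf and at /run/systemd/resolve/resolv.conf; the file handed to kubelet should produce no output.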
Following the first workaround, set resolvConf in the kubelet configuration file kubelet-config.yaml to /run/systemd/resolve/resolv.conf. The relevant fragment of the configuration is as follows
...
podPidsLimit: -1
resolvConf: /run/systemd/resolve/resolv.conf
maxOpenFiles: 1000000
...
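To double-check which file kubelet has been told to hand to Pods, the resolvConf value can be read back from the config file. This is a rough sketch only: kubelet_resolv_conf is a hypothetical helper, it uses a simple line match rather than a YAML parser, and it assumes the key is top-level, unquoted, and has no trailing comment.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the top-level resolvConf key
# in a kubelet config YAML file (plain line match, not a YAML parser).
kubelet_resolv_conf() {
    awk -F': *' '$1 == "resolvConf" { print $2 }' "$1"
}

# Demo on a sample fragment shaped like the one above.
cfg=$(mktemp)
printf 'podPidsLimit: -1\nresolvConf: /run/systemd/resolve/resolv.conf\nmaxOpenFiles: 1000000\n' > "$cfg"
kubelet_resolv_conf "$cfg"
rm -f "$cfg"
```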
Restart the kubelet service
systemctl daemon-reload
systemctl restart kubelet
Then redeploy coredns
root@master:/opt/k8s/work/coredns/kubernetes# ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -A
NAMESPACE     NAME                      READY   STATUS    RESTARTS   AGE
kube-system   coredns-76b74f549-j5t9c   1/1     Running   0          12s
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get all -n kube-system -l k8s-app=kube-dns
NAME                          READY   STATUS    RESTARTS   AGE
pod/coredns-76b74f549-j5t9c   1/1     Running   0          2m8s

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns   ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP,9153/TCP   2m8s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   1/1     1            1           2m8s

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-76b74f549   1         1         1       2m8s
-
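Start a busybox pod, together with the nginx service that was used in the previous chapter to verify cluster functionality, then access the nginx service by its service name from inside busybox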
cd /opt/k8s/yml
cat > busybox.yml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "3600"
EOF
kubectl create -f busybox.yml
kubectl create -f nginx.yml
-
Enter the busybox pod and access nginx
root@master:/opt/k8s/yml# kubectl exec -it busybox sh
/ # cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # nslookup www.baidu.com
Server:    10.254.0.2
Address:   10.254.0.2:53

Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com
Name:      www.a.shifen.com
Address:   183.232.231.174
Name:      www.a.shifen.com
Address:   183.232.231.172

/ # nslookup kubernetes
Server:    10.254.0.2
Address:   10.254.0.2:53

Name:      kubernetes.default.svc.cluster.local
Address:   10.254.0.1

/ # nslookup nginx
Server:    10.254.0.2
Address:   10.254.0.2:53

Name:      nginx.default.svc.cluster.local
Address:   10.254.19.32

/ # ping -c 1 nginx
PING nginx (10.254.19.32): 56 data bytes
64 bytes from 10.254.19.32: seq=0 ttl=64 time=0.155 ms

--- nginx ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.155/0.155/0.155 ms
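The short name nginx resolves because the pod's resolv.conf carries a search list and ndots:5. The sketch below illustrates how those two settings expand a short name into fully qualified candidates. It is an assumption-laden illustration: candidate_names is a hypothetical helper, and it follows the glibc-style ordering (search list first when the name has fewer dots than ndots); the resolver actually used inside a given image may order or skip candidates differently.

```shell
#!/bin/sh
# Hypothetical helper: list, in order, the names a glibc-style resolver
# would try for NAME, given NDOTS and a space-separated SEARCH list.
# Note: plain sh has no local variables, so these names are global.
candidate_names() {
    name=$1
    ndots=$2
    search=$3
    case $name in
        *.) printf '%s\n' "$name"; return ;;  # trailing dot: already absolute
    esac
    # Count the dots in the name; $(( )) strips any padding added by wc.
    dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
    # Enough dots: the name is tried as-is before the search list.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
    for domain in $search; do
        printf '%s.%s.\n' "$name" "$domain"
    done
    # Too few dots: the name is tried as-is only after the search list.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

# Values taken from the /etc/resolv.conf shown in the busybox pod.
candidate_names nginx 5 "default.svc.cluster.local svc.cluster.local cluster.local"
```

The first candidate printed is nginx.default.svc.cluster.local., which matches the name returned by nslookup nginx in the session above.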