  • Kubernetes binary deployment: controller-manager reports Unhealthy on a multi-master k8s cluster

    1. Symptom

    When we deploy a highly available k8s cluster from binaries with multiple masters, kubectl get cs reports the kube-controller-manager service as Unhealthy:

    [root@ceph-01 system]# kubectl get cs
    NAME                 STATUS      MESSAGE                                                                                                                                  ERROR
    scheduler            Healthy     ok                                                                                                                                       
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"   
    etcd-1               Healthy     {"health":"true"}                                                                                                                        
    etcd-0               Healthy     {"health":"true"}                                                                                                                        
    etcd-2               Healthy     {"health":"true"}                          
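
    The bytes quoted in the malformed HTTP response message (0x15 0x03 0x01 0x00 0x02 0x02) are not HTTP at all. Read as a TLS record header they form an alert, which suggests that the port answered a plain-HTTP request with TLS. The following is a minimal sketch of that decoding; the function name describe_tls_record and the lookup tables are our own illustration, not part of any Kubernetes tooling.

```python
# Decode the bytes from the "malformed HTTP response" error as a TLS record.
# Layout: 1 byte content type, 2 bytes version, 2 bytes length, then payload.
CONTENT_TYPES = {20: "change_cipher_spec", 21: "alert",
                 22: "handshake", 23: "application_data"}
ALERT_LEVELS = {1: "warning", 2: "fatal"}


def describe_tls_record(data: bytes) -> dict:
    """Parse the 5-byte TLS record header and, for alerts, the alert level."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes for a TLS record header")
    info = {
        "content_type": CONTENT_TYPES.get(data[0], "unknown"),
        "version": (data[1], data[2]),            # (3, 1) is the TLS 1.0 record version
        "length": int.from_bytes(data[3:5], "big"),
    }
    # An alert payload is two bytes: level, then description.
    # The error message only quotes the first of them.
    if info["content_type"] == "alert" and len(data) > 5:
        info["alert_level"] = ALERT_LEVELS.get(data[5], "unknown")
    return info


reply = b"\x15\x03\x01\x00\x02\x02"
print(describe_tls_record(reply))
# {'content_type': 'alert', 'version': (3, 1), 'length': 2, 'alert_level': 'fatal'}
```

    A fatal TLS alert in reply to an http:// request is a strong hint that the endpoint expects HTTPS.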
    

    The kube-controller-manager service itself is running, but its log shows repeated TLS handshake errors:

    [root@ceph-01 system]# systemctl status kube-controller-manager -l
    ● kube-controller-manager.service - Kubernetes Controller Manager
       Loaded: loaded (/etc/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
       Active: active (running) since Sat 2018-12-29 03:56:00 EST; 31min ago
         Docs: https://github.com/GoogleCloudPlatform/kubernetes
     Main PID: 126295 (kube-controller)
        Tasks: 8
       Memory: 8.4M
       CGroup: /system.slice/kube-controller-manager.service
               └─126295 /usr/local/bin/kube-controller-manager --port=0 --secure-port=10252 --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig --authentication-kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig --service-cluster-ip-range=10.254.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem --experimental-cluster-signing-duration=8760h --root-ca-file=/etc/kubernetes/cert/ca.pem --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem --leader-elect=true --feature-gates=RotateKubeletServerCertificate=true --controllers=*,bootstrapsigner,tokencleaner --horizontal-pod-autoscaler-use-rest-clients=true --horizontal-pod-autoscaler-sync-period=10s --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem --use-service-account-credentials=true --alsologtostderr=true --logtostderr=false --log-dir=/var/log/kubernetes --v=2
    
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.395082  126295 flags.go:33] FLAG: --version="false"
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.395093  126295 flags.go:33] FLAG: --vmodule=""
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: W1229 03:56:00.819583  126295 authentication.go:296] Cluster doesn't provide requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: W1229 03:56:00.820210  126295 authorization.go:146] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.820252  126295 controllermanager.go:151] Version: v1.13.1
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.822080  126295 secure_serving.go:116] Serving securely on 127.0.0.1:10252
    Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.822954  126295 leaderelection.go:205] attempting to acquire leader lease  kube-system/kube-controller-manager...
    Dec 29 03:57:44 ceph-01 kube-controller-manager[126295]: I1229 03:57:44.753997  126295 log.go:172] http: TLS handshake error from 127.0.0.1:40918: tls: first record does not look like a TLS handshake
    Dec 29 03:57:46 ceph-01 kube-controller-manager[126295]: I1229 03:57:46.558093  126295 log.go:172] http: TLS handshake error from 127.0.0.1:40948: tls: first record does not look like a TLS handshake
    Dec 29 04:08:35 ceph-01 kube-controller-manager[126295]: I1229 04:08:35.872211  126295 log.go:172] http: TLS handshake error from 127.0.0.1:43564: tls: first record does not look like a TLS handshake
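
    The "first record does not look like a TLS handshake" lines mean that a client sent something other than a TLS ClientHello to the secure port. A simplified sketch of such a check follows; it is our own approximation for illustration, not the exact logic of the Go TLS library, and the sample payloads are made up.

```python
def looks_like_tls_handshake(first_bytes: bytes) -> bool:
    """Simplified check: a TLS connection opens with a record of
    content type 0x16 (handshake) and major version 3."""
    return (len(first_bytes) >= 2
            and first_bytes[0] == 0x16
            and first_bytes[1] == 0x03)


# What a plain-HTTP health check sends (illustrative request).
plain_http = b"GET /healthz HTTP/1.1\r\nHost: 127.0.0.1:10252\r\n\r\n"
# Illustrative first bytes of a TLS ClientHello record.
client_hello_prefix = b"\x16\x03\x01\x02\x00"

for name, payload in [("plain HTTP", plain_http), ("TLS", client_hello_prefix)]:
    print(name, "->", looks_like_tls_handshake(payload))
# plain HTTP -> False
# TLS -> True
```

    The plain-HTTP request starts with the byte for "G" (0x47), so a TLS server rejects it before any HTTP parsing happens.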
    

    2. Solution

    We suspected a configuration problem in the kube-controller-manager systemd service file:

    [root@ceph-01 system]# cat kube-controller-manager.service 
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    ExecStart=/usr/local/bin/kube-controller-manager \
      --port=0 \
      --secure-port=10252 \
      --bind-address=127.0.0.1 \
      --kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
      --authentication-kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
      --service-cluster-ip-range=10.254.0.0/16 \
      --cluster-name=kubernetes \
      --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
      --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
      --experimental-cluster-signing-duration=8760h \
      --root-ca-file=/etc/kubernetes/cert/ca.pem \
      --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
      --leader-elect=true \
      --feature-gates=RotateKubeletServerCertificate=true \
      --controllers=*,bootstrapsigner,tokencleaner \
      --horizontal-pod-autoscaler-use-rest-clients=true \
      --horizontal-pod-autoscaler-sync-period=10s \
      --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
      --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
      --use-service-account-credentials=true \
      --alsologtostderr=true \
      --logtostderr=false \
      --log-dir=/var/log/kubernetes \
      --v=2
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    

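    The service file contains three flags: --port=0, --secure-port=10252 and --bind-address=127.0.0.1.
    They do the following:

    • --port=0: disables the insecure HTTP port, so plain-HTTP requests such as /metrics are no longer served; the --address flag then has no effect, while --bind-address still applies
    • --secure-port=10252, --bind-address=127.0.0.1: serve HTTPS requests (including /metrics) on port 10252, on the loopback interface only

    As the error message in section 1 shows, the health check behind kubectl get cs sends a plain-HTTP request to http://127.0.0.1:10252/healthz. With these flags, port 10252 speaks only HTTPS, so the controller manager logs a TLS handshake error, replies with a TLS alert, and the check reports Unhealthy.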
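
    The effect of these flags on the health check can be sketched as a small model. This is only an illustration: the function names are hypothetical, and the defaults used here (--port=10252 and --secure-port=10257) are our assumption for the v1.13 release seen in the log, not something stated in this article.

```python
# Assumed kube-controller-manager v1.13 defaults (not taken from the article):
# plain HTTP on 10252, HTTPS on 10257.
DEFAULTS = {"port": 10252, "secure-port": 10257}


def serving_ports(flags):
    """Return the plain-HTTP and HTTPS ports for a flag list (0 = disabled)."""
    ports = dict(DEFAULTS)
    for flag in flags:
        name, _, value = flag.lstrip("-").partition("=")
        if name in ports:
            ports[name] = int(value)
    return {"http": ports["port"], "https": ports["secure-port"]}


def http_probe_works(flags, probe_port=10252):
    """The health check speaks plain HTTP to 127.0.0.1:10252, so it only
    succeeds when the insecure port is the one listening there."""
    return serving_ports(flags)["http"] == probe_port


with_flags = ["--port=0", "--secure-port=10252", "--bind-address=127.0.0.1"]
without_flags = []

print(serving_ports(with_flags), http_probe_works(with_flags))
# {'http': 0, 'https': 10252} False
print(serving_ports(without_flags), http_probe_works(without_flags))
# {'http': 10252, 'https': 10257} True
```

    Under these assumptions, dropping the three flags puts plain HTTP back on port 10252, which is what the health check expects.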

    We remove these three flags:

    [root@ceph-01 system]# cat kube-controller-manager.service 
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    ExecStart=/usr/local/bin/kube-controller-manager \
      --kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
      --authentication-kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
      --service-cluster-ip-range=10.254.0.0/16 \
      --cluster-name=kubernetes \
      --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
      --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
      --experimental-cluster-signing-duration=8760h \
      --root-ca-file=/etc/kubernetes/cert/ca.pem \
      --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
      --leader-elect=true \
      --feature-gates=RotateKubeletServerCertificate=true \
      --controllers=*,bootstrapsigner,tokencleaner \
      --horizontal-pod-autoscaler-use-rest-clients=true \
      --horizontal-pod-autoscaler-sync-period=10s \
      --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
      --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
      --use-service-account-credentials=true \
      --alsologtostderr=true \
      --logtostderr=false \
      --log-dir=/var/log/kubernetes \
      --v=2
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    

    Reload systemd and restart the service:

    [root@ceph-01 system]# systemctl daemon-reload
    [root@ceph-01 system]# systemctl restart kube-controller-manager
    

    3. Verify that the cluster components are healthy

    [root@ceph-01 system]# kubectl get cs
    NAME                 STATUS    MESSAGE             ERROR
    controller-manager   Healthy   ok                  
    scheduler            Healthy   ok                  
    etcd-0               Healthy   {"health":"true"}   
    etcd-1               Healthy   {"health":"true"}   
    etcd-2               Healthy   {"health":"true"}   
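
    The same endpoint that appears in the original error message can also be probed directly on the master node. Below is a minimal sketch using only the Python standard library; the function name check_healthz is hypothetical, and the expectation of a 200 response with the body "ok" is based on the MESSAGE column shown above.

```python
import http.client
import urllib.error
import urllib.request


def check_healthz(url="http://127.0.0.1:10252/healthz", timeout=3.0):
    """Return (healthy, detail) for a plain-HTTP GET of the healthz endpoint."""
    # The probe targets localhost, so ignore any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace").strip()
            return resp.status == 200 and body == "ok", body
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError) as exc:
        # Covers connection refused, non-2xx statuses, and the case where
        # the port answers with TLS instead of HTTP.
        return False, str(exc)


if __name__ == "__main__":
    healthy, detail = check_healthz()
    print("healthy" if healthy else "unhealthy", "-", detail)
```

    Run on the master itself, this should report healthy once the service has been restarted without the three flags.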
    
  • Original article: https://www.cnblogs.com/yuhaohao/p/10197315.html