  • A detailed look at k8s Deployment rolling updates (Part 2)

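    1. Background

    ● This article explores in detail how a Deployment behaves during a rolling update
    ● The relevant parameters:
      livenessProbe: liveness check. Determines whether a container is still healthy; if the check fails, the container is restarted
      readinessProbe: readiness check. Determines whether a pod is able to serve traffic
      maxSurge: the maximum number of pods that may exist above the desired replica count during a rolling update
      maxUnavailable: the maximum number of pods that may be unavailable during a rolling update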


    2. Environment

    Component   Version
    OS          Ubuntu 18.04.1 LTS
    docker      18.06.0-ce

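    3. Preparing the images and the YAML file

    First prepare two different image versions for testing (two versions of an nginx image have already been published on Alibaba Cloud)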

    docker pull registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
    docker pull registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:delay_v1
    

    Both images provide the same service; the only difference is that nginx:delay_v1 waits 20 seconds before starting nginx

    root@k8s-master:~# docker run -d --rm -p 10080:80 nginx:v1
    e88097841c5feef92e4285a2448b943934ade5d86412946bc8d86e262f80a050
    root@k8s-master:~# curl http://127.0.0.1:10080
    ----------
    version: v1
    hostname: f5189a5d3ad3
    

    The YAML file:

    root@k8s-master:~# more roll_update.yaml
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: update-deployment
    spec:
      replicas: 3
      template:
        metadata:
          labels:
            app: roll-update
        spec:
          containers:
          - name: nginx
            image: registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
            imagePullPolicy: Always
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
    spec:
        selector:
          app: roll-update
        ports:
        - protocol: TCP
          port: 10080
          targetPort: 80
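
    A note for newer clusters: the extensions/v1beta1 API used above is no longer served for Deployments as of Kubernetes 1.16. Below is a minimal sketch of the same Deployment under apps/v1, assuming nothing else about the setup changes. The structural difference is that apps/v1 requires an explicit spec.selector that matches the pod template labels. The Service definition stays as it is. Note that kubectl output on such a cluster will not read deployment.extensions, and the apps/v1 defaults for maxSurge and maxUnavailable are 25% each, so the pacing of a rollout may differ from the transcripts in this article.

    ```yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: update-deployment
    spec:
      replicas: 3
      # apps/v1 requires a selector; it must match template.metadata.labels
      selector:
        matchLabels:
          app: roll-update
      template:
        metadata:
          labels:
            app: roll-update
        spec:
          containers:
          - name: nginx
            image: registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
            imagePullPolicy: Always
    ```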
    

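    4. livenessProbe and readinessProbe

    livenessProbe: liveness check, mainly used to detect whether a pod's container needs to be restarted
    readinessProbe: readiness check, used to detect whether a pod is already able to serve traffic

    ● During a rolling update, pods are dynamically deleted and then created again. The liveness probe lets k8s restart containers that have stopped working, and whenever there are fewer pods than desired, k8s immediately brings up new ones
    ● However, while a pod is starting up, its service is still coming up and is not yet usable. If traffic arrives at that moment, requests fail

    Let's simulate this scenario:

    First apply the configuration file above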

    root@k8s-master:~# kubectl apply -f roll_update.yaml
    deployment.extensions "update-deployment" created
    service "nginx-service" created
    root@k8s-master:~# kubectl get pod -owide
    NAME                                 READY     STATUS    RESTARTS   AGE       IP              NODE
    update-deployment-7db77f7cc6-c4s2v   1/1       Running   0          28s       10.10.235.232   k8s-master
    update-deployment-7db77f7cc6-nfgtd   1/1       Running   0          28s       10.10.36.82     k8s-node1
    update-deployment-7db77f7cc6-tflfl   1/1       Running   0          28s       10.10.169.158   k8s-node2
    root@k8s-master:~# kubectl get svc
    NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
    nginx-service   ClusterIP   10.254.254.199   <none>        10080/TCP   1m
    

    Open another terminal and test the availability of the service (a loop that fetches the nginx response once per second):

    root@k8s-master:~# while :; do curl http://10.254.254.199:10080; sleep 1; done
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-nfgtd
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-c4s2v
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-tflfl
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-nfgtd
    ...
    

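    Now update the image to nginx:delay_v1. This image delays the nginx start: it first sleeps for 20s and only then starts nginx. This simulates a service that is still starting up: the pod already exists, but it is not actually serving yet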

    root@k8s-master:~# kubectl patch deployment update-deployment --patch '{"metadata":{"annotations":{"kubernetes.io/change-cause":"update version to v2"}} ,"spec": {"template": {"spec": {"containers": [{"name": "nginx","image":"registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:delay_v1"}]}}}}'
    deployment.extensions "update-deployment" patched
    
    ...
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-h6hvt
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-6th87
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-n22vz
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-njmpz
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-6th87
    

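    As you can see, because of the delayed start nginx was not actually ready to serve, yet traffic was already being sent to the new pods, so the service was unavailable for a while

    So adding a readinessProbe is essential: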

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: update-deployment
    spec:
      replicas: 3
      template:
        metadata:
          labels:
            app: roll-update
        spec:
          containers:
          - name: nginx
            image: registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
            imagePullPolicy: Always
            readinessProbe:
              tcpSocket:
                port: 80
              initialDelaySeconds: 5
              periodSeconds: 10
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
    spec:
        selector:
          app: roll-update
        ports:
        - protocol: TCP
          port: 10080
          targetPort: 80
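
    The manifest above only configures a readinessProbe. As a sketch of how a livenessProbe could sit next to it, here is the container section with both probes. The httpGet path, the timing values and the failureThreshold are illustrative assumptions, not settings from the original test. The liveness initialDelaySeconds is deliberately longer than the 20s start delay of nginx:delay_v1, so that a slow start is not mistaken for a dead container.

    ```yaml
    containers:
    - name: nginx
      image: registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
      imagePullPolicy: Always
      # readiness: gates Service traffic; a pod that fails it is taken out of the endpoints
      readinessProbe:
        tcpSocket:
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
      # liveness: a container that fails it is restarted by the kubelet
      livenessProbe:
        httpGet:
          path: /                 # assumed path; the test image answers on /
          port: 80
        initialDelaySeconds: 30   # assumed value, longer than the 20s start delay
        periodSeconds: 10
        failureThreshold: 3       # assumed value; restart after 3 consecutive failures
    ```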
    

    Repeat the steps above: first create the deployment with nginx:v1, then patch it to nginx:delay_v1

    root@k8s-master:~# kubectl apply -f roll_update.yaml
    deployment.extensions "update-deployment" created
    service "nginx-service" created
    root@k8s-master:~# kubectl patch deployment update-deployment --patch '{"metadata":{"annotations":{"kubernetes.io/change-cause":"update version to v2"}} ,"spec": {"template": {"spec": {"containers": [{"name": "nginx","image":"registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:delay_v1"}]}}}}'
    deployment.extensions "update-deployment" patched
    
    root@k8s-master:~# kubectl get pod -owide
    NAME                                 READY     STATUS        RESTARTS   AGE       IP              NODE
    busybox                              1/1       Running       0          45d       10.10.235.255   k8s-master
    lifecycle-demo                       1/1       Running       0          32d       10.10.169.186   k8s-node2
    private-reg                          1/1       Running       0          92d       10.10.235.209   k8s-master
    update-deployment-54d497b7dc-4mlqc   0/1       Running       0          13s       10.10.169.178   k8s-node2
    update-deployment-54d497b7dc-pk4tb   0/1       Running       0          13s       10.10.36.98     k8s-node1
    update-deployment-6d5d7c9947-l7dkb   1/1       Terminating   0          1m        10.10.169.177   k8s-node2
    update-deployment-6d5d7c9947-pbzmf   1/1       Running       0          1m        10.10.36.97     k8s-node1
    update-deployment-6d5d7c9947-zwt4z   1/1       Running       0          1m        10.10.235.246   k8s-master
    

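    ● Because a readinessProbe is configured, the new pods have started but are not put into service right away, which is why they show READY: 0/1
    ● Only one old pod is Terminating while the other two old pods keep running: the rolling update must keep enough pods available until the new pods become ready

    Checking the curl loop again, the image version rolled over smoothly to nginx:delay_v1 without any errors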

    root@k8s-master:~# while :; do curl http://10.254.66.136:10080; sleep 1; done
    ...
    version: v1
    hostname: update-deployment-6d5d7c9947-pbzmf
    ----------
    version: v1
    hostname: update-deployment-6d5d7c9947-zwt4z
    ----------
    version: v1
    hostname: update-deployment-6d5d7c9947-pbzmf
    ----------
    version: v1
    hostname: update-deployment-6d5d7c9947-zwt4z
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-pk4tb
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-4mlqc
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-pk4tb
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-4mlqc
    ...
    

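    5. maxSurge and maxUnavailable

    ● A rolling update can proceed in different ways: delete an old pod first and then add a new one, or add a new pod first and then delete an old one. Throughout the process the service must remain available (that is, the livenessProbe and readinessProbe checks must pass)
    ● In practice, maxSurge and maxUnavailable control whether old pods are deleted first or new pods are added first, and in what step size
    ● With the replica count set to 3:
      maxSurge=1 maxUnavailable=0: at most 4 pods (3+1) may exist, and at least 3 pods (3-0) must be serving at all times. A new pod is created first, and once it is available an old pod is deleted, until everything is updated
      maxSurge=0 maxUnavailable=1: at most 3 pods (3+0) may exist, and at least 2 pods (3-1) must be serving at all times. An old pod is deleted first, then a new pod is created, until everything is updated
    ● In the end the constraints of both maxSurge and maxUnavailable must be satisfied. If both are 0, the update cannot proceed: pods may neither be deleted nor added, so the constraints cannot be met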
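
    Neither manifest in this article sets these two fields explicitly. As a sketch, they live under spec.strategy.rollingUpdate of the Deployment. The fragment below shows the first combination discussed above (create first, then delete). Both fields also accept percentages such as 25%, in which case maxSurge is rounded up and maxUnavailable is rounded down.

    ```yaml
    spec:
      replicas: 3
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1        # up to 3+1 = 4 pods may exist during the update
          maxUnavailable: 0  # all 3 pods must stay available at all times
    ```

    Swapping the values (maxSurge: 0, maxUnavailable: 1) gives the delete-first behaviour instead.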

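    6. Summary

    ● This article covered how the maxSurge, maxUnavailable, livenessProbe and readinessProbe parameters are used during a Deployment rolling update
    ● One issue with rolling updates remains open. In a large system where a service has many pods (say 100), a rolling update inevitably leaves pods on mixed versions for a while (some old, some new), so repeated requests from a user may well return inconsistent results until the update completes. This will be discussed in a later article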



    That's the end of this article
    My knowledge is limited; if I have got anything wrong, corrections are very welcome...

  • Original article: https://www.cnblogs.com/MrVolleyball/p/10360860.html