  • Kubernetes进阶实战 reading notes: DaemonSet controller | Job controller

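    I. The DaemonSet controller

    1. Use cases

    A DaemonSet is a special-purpose controller with specific use cases: it typically runs applications that perform system-level tasks on each node.

    1. Running a cluster storage daemon on every node, such as glusterd or Ceph

    2. Running a log collection daemon on every node, such as fluentd or logstash

    3. Running a monitoring agent daemon on every node, such as Prometheus Node Exporter, collectd, Datadog agent, New Relic agent, or Ganglia gmond

    2. DaemonSet manifest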

    [root@master chapter5]# cat filebeat-ds.yaml 
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: filebeat-ds
      labels:
        app: filebeat
    spec:
      selector:
        matchLabels:
          app: filebeat
      template:
        metadata:
          labels:
            app: filebeat
          name: filebeat
        spec:
          containers:
          - name: filebeat
            image: ikubernetes/filebeat:5.6.5-alpine
            env:
            - name: REDIS_HOST
              value: db.ikubernetes.io:6379
            - name: LOG_LEVEL
              value: info

    3. Create the DaemonSet object

    Creating a DaemonSet from a manifest file uses the same command as for any other resource:

    [root@master chapter5]# kubectl apply -f filebeat-ds.yaml 
    daemonset.apps/filebeat-ds created
    

    Since Kubernetes 1.8, a DaemonSet must use a selector that matches the labels specified in its Pod template. The selector supports both matchLabels and matchExpressions.
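
    As a minimal sketch of the second selector style, the fragment below expresses the filebeat-ds selector with matchExpressions instead of matchLabels. The choice of the `In` operator with a single value is only illustrative; whatever form is used, the selector still has to match the labels in the Pod template.

    ```yaml
    # Sketch only: an equivalent selector for filebeat-ds using matchExpressions.
    spec:
      selector:
        matchExpressions:
        - key: app
          operator: In          # other operators: NotIn, Exists, DoesNotExist
          values:
          - filebeat
      template:
        metadata:
          labels:
            app: filebeat       # must satisfy the selector above
    ```

    In apps/v1 the selector cannot be changed after the DaemonSet has been created, so a variant like this is best tried on a new object.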

    4. Verify the result

    View the details of the DaemonSet object:

    [root@master chapter5]# kubectl describe ds filebeat-ds
    Name:           filebeat-ds
    Selector:       app=filebeat
    Node-Selector:  <none>
    Labels:         app=filebeat
    Annotations:    deprecated.daemonset.template.generation: 1
    Desired Number of Nodes Scheduled: 2
    Current Number of Nodes Scheduled: 2
    Number of Nodes Scheduled with Up-to-date Pods: 2
    Number of Nodes Scheduled with Available Pods: 0
    Number of Nodes Misscheduled: 0
    Pods Status:  0 Running / 2 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:  app=filebeat
      Containers:
       filebeat:
        Image:      ikubernetes/filebeat:5.6.5-alpine
        Port:       <none>
        Host Port:  <none>
        Environment:
          REDIS_HOST:  db.ikubernetes.io:6379
          LOG_LEVEL:   info
        Mounts:        <none>
      Volumes:         <none>
    Events:
      Type    Reason            Age   From                  Message
      ----    ------            ----  ----                  -------
      Normal  SuccessfulCreate  45s   daemonset-controller  Created pod: filebeat-ds-b5qxg
      Normal  SuccessfulCreate  45s   daemonset-controller  Created pod: filebeat-ds-5mgz2

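    The Node-Selector field is empty, which means the DaemonSet runs on every node in the cluster. The cluster currently has 2 nodes, so the desired number of Pod replicas is 2, and 2 Pod objects have already been created.

    By the nature of a DaemonSet, the 2 Pods created by filebeat-ds should each run on a different node of the cluster. This can be verified with the following command: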

    [root@master chapter5]# kubectl get po -l app=filebeat  -o custom-columns=NAME:metadata.name,NODE:spec.nodeName
    NAME                NODE
    filebeat-ds-5mgz2   node2
    filebeat-ds-b5qxg   node1
    

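    Nodes with special hardware may need their own monitoring agent, and so on. This works much like the node binding mechanism for Pod resources described earlier: nest a nodeSelector field in the spec of the Pod template and make sure its label selector matches the labels of the intended worker nodes.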
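
    A minimal sketch of that idea, assuming a made-up node label `logcollecting=on` (the key and value are hypothetical and would first have to be attached to the target nodes):

    ```yaml
    # Sketch only: restrict the filebeat Pods to nodes carrying a hypothetical label.
    # A node could be labelled beforehand with: kubectl label node node1 logcollecting=on
    spec:
      template:
        spec:
          nodeSelector:
            logcollecting: "on"   # hypothetical label; quoted so YAML keeps it a string
          containers:
          - name: filebeat
            image: ikubernetes/filebeat:5.6.5-alpine
    ```

    With a nodeSelector in place, the Node-Selector line of `kubectl describe ds` would no longer read `<none>`, and the desired Pod count would follow the number of matching nodes.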

    5. Upgrade

    Update the container image in the Pod template of filebeat-ds to ikubernetes/filebeat:5.6.6-alpine:

    [root@master chapter5]# kubectl set image ds filebeat-ds filebeat=ikubernetes/filebeat:5.6.6-alpine
    daemonset.apps/filebeat-ds image updated

    Watch the update that is triggered automatically:

    [root@master chapter5]# kubectl describe ds filebeat-ds
    Name:           filebeat-ds
    Selector:       app=filebeat
    Node-Selector:  <none>
    Labels:         app=filebeat
    Annotations:    deprecated.daemonset.template.generation: 2
    Desired Number of Nodes Scheduled: 2
    Current Number of Nodes Scheduled: 2
    Number of Nodes Scheduled with Up-to-date Pods: 1
    Number of Nodes Scheduled with Available Pods: 1
    Number of Nodes Misscheduled: 0
    Pods Status:  1 Running / 1 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:  app=filebeat
      Containers:
       filebeat:
        Image:      ikubernetes/filebeat:5.6.6-alpine
        Port:       <none>
        Host Port:  <none>
        Environment:
          REDIS_HOST:  db.ikubernetes.io:6379
          LOG_LEVEL:   info
        Mounts:        <none>
      Volumes:         <none>
    Events:
      Type    Reason            Age    From                  Message
      ----    ------            ----   ----                  -------
      Normal  SuccessfulCreate  7m57s  daemonset-controller  Created pod: filebeat-ds-b5qxg
      Normal  SuccessfulCreate  7m57s  daemonset-controller  Created pod: filebeat-ds-5mgz2
      Normal  SuccessfulDelete  30s    daemonset-controller  Deleted pod: filebeat-ds-5mgz2
      Normal  SuccessfulCreate  17s    daemonset-controller  Created pod: filebeat-ds-44m95

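    The output above shows the default rolling update strategy: delete the Pod on one worker node at a time, wait for its new-version Pod to be rebuilt, and then move on to the Pod on the next worker node.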
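
    The behaviour observed above corresponds to the update strategy fields of the DaemonSet spec. The fragment below is a sketch that simply spells out the defaults explicitly; it is not taken from the original manifest.

    ```yaml
    # Sketch only: the DaemonSet update strategy written out with its default values.
    spec:
      updateStrategy:
        type: RollingUpdate      # default in apps/v1; the alternative is OnDelete
        rollingUpdate:
          maxUnavailable: 1      # default: replace the Pod on one node at a time
      minReadySeconds: 0         # default; seconds a new Pod must be ready before it counts as available
    ```

    Raising maxUnavailable would let more nodes be updated at once, at the cost of more nodes being without the daemon during the rollout.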

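    II. The Job controller

    1. Use cases

    Serial Job with a single work queue: multiple one-off tasks run one after another until the desired number of completions is reached

    Parallel Job with multiple work queues: the number of work queues, that is, the number of tasks, can be configured, and each queue runs exactly one task

    The Job controller is typically used to manage tasks that run for a while and then "complete", such as computations or backups

    2. Serial Job with a single work queue

    1. Job manifest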

    [root@master chapter5]# cat job-example.yaml 
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-example
    spec:
      template:
        metadata:
          labels:
            app: myjob
        spec:
          containers:
          - name: myjob
            image: alpine
            command: ["/bin/sh",  "-c", "sleep 120"]
          restartPolicy: Never

    3. Create the Job object

    [root@master chapter5]# kubectl apply -f job-example.yaml 
    job.batch/job-example created

    4. Verify the result

    [root@master chapter5]# kubectl get jobs job-example 
    NAME          COMPLETIONS   DURATION   AGE
    job-example   0/1           52s        53s

    The related Pods can be selected by a label carrying the Job's name:

    [root@master chapter5]# kubectl get pods -l job-name=job-example
    NAME                READY   STATUS    RESTARTS   AGE
    job-example-z22lq   1/1     Running   0          50s

    The Job's details show the label selector it uses and the labels of the matching Pods:

    [root@master chapter5]# kubectl describe jobs job-example
    Name:           job-example
    Namespace:      default
    Selector:       controller-uid=6525962a-930b-4a7b-8452-48a5828db7c6
    Labels:         app=myjob
                    controller-uid=6525962a-930b-4a7b-8452-48a5828db7c6
                    job-name=job-example
    Annotations:
    Parallelism:    1
    Completions:    1
    Start Time:     Sat, 08 Aug 2020 09:06:27 +0800
    Pods Statuses:  1 Running / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:  app=myjob
               controller-uid=6525962a-930b-4a7b-8452-48a5828db7c6
               job-name=job-example
      Containers:
       myjob:
        Image:      alpine
        Port:       <none>
        Host Port:  <none>
        Command:
          /bin/sh
          -c
          sleep 120
        Environment:  <none>
        Mounts:       <none>
      Volumes:        <none>
    Events:
      Type    Reason            Age   From            Message
      ----    ------            ----  ----            -------
      Normal  SuccessfulCreate  71s   job-controller  Created pod: job-example-z22lq

    After two minutes, once the sleep command has finished and exited successfully, the Pod moves to the Completed state:

    [root@master chapter5]# kubectl get pods -l job-name=job-example
    NAME                READY   STATUS      RESTARTS   AGE
    job-example-z22lq   0/1     Completed   0          2m32s

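    2. Parallel Job with multiple work queues

    1. Job manifest

    With the parallelism property job.spec.parallelism set to 1 (its default) and the total task count set through spec.completions, the Job controller runs the tasks serially: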

    [root@master chapter5]# cat job-multi.yaml 
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-multi
    spec:
      completions: 5
      template:
        metadata:
          labels:
            app: myjob
        spec:
          containers:
          - name: myjob
            image: alpine
            command: ["/bin/sh",  "-c", "sleep 30"]
          restartPolicy: OnFailure

    3. Create the Job object

    [root@master chapter5]# kubectl apply -f job-multi.yaml 
    job.batch/job-multi created

    4. Verify the result

    [root@master chapter5]# kubectl get pods -l job-name=job-multi
    NAME              READY   STATUS              RESTARTS   AGE
    job-multi-8l9h7   0/1     ContainerCreating   0          5s
    
    [root@master chapter5]# kubectl get pods -l job-name=job-multi
    NAME              READY   STATUS    RESTARTS   AGE
    job-multi-8l9h7   1/1     Running   0          19s
    

    5. Scale out

    [root@master chapter5]# kubectl scale jobs job-multi --replicas=2
    Error from server (NotFound): the server could not find the requested resource
    
    [root@master chapter5]# kubectl get pods -l job-name=job-multi
    NAME              READY   STATUS    RESTARTS   AGE
    job-multi-8l9h7   1/1     Running   0          29s
    [root@master chapter5]# kubectl get pods -l job-name=job-multi
    NAME              READY   STATUS    RESTARTS   AGE
    job-multi-8l9h7   1/1     Running   0          32s
    [root@master chapter5]# kubectl get pods -l job-name=job-multi
    NAME              READY   STATUS    RESTARTS   AGE
    job-multi-8l9h7   1/1     Running   0          35s

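    Raising a Job's parallelism moderately, in line with the worker nodes and their available resources, can greatly improve throughput and shorten the overall run time.

    [root@master chapter5]# kubectl get pods -l job-name=job-multi
    NAME              READY   STATUS              RESTARTS   AGE
    job-multi-8l9h7   0/1     Completed           0          37s
    job-multi-wnnnj   0/1     ContainerCreating   0          2s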
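
    Since the `kubectl scale` attempt above was rejected by this cluster, one alternative is to declare the parallelism in the manifest itself. The following is a sketch based on job-multi.yaml; the name `job-multi-parallel` and the value 2 are illustrative assumptions.

    ```yaml
    # Sketch only: job-multi with an explicit degree of parallelism.
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-multi-parallel    # hypothetical name
    spec:
      completions: 5              # total number of successful Pods required
      parallelism: 2              # run at most 2 Pods at the same time
      template:
        metadata:
          labels:
            app: myjob
        spec:
          containers:
          - name: myjob
            image: alpine
            command: ["/bin/sh", "-c", "sleep 30"]
          restartPolicy: OnFailure
    ```

    With these values the controller should keep up to two Pods active until five have completed, so listing the Pods during the run would be expected to show two of them at a time.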

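    2. Deleting a Job

    If the containerized application of a Job can never finish normally and its restart policy tells it to restart, it may loop through restarts and failures indefinitely. Fortunately, the Job controller provides two properties to prevent this: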

    [root@master ~]# kubectl explain job.spec
    KIND:     Job
    VERSION:  batch/v1
    
    RESOURCE: spec <Object>
    
    DESCRIPTION:
         Specification of the desired behavior of a job. More info:
         https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
    
         JobSpec describes how the job execution will look like.
    
    FIELDS:
       activeDeadlineSeconds	<integer>
       # The Job's deadline: the maximum time the Job may stay active; a Job that exceeds it is terminated
         Specifies the duration in seconds relative to the startTime that the job
         may be active before the system tries to terminate it; value must be
         positive integer
     
       backoffLimit	<integer>
       # Number of retries before the Job is marked as failed; defaults to 6
         Specifies the number of retries before marking this job failed. Defaults to
         6
    

    For example, the following snippet allows 5 retries on failure and terminates the Job if it has not completed within 100 seconds:

      spec:
        backoffLimit: 5
        activeDeadlineSeconds: 100

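    III. The CronJob controller

    The CronJob controller manages when Job resources run. A task defined by a Job runs immediately after the Job resource is created, whereas a CronJob controls the point in time and the repetition of runs, much like periodic task scheduling (cron) on Linux:

    1. Run a task once at some future point in time
    2. Run a task repeatedly at specified points in time

    The schedule format supported by CronJob objects is similar to crontab. One small difference: in a CronJob schedule, "?" and "*" mean the same thing, namely any valid value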

    1. Manifest

    [root@master chapter5]# cat cronjob-example.yaml 
    apiVersion: batch/v1beta1
    kind: CronJob
    metadata:
      name: cronjob-example
      labels:
        app: mycronjob
    spec:
      schedule: "*/2 * * * *"
      jobTemplate:
        metadata:
          labels:
            app: mycronjob-jobs
        spec:
          parallelism: 2
          template:
            spec:
              containers:
              - name: myjob
                image: alpine
                command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster; sleep 10
              restartPolicy: OnFailure

    2. Create and run

    [root@master chapter5]# kubectl apply -f cronjob-example.yaml 
    cronjob.batch/cronjob-example created

    3. Verify the result

    [root@master chapter5]# kubectl get cronjobs cronjob-example 
    NAME              SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cronjob-example   */2 * * * *   False     0        <none>          31s
    
    [root@master chapter5]# kubectl get cronjobs cronjob-example 
    NAME              SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cronjob-example   */2 * * * *   False     0        36s             4m1s
    
    [root@master chapter5]# kubectl get cronjobs cronjob-example 
    NAME              SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cronjob-example   */2 * * * *   False     0        62s             4m27s
    
    [root@master chapter5]# kubectl get cronjobs cronjob-example 
    NAME              SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cronjob-example   */2 * * * *   False     0        68s             4m33s
    
    [root@master chapter5]# kubectl get cronjobs cronjob-example 
    NAME              SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    cronjob-example   */2 * * * *   False     0        77s             4m42s

    4. Control mechanism

    A CronJob is a higher-level resource: it manages Job resources and, through them, Pod objects

    List the Jobs created by a CronJob:

    [root@master chapter5]# kubectl get jobs -l app=mycronjob-jobs
    NAME                         COMPLETIONS   DURATION   AGE
    cronjob-example-1596850320   2/1 of 2      24s        4m57s
    cronjob-example-1596850440   2/1 of 2      18s        2m56s
    cronjob-example-1596850560   2/1 of 2      17s        56s

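    This command lists a Job only after it has been scheduled for execution. The number of Jobs that can be listed depends on the CronJob's successfulJobsHistoryLimit property, which defaults to 3.

    If the scheduled points are close together and a run takes longer than the interval between two runs, two Job objects exist at the same time. Some Jobs cannot or must not run concurrently; the concurrencyPolicy property controls this. Its default, "Allow", lets consecutive Jobs, or even more Jobs of the same CronJob, run at the same time. Of the other two values, "Forbid" prevents consecutive Jobs from running concurrently: if the previous one has not finished, the next one is not started. "Replace" lets the next Job replace the previous one, that is, it terminates the previous Job and starts the next.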
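
    To make the effect of the policy concrete, here is a sketch of a CronJob whose task deliberately outlasts its two-minute interval. The name `cronjob-forbid` and the 150-second sleep are assumptions chosen for illustration.

    ```yaml
    # Sketch only: a CronJob that skips a run while the previous one is still active.
    apiVersion: batch/v1beta1       # as used in this article; batch/v1 on Kubernetes 1.21 and later
    kind: CronJob
    metadata:
      name: cronjob-forbid          # hypothetical name
    spec:
      schedule: "*/2 * * * *"
      concurrencyPolicy: Forbid     # Allow (default) | Forbid | Replace
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: myjob
                image: alpine
                # Runs longer than the 2-minute interval on purpose
                command: ["/bin/sh", "-c", "date; sleep 150"]
              restartPolicy: OnFailure
    ```

    Switching the value to Replace would instead terminate the still-running Job when the next scheduled time arrives.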

    5. Official reference

    [root@master ~]# kubectl explain Cronjob.spec
    KIND:     CronJob
    VERSION:  batch/v1beta1
    
    RESOURCE: spec <Object>
    
    DESCRIPTION:
         Specification of the desired behavior of a cron job, including the
         schedule. More info:
         https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
    
         CronJobSpec describes how the job execution will look like and when it will
         actually run.
    
    FIELDS:
       concurrencyPolicy	<string>
       # Concurrency policy: "Allow", "Forbid", or "Replace"; defines whether and how the next run starts while the previous run is still in progress
         Specifies how to treat concurrent executions of a Job. Valid values are: -
         "Allow" (default): allows CronJobs to run concurrently; - "Forbid": forbids
         concurrent runs, skipping next run if previous run hasn't finished yet; -
         "Replace": cancels currently running job and replaces it with a new one
    
       failedJobsHistoryLimit	<integer>
       # Number of failed Job runs to keep in history; defaults to 1
         The number of failed finished jobs to retain. This is a pointer to
         distinguish between explicit zero and not specified. Defaults to 1.
    
       jobTemplate	<Object> -required-
       # Job template used by the CronJob to create Job objects; required
         Specifies the job that will be created when executing a CronJob.
    
       schedule	<string> -required-
       # Schedule of the job in cron format; required
         The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
    
       startingDeadlineSeconds	<integer>
       # Deadline, in seconds, for starting a run that missed its scheduled time for any reason; missed runs are counted as failed
         Optional deadline in seconds for starting the job if it misses scheduled
         time for any reason. Missed jobs executions will be counted as failed ones.
    
       successfulJobsHistoryLimit	<integer>
       # Number of successful Job runs to keep in history; defaults to 3
         The number of successful finished jobs to retain. This is a pointer to
         distinguish between explicit zero and not specified. Defaults to 3.
    
       suspend	<boolean>
       # Whether to suspend subsequent runs; defaults to false and does not affect runs already started
         This flag tells the controller to suspend subsequent executions, it does
         not apply to already started executions. Defaults to false.
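
    The optional fields listed above can be combined in a single spec. The fragment below is only a sketch: the history limits and suspend are written with their documented defaults, while the 60-second starting deadline is an arbitrary illustrative value.

    ```yaml
    # Sketch only: optional CronJob spec fields from the listing above.
    # jobTemplate is left out here, but it is required in a complete manifest.
    spec:
      schedule: "*/2 * * * *"
      startingDeadlineSeconds: 60      # illustrative value, not a default
      successfulJobsHistoryLimit: 3    # default
      failedJobsHistoryLimit: 1        # default
      suspend: false                   # default; true pauses future runs only
    ```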

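    IV. Pod disruption budgets

    1. Ensuring high service availability

    Controllers such as Deployment or ReplicaSet keep the number of Pod replicas converging on the desired count, but they cannot guarantee that a given number or proportion of Pods exists at any particular moment. Some scenarios that emphasize service availability need exactly that guarantee.

    Kubernetes therefore provides the PodDisruptionBudget (PDB) to budget for voluntary disruptions: it limits the maximum number of Pod replicas that may be voluntarily disrupted, or guarantees a minimum number of available replicas, to keep the service highly available.

    A Pod exists until it is deliberately destroyed or an unavoidable hardware or system software error occurs. Involuntary disruptions are Pod terminations caused by uncontrollable external factors, for example hardware or kernel failures, network failures, or eviction because a node runs out of resources. Disruptions caused by management operations that a user performs on purpose are called "voluntary disruptions", for example draining a node, deleting a Pod by hand, or Pod re-creation triggered by an update.

    Each application deployed on Kubernetes can create a PDB object to limit the maximum number of replicas that may be disrupted, or the minimum number that must stay available, during voluntary disruptions, and so protect its own availability.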

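    2. Which controllers can be protected

    A PDB can protect applications managed by a controller. In that case the PDB almost always uses the same label selector as the controller so that it targets exactly the same Pods. Supported controller types include Deployment, ReplicaSet, and StatefulSet. A PDB can also protect Pods chosen freely by a custom label selector.

    3. Official reference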

    [root@master ~]# kubectl explain pdb.spec
    KIND:     PodDisruptionBudget
    VERSION:  policy/v1beta1
    
    RESOURCE: spec <Object>
    
    DESCRIPTION:
         Specification of the desired behavior of the PodDisruptionBudget.
    
         PodDisruptionBudgetSpec is a description of a PodDisruptionBudget.
    
    FIELDS:
       maxUnavailable	<string>
       # Maximum number or percentage of Pods that may become unavailable during voluntary disruptions; 0 forbids voluntary disruptions; mutually exclusive with minAvailable
         An eviction is allowed if at most "maxUnavailable" pods selected by
         "selector" are unavailable after the eviction, i.e. even in absence of the
         evicted pod. For example, one can prevent all voluntary evictions by
         specifying 0. This is a mutually exclusive setting with "minAvailable".
    
       minAvailable	<string>
       # Minimum number or percentage of Pods that must remain available during voluntary disruptions; set it to 100% to block all voluntary disruptions
         An eviction is allowed if at least "minAvailable" pods selected by
         "selector" will still be available after the eviction, i.e. even in the
         absence of the evicted pod. So for example you can prevent all voluntary
         evictions by specifying "100%".
    
       selector	<Object>
       # Label selector used by this PDB, usually the same selector as the related Pod controller
         Label query over pods whose evictions are managed by the disruption budget.

    4. Manifest

    [root@master chapter5]# cat pdb-example.yaml 
    apiVersion: policy/v1beta1
    kind: PodDisruptionBudget
    metadata:
      name: myapp-pdb
    spec:
      minAvailable: 2
      selector:
        matchLabels:
          app: myapp
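
    The same budget can also be expressed from the other direction with maxUnavailable, which is mutually exclusive with minAvailable. This is a sketch under assumptions: the name `myapp-pdb-maxunavail` is hypothetical, and the value 1 is only equivalent to minAvailable: 2 while the application runs 3 replicas.

    ```yaml
    # Sketch only: a PDB for the same Pods, written with maxUnavailable.
    apiVersion: policy/v1beta1      # as used in this article; policy/v1 on Kubernetes 1.21 and later
    kind: PodDisruptionBudget
    metadata:
      name: myapp-pdb-maxunavail    # hypothetical name
    spec:
      maxUnavailable: 1             # an absolute number, or a percentage string such as "25%"
      selector:
        matchLabels:
          app: myapp
    ```

    Unlike a fixed minAvailable, a maxUnavailable budget keeps the same tolerance when the replica count of the controller changes.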

    5. Create and run

    [root@master chapter5]# kubectl apply -f pdb-example.yaml 
    poddisruptionbudget.policy/myapp-pdb created

    6. Verify the result

    [root@master chapter5]# kubectl get pdb -w
    NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
    myapp-pdb   2               N/A               1                     2m50s
    
    List the myapp Pods before deleting one:
    [root@master ~]# kubectl get pod|grep myapp
    myapp-deploy-5cbd66595b-ftllt      1/1     Running     0          95s
    myapp-deploy-5cbd66595b-jssgk      1/1     Running     0          58s
    myapp-deploy-5cbd66595b-vkbsb      1/1     Running     0          109s
    Delete the first Pod:
    [root@master ~]# kubectl delete pod myapp-deploy-5cbd66595b-ftllt
    pod "myapp-deploy-5cbd66595b-ftllt" deleted
    
    [root@master chapter5]# kubectl get pdb -w
    NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
    myapp-pdb   2               N/A               1                     2m50s
    myapp-pdb   2               N/A               0                     3m10s
    myapp-pdb   2               N/A               0                     3m10s
    myapp-pdb   2               N/A               1                     3m13s
    myapp-pdb   2               N/A               1                     3m14s
    
    Delete the second Pod:
    
    [root@master ~]# kubectl delete pod myapp-deploy-5cbd66595b-jssgk
    pod "myapp-deploy-5cbd66595b-jssgk" deleted
    
    [root@master chapter5]# kubectl get pdb -w
    NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
    myapp-pdb   2               N/A               1                     2m50s
    myapp-pdb   2               N/A               0                     3m10s
    myapp-pdb   2               N/A               0                     3m10s
    myapp-pdb   2               N/A               1                     3m13s
    myapp-pdb   2               N/A               1                     3m14s
    myapp-pdb   2               N/A               0                     5m1s
    myapp-pdb   2               N/A               0                     5m1s
    myapp-pdb   2               N/A               1                     5m3s
    myapp-pdb   2               N/A               1                     5m10s
  • Original article: https://www.cnblogs.com/luoahong/p/13456478.html