  • The k8s scheduler, predicate policies, and scheduling methods

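    I. The k8s scheduling flow

    1. (Predicates) First rule out every node that cannot meet the pod's requirements at all.
    2. (Priorities) Score the remaining nodes with a series of priority functions; if one node has the highest score on its own, it is chosen directly.
    3. If several nodes tie for the highest score, one of them is picked at random.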
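
    The three steps above can be sketched in a few lines of Python. This is only an illustration of the filter, score, and tie-break order, not the real kube-scheduler code; the node and pod dictionaries and the helper names (is_ready, fits_cpu, free_cpu_score) are made up for the example.

```python
import random


def schedule(pod, nodes, predicates, priorities, rng=random):
    """Pick a node name for the pod, or None if no node passes the predicates."""
    # Step 1 (predicates): keep only nodes that pass every hard check.
    feasible = [n for n in nodes if all(p(pod, n) for p in predicates)]
    if not feasible:
        return None  # no node fits; the pod would stay Pending

    # Step 2 (priorities): add up the scores of all priority functions per node.
    totals = {n["name"]: sum(f(pod, n) for f in priorities) for n in feasible}
    best = max(totals.values())

    # Step 3: if several nodes share the best score, pick one at random.
    top = [name for name, total in totals.items() if total == best]
    return rng.choice(top)


# Hypothetical predicates and priority function for the sketch.
def is_ready(pod, node):
    return node["ready"]


def fits_cpu(pod, node):
    return node["free_cpu"] >= pod["cpu"]


def free_cpu_score(pod, node):
    return node["free_cpu"]


nodes = [
    {"name": "node1", "ready": True, "free_cpu": 4},
    {"name": "node2", "ready": True, "free_cpu": 2},
    {"name": "node3", "ready": False, "free_cpu": 8},  # filtered out in step 1
]
pod = {"cpu": 1}

chosen = schedule(pod, nodes, [is_ready, fits_cpu], [free_cpu_score])
print(chosen)  # node1
```

    node3 has the most free CPU but never reaches the scoring step, because a node that fails any predicate is dropped before scores are computed.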

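    II. Scheduling methods

    1. Node selection (which nodes the pod may run on)
    2. Pod selection (run the pod in the same place as some other pod (pod affinity), or keep it away from some other pod (pod anti-affinity))
    3. Taints (a pod that tolerates a node's taints can be scheduled onto that node, a pod that does not tolerate them cannot; with the NoExecute effect, pods already running on the node that do not tolerate the taint are evicted). A toleration time can also be defined.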

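    III. Common predicates

    Scheduler:
    Predicates (partial list):
    
    CheckNodeCondition: # checks that the node is healthy (e.g. network, disk)
    GeneralPredicates
    	HostName: # checks whether the pod sets pod.spec.nodeName and, if it does, whether it names this node
    	PodFitsHostPorts: # the host ports the pod asks for in pods.spec.containers.ports.hostPort (ports bound on the node) must be free on the node
    	MatchNodeSelector: # checks whether the node's labels match pods.spec.nodeSelector
    	PodFitsResources: # checks whether the node can satisfy the pod's resource requests
    NoDiskConflict: # checks whether the volumes the pod depends on can be satisfied (not used by default)
    PodToleratesNodeTaints: # checks whether the taints tolerated by the pod's spec.tolerations fully cover the taints on the node
    PodToleratesNodeNoExecuteTaints: # same check for taints with the NoExecute effect (not used by default)
    CheckNodeLabelPresence: # checks whether the specified labels are present on the node
    CheckServiceAffinity: # tries to place pods that belong to the same service together (not used by default)
    MaxEBSVolumeCount: # checks the maximum number of EBS (AWS storage) volumes
    MaxGCEPDVolumeCount: # maximum number of GCE PD volumes
    MaxAzureDiskVolumeCount: # maximum number of AzureDisk volumes
    CheckVolumeBinding: # checks whether the pod's PVCs, bound or unbound, can be satisfied on the node
    NoVolumeZoneConflict: # checks whether the pod's volumes conflict with the node's zone
    CheckNodeMemoryPressure: # checks whether the node is under memory pressure
    CheckNodePIDPressure: # checks whether the node is running too many processes (PID pressure)
    CheckNodeDiskPressure: # checks whether the node is under disk pressure
    MatchInterPodAffinity: # checks whether the node satisfies the pod's affinity or anti-affinity rules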
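
    To make the idea of a predicate concrete, here is a rough Python sketch of two of the checks listed above, PodFitsResources and PodFitsHostPorts. The field names (allocatable_cpu, requested_cpu, used_host_ports and so on) and the units (millicores, MiB) are assumptions chosen for the example; the real scheduler works on API objects and considers more resource types.

```python
def pod_fits_resources(pod, node):
    """True if the node's unreserved CPU and memory cover the pod's requests."""
    free_cpu = node["allocatable_cpu"] - node["requested_cpu"]
    free_mem = node["allocatable_mem"] - node["requested_mem"]
    return pod["cpu"] <= free_cpu and pod["mem"] <= free_mem


def pod_fits_host_ports(pod, node):
    """True if none of the hostPorts the pod wants is already bound on the node."""
    return not (set(pod["host_ports"]) & set(node["used_host_ports"]))


# Hypothetical node: 2000m CPU / 4096 MiB allocatable, partly reserved,
# with host port 80 already taken by another pod.
node = {
    "allocatable_cpu": 2000, "requested_cpu": 1500,
    "allocatable_mem": 4096, "requested_mem": 1024,
    "used_host_ports": [80],
}

small_pod = {"cpu": 250, "mem": 512, "host_ports": [8080]}
big_pod = {"cpu": 1000, "mem": 512, "host_ports": []}
port_clash_pod = {"cpu": 100, "mem": 128, "host_ports": [80]}

print(pod_fits_resources(small_pod, node), pod_fits_host_ports(small_pod, node))
print(pod_fits_resources(big_pod, node))        # only 500m CPU is left
print(pod_fits_host_ports(port_clash_pod, node))  # port 80 is in use
```

    Each predicate answers yes or no for one node; a node is kept only if every enabled predicate says yes.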
    

      

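    IV. Common priority functions

    LeastRequested: # the more free resources a node has, the higher its score
    (cpu((capacity-sum(requested))*10/capacity)+memory((capacity-sum(requested))*10/capacity))/2
    BalancedResourceAllocation: # nodes whose CPU and memory utilization are closest to each other win
    NodePreferAvoidPods: # based on the node annotation "scheduler.alpha.kubernetes.io/preferAvoidPods"
    TaintToleration: # compares the pod's spec.tolerations with the node's taints; the more PreferNoSchedule taints on the node that the pod does not tolerate, the lower the score
    
    SelectorSpreading: # label-selector spreading (the more pods already on the node that are matched by the same selectors as the current pod, the lower the score)
    InterPodAffinity: # iterates over the pod's affinity terms; the more terms a node satisfies, the higher its score
    NodeAffinity: # node affinity
    MostRequested: # the fewer free resources, the higher the score; the opposite of LeastRequested (not enabled by default)
    NodeLabel: # whether the node carries the given label (not enabled by default)
    ImageLocality: # scores by the total size of the images the pod needs that are already present on the node (not enabled by default)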
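
    As a worked example of the LeastRequested formula above, the following Python sketch computes the score for one node. The BalancedResourceAllocation function is a simplified assumption (10 minus ten times the gap between the CPU and memory fractions), and plain floating-point division is used throughout; the real scheduler may round differently. The node numbers are invented.

```python
def least_requested(capacity, requested):
    """Per-resource LeastRequested term: (capacity - requested) * 10 / capacity."""
    return (capacity - requested) * 10 / capacity


def least_requested_score(node):
    """Average of the CPU and memory terms, as in the formula in the text."""
    cpu = least_requested(node["cpu_capacity"], node["cpu_requested"])
    mem = least_requested(node["mem_capacity"], node["mem_requested"])
    return (cpu + mem) / 2


def most_requested_score(node):
    """Mirror image of LeastRequested: fuller nodes score higher."""
    cpu = node["cpu_requested"] * 10 / node["cpu_capacity"]
    mem = node["mem_requested"] * 10 / node["mem_capacity"]
    return (cpu + mem) / 2


def balanced_allocation_score(node):
    """Simplified assumption: the closer the CPU and memory fractions, the higher the score."""
    cpu_fraction = node["cpu_requested"] / node["cpu_capacity"]
    mem_fraction = node["mem_requested"] / node["mem_capacity"]
    return 10 - abs(cpu_fraction - mem_fraction) * 10


# Hypothetical node: 4000m CPU with 1000m requested, 8192 MiB memory with 4096 MiB requested.
node = {
    "cpu_capacity": 4000, "cpu_requested": 1000,
    "mem_capacity": 8192, "mem_requested": 4096,
}

print(least_requested_score(node))      # (7.5 + 5.0) / 2 = 6.25
print(most_requested_score(node))       # (2.5 + 5.0) / 2 = 3.75
print(balanced_allocation_score(node))  # 10 - |0.25 - 0.5| * 10 = 7.5
```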
    

      

    V. Advanced scheduling settings

    1. nodeSelector

    # View the node labels
    [root@k8s-m ~]# kubectl get  nodes --show-labels
    NAME      STATUS    ROLES     AGE       VERSION   LABELS
    k8s-m     Ready     master    120d      v1.11.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=k8s-m,node-role.kubernetes.io/master=
    node1     Ready     <none>    120d      v1.11.2   app=myapp,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disk=ssd,disktype=ssd,kubernetes.io/hostname=node1,test_node=k8s-node1
    
    # Use nodeSelector to select a node labeled disk=ssd
    
    
    # Check
    [root@k8s-m schedule]# kubectl  get pod  -o wide
    NAME                     READY     STATUS              RESTARTS   AGE       IP            NODE      NOMINATED NODE
    nginx-pod                1/1       Running             0          49s       10.244.1.92   node1     <none>
    [root@k8s-m schedule]# cat my-pod.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: my-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
      nodeSelector:
        disk: ssd
    
    # If no node carries the label given in nodeSelector, the pod stays Pending (the predicate phase fails)
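
    The matching rule behind nodeSelector is simple: every key/value pair in the selector must appear among the node's labels. A small Python sketch, using a subset of the labels from the kubectl output above (the function names are made up for the example):

```python
def matches_node_selector(node_labels, node_selector):
    """True if every key/value pair of the selector is present in the node's labels."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def candidate_nodes(nodes, node_selector):
    """Names of all nodes whose labels satisfy the selector."""
    return [name for name, labels in nodes.items()
            if matches_node_selector(labels, node_selector)]


# A subset of the labels shown by `kubectl get nodes --show-labels` above.
nodes = {
    "k8s-m": {"kubernetes.io/hostname": "k8s-m", "node-role.kubernetes.io/master": ""},
    "node1": {"app": "myapp", "disk": "ssd", "disktype": "ssd",
              "kubernetes.io/hostname": "node1", "test_node": "k8s-node1"},
}

print(candidate_nodes(nodes, {"disk": "ssd"}))   # ['node1']
print(candidate_nodes(nodes, {"disk": "nvme"}))  # [] -> the pod would stay Pending
```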
    

      

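    2. affinity

    2.1. nodeAffinity with preferredDuringSchedulingIgnoredDuringExecution (soft affinity: nodes that match more of the conditions are preferred, and the pod is still scheduled even if no node matches)

    # Usage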
    [root@k8s-m schedule]# cat  my-affinity-pod.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: affinity-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: affinity-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
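              - key: test_node1 # label key
                operator: In # In: the label's value must be one of the values listed below
                values:
                - k8s-node1 # accepted value for the test_node1 label
                - test1     # accepted value for the test_node1 label
            weight: 60 # weight of this preference term, range 1-100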
    
    ## Check (no node has this label, but the pod was still created and is running)
    [root@k8s-m schedule]# kubectl  get pod  
    NAME                     READY     STATUS              RESTARTS   AGE
    affinity-pod             1/1       Running             0          16s
    

      

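    2.2. requiredDuringSchedulingIgnoredDuringExecution (hard affinity, similar to nodeSelector: a hard requirement, so the pod is never scheduled onto a node that does not match, and it stays Pending if no node matches)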

    [root@k8s-m schedule]# cat my-affinity-pod.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: affinity-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: affinity-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
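              - key: test_node1 # label key
                operator: In # In: the label's value must be one of the values listed below
                values:
                - k8s-node1 # accepted value for the test_node1 label
                - test1     # accepted value for the test_node1 label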
    
    			
    # Check (no node has the test_node1 label, so the pod is Pending)
    [root@k8s-m schedule]# kubectl  get pod 
    NAME                     READY     STATUS              RESTARTS   AGE
    affinity-pod             0/1       Pending             0          4s
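
    The difference between the soft and hard forms of nodeAffinity can be sketched as follows. This is a simplified illustration, not the scheduler's implementation: only four operators are handled, and the function names are invented. In the required form the nodeSelectorTerms are ORed while the matchExpressions inside one term are ANDed; in the preferred form the weights of all matching preferences are added to the node's score.

```python
def expr_matches(labels, expr):
    """Evaluate a single matchExpressions entry against a node's labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError("operator not covered by this sketch: " + op)


def required_ok(labels, node_selector_terms):
    """Hard affinity: at least one term must match, and all its expressions must match."""
    return any(all(expr_matches(labels, e) for e in term["matchExpressions"])
               for term in node_selector_terms)


def preferred_score(labels, preferences):
    """Soft affinity: sum the weights of the preferences the node satisfies."""
    return sum(p["weight"] for p in preferences
               if all(expr_matches(labels, e) for e in p["preference"]["matchExpressions"]))


node1 = {"disk": "ssd", "test_node": "k8s-node1"}  # note: no test_node1 label

expr = {"key": "test_node1", "operator": "In", "values": ["k8s-node1", "test1"]}
terms = [{"matchExpressions": [expr]}]
prefs = [{"weight": 60, "preference": {"matchExpressions": [expr]}}]

print(required_ok(node1, terms))      # False -> hard affinity leaves the pod Pending
print(preferred_score(node1, prefs))  # 0 -> soft affinity only lowers the score
```

    With the soft form a score of 0 does not exclude the node, which is why the pod in 2.1 still runs while the pod in 2.2 stays Pending.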
    

      

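    VI. Pod affinity and anti-affinity

    1. podAffinity (places the pod in the same location as another pod; "the same location" does not necessarily mean the same node, it depends on the label used as topologyKey)

    # Usage (place affinity-pod in the same location as my-pod1)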
    [root@k8s-m schedule]# cat my-affinity-pod2.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod1
      labels: 
        app1: my-pod1
         
    spec:
      containers:
      - name: my-pod1
        image: nginx
        ports:
        - name: http
          containerPort: 80
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: affinity-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: affinity-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
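              - key: app1 # label key, defined on the pod above
                operator: In # In: the label's value must be one of the values listed below
                values:
                - my-pod1 # value of the app1 label
            # topologyKey defines what "the same location" means: nodes that share the same value
            # for this label key form one topology domain. With podAffinity the pod must land in a
            # domain that already runs a pod matching labelSelector (in the specified namespaces);
            # with podAntiAffinity it must stay out of such domains.
            topologyKey: kubernetes.io/hostname # same kubernetes.io/hostname value = same location (here: the same node)
    # Check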
    [root@k8s-m schedule]# kubectl  get pod   -o wide
    NAME                     READY     STATUS              RESTARTS   AGE       IP            NODE      NOMINATED NODE
    affinity-pod             1/1       Running             0          54s       10.244.1.98   node1     <none>
    my-pod1                  1/1       Running             0          54s       10.244.1.97   node1     <none>
    

      

    2. podAntiAffinity (keeps the pod off the node that runs some other pod; the opposite of the above)

    [root@k8s-m schedule]# cat  my-affinity-pod3.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod1
      labels: 
        app1: my-pod1
         
    spec:
      containers:
      - name: my-pod1
        image: nginx
        ports:
        - name: http
          containerPort: 80
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: affinity-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: affinity-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
      affinity:
        podAntiAffinity:  # the only change compared with the previous example
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
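              - key: app1 # label key, defined on the pod above
                operator: In # In: the label's value must be one of the values listed below
                values:
                - my-pod1 # value of the app1 label
            topologyKey: kubernetes.io/hostname # nodes with the same kubernetes.io/hostname value count as the same location, which this pod must avoid
    
    # Check (I have only one worker node, so the pod is Pending)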
    [root@k8s-m schedule]# kubectl  get pod 
    NAME                     READY     STATUS              RESTARTS   AGE
    affinity-pod             0/1       Pending             0          1m
    my-pod1                  1/1       Running             0          1m
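
    The role of topologyKey in both examples can be illustrated with a short Python sketch. It is deliberately simplified: the label selector is reduced to plain key/value equality, namespaces are ignored, and the second node (node2) and the zone label are hypothetical additions that are not part of the cluster shown above.

```python
def matching_domains(nodes, pods, selector, topology_key):
    """Topology values (domains) that already host a pod matching the selector."""
    domains = set()
    for pod in pods:
        if all(pod["labels"].get(k) == v for k, v in selector.items()):
            node_labels = nodes[pod["node"]]
            if topology_key in node_labels:
                domains.add(node_labels[topology_key])
    return domains


def candidates(nodes, pods, selector, topology_key, anti=False):
    """Nodes allowed by a required podAffinity (anti=False) or podAntiAffinity (anti=True) term."""
    domains = matching_domains(nodes, pods, selector, topology_key)
    allowed = []
    for name, labels in nodes.items():
        co_located = labels.get(topology_key) in domains
        if co_located != anti:
            allowed.append(name)
    return allowed


# node2 and the "zone" label are hypothetical, added to show the effect of topologyKey.
nodes = {
    "node1": {"kubernetes.io/hostname": "node1", "zone": "z1"},
    "node2": {"kubernetes.io/hostname": "node2", "zone": "z1"},
}
pods = [{"name": "my-pod1", "node": "node1", "labels": {"app1": "my-pod1"}}]
selector = {"app1": "my-pod1"}

print(candidates(nodes, pods, selector, "kubernetes.io/hostname"))             # ['node1']
print(candidates(nodes, pods, selector, "kubernetes.io/hostname", anti=True))  # ['node2']
print(candidates(nodes, pods, selector, "zone"))                               # ['node1', 'node2']
print(candidates(nodes, pods, selector, "zone", anti=True))                    # []
```

    With a single worker node, as in the output above, the anti-affinity case leaves no candidate at all, which is why affinity-pod stays Pending.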
    

      

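    VII. Taint-based scheduling

    A taint's effect defines how it repels pods:
    NoSchedule: # affects scheduling only; pods already running on the node are not affected
    NoExecute: # affects both scheduling and pods already running on the node; pods that do not tolerate the taint are evicted
    PreferNoSchedule: # a soft NoSchedule: the scheduler tries to avoid the node, but still places the pod there if no other node is suitable

    1. View and manage taints

    # View the node taints (Taints)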
    [root@k8s-m schedule]# kubectl  describe  node  k8s-m |grep Taints
    Taints:             node-role.kubernetes.io/master:NoSchedule
    
    [root@k8s-m schedule]# kubectl  describe  node  node1 |grep Taints
    Taints:             <none>
    
    # Manage taints
    kubectl  taint node  -h
    
    # Add a taint to the node (format key=value:effect; here the value is the string PreferNoSchedule and the effect is NoSchedule)
    kubectl  taint    node node1 node-type=PreferNoSchedule:NoSchedule 
    # Check
    [root@k8s-m schedule]# kubectl  describe  node  node1 |grep Taints
    Taints:             node-type=PreferNoSchedule:NoSchedule
    # Remove the taint
    [root@k8s-m ~]# kubectl taint node node1 node-type-
    node/node1 untainted
    # Check
    [root@k8s-m ~]# kubectl describe node node1 |grep Taints
    Taints:             <none>
    

      

    2. Using taints

    # Create a pod
    [root@k8s-m ~]# cat  mypod.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: my-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
    
    # Check the pod (it is Pending; the event below shows that both nodes carry a taint at this point)
    [root@k8s-m ~]# kubectl  get pod 
    NAME                     READY     STATUS              RESTARTS   AGE
    nginx-pod                0/1       Pending             0          32s
    
    # The pod does not tolerate the taints
    [root@k8s-m ~]# kubectl  describe pod nginx-pod|tail  -1
      Warning  FailedScheduling  3s (x22 over 1m)  default-scheduler  0/2 nodes are available: 2 node(s) had taints that the pod didn't tolerate.
    
    
    ### Add a toleration
    [root@k8s-m ~]# cat mypod.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
      labels: 
        app: my-pod
         
    spec:
      containers:
      - name: my-pod
        image: nginx
        ports:
        - name: http
          containerPort: 80
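      tolerations: # taints this pod tolerates
      - key: "node-type" # key of the taint defined earlier
        operator: "Equal" # Exists: tolerate the node-type taint whatever its value; Equal: key and value must match exactly
        value: "PreferNoSchedule" # taint value
        effect: "NoSchedule" # taint effect
        #tolerationSeconds: 3600  # how long the pod may keep running before it is evicted; only valid when effect is NoExecute
    
    	
    # Check (the pod has been scheduled)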
    [root@k8s-m ~]# kubectl  get pod  -o wide
    NAME                     READY     STATUS              RESTARTS   AGE       IP             NODE      NOMINATED NODE
    nginx-pod                1/1       Running             0          3m        10.244.1.100   node1     <none>
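
    How a toleration is compared with a taint can be sketched in Python as follows. This is a simplified model written for this article, not the scheduler's source: an empty effect or key on the toleration is treated as a wildcard, the operator defaults to Equal, and only NoSchedule and NoExecute taints block scheduling.

```python
def tolerates(toleration, taint):
    """True if this single toleration covers this single taint."""
    # An empty effect on the toleration matches any effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (used together with Exists) matches any taint key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True  # the value is ignored
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def pod_tolerates_node(tolerations, taints):
    """True if every hard taint on the node is covered by some toleration."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


# Taints as shown by `kubectl describe node` in the text.
node1_taints = [{"key": "node-type", "value": "PreferNoSchedule", "effect": "NoSchedule"}]
master_taints = [{"key": "node-role.kubernetes.io/master", "value": "", "effect": "NoSchedule"}]

# The toleration from mypod.yaml above.
equal_tol = [{"key": "node-type", "operator": "Equal",
              "value": "PreferNoSchedule", "effect": "NoSchedule"}]
exists_tol = [{"key": "node-type", "operator": "Exists", "effect": "NoSchedule"}]

print(pod_tolerates_node([], node1_taints))          # False -> Pending, as in the first attempt
print(pod_tolerates_node(equal_tol, node1_taints))   # True  -> can be scheduled onto node1
print(pod_tolerates_node(exists_tol, node1_taints))  # True  -> the value is not compared
print(pod_tolerates_node(equal_tol, master_taints))  # False -> the master stays off limits
```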
    

      

  • Original article: https://www.cnblogs.com/zhangb8042/p/10203266.html