zoukankan      html  css  js  c++  java
  • 如何做k8s worker节点资源预留?

    资源预留必要性

    以常见的kubeadm安装的k8s集群来说,默认情况下kubelet没有配置kube-reserverd和system-reserverd资源预留。worker node上的pod负载,理论上可以使用该节点服务器上的所有cpu和内存资源。比如某个deployment controller管理的pod存在bug,运行时无法正常释放内存,那么该worker node上的kubelet进程最终会抢占不到足够的内存,无法向kube-apiserver同步心跳状态,该worker node节点的状态进而被标记为NotReady。随后deployment controller会在另外一个worker节点上创建一个pod副本,又重复前述过程,压垮第二个worker node,最终整个k8s集群将面临“雪崩”危险。

    资源分类

    Node capacity:节点总的资源
    kube-reserved:预留给k8s进程的资源(如kubelet, container runtime, node problem detector等)
    system-reserved:预留给操作系统的资源(如sshd、udev等)
    eviction-threshold:kubelet eviction的阀值
    allocatable:留给pod的可用资源=Node capacity  -  kube-reserved  -  system-reserved  -  eviction-threshold

    配置方法

    以ubuntu 16.04+k8s v1.14的环境举例,配置步骤如下。

    1.修改/var/lib/kubelet/config.yaml

    enforceNodeAllocatable:
    - pods
    - kube-reserved
    - system-reserved
    systemReserved:
      cpu: "1"
      memory: "2Gi"
    kubeReserved:
      cpu: "1"
      memory: "2Gi"
    systemReservedCgroup: /system.slice
    kubeReservedCgroup: /system.slice/kubelet.service

     参数解释

      enforce-node-allocatable=pods,kube-reserved,system-reserved   #默认为pod设置,这里要给kube进程和system预留所以要加上。
      kube-reserved-cgroup=/system.slice/kubelet.service   #k8s组件对应的cgroup目录
      system-reserved-cgroup=/system.slice               #系统组件对应的cgroup目录
      kube-reserved=cpu=1,memory=2Gi                  #k8s组件资源预留大小
      system-reserved=cpu=2,memory=4Gi              #系统组件资源预留大小。结合主机配置和系统空载占用资源量的监控,实际测试确定。

      注:根据实际需求,也可以对ephemeral-storage做预留,例如“kube-reserved=cpu=1,memory=2Gi, ephemeral-storage=10Gi"

    2.修改/lib/systemd/system/kubelet.service

       由于cpuset和hugetlb这两个cgroup subsystem默认没有初始化system.slice,需要在启动进程前指定创建。

    [Unit]
    Description=kubelet: The Kubernetes Node Agent
    Documentation=https://kubernetes.io/docs/home/
    
    [Service]
    ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
    ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
    ExecStart=/usr/bin/kubelet
    Restart=always
    StartLimitInterval=0
    RestartSec=10
    
    [Install]
    WantedBy=multi-user.target

    3.重启kubelet进程

    systemctl restart kubelet
    systemctl status kubelet

    4.查看worker node的可用资源

      kubectl describe node [Your-NodeName]

    Capacity:
     cpu:                40
     ephemeral-storage:  197608716Ki
     hugepages-1Gi:      0
     hugepages-2Mi:      0
     memory:             131595532Ki
     pods:               110
    Allocatable:
     cpu:                37
     ephemeral-storage:  182116192365
     hugepages-1Gi:      0
     hugepages-2Mi:      0
     memory:             125201676Ki
     pods:               110

    参考文档:

    https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/

    https://my.oschina.net/jxcdwangtao/blog/1629059

  • 相关阅读:
    hihocoder 1049 后序遍历
    hihocoder 1310 岛屿
    Leetcode 63. Unique Paths II
    Leetcode 62. Unique Paths
    Leetcode 70. Climbing Stairs
    poj 3544 Journey with Pigs
    Leetcode 338. Counting Bits
    Leetcode 136. Single Number
    Leetcode 342. Power of Four
    Leetcode 299. Bulls and Cows
  • 原文地址:https://www.cnblogs.com/abcdef/p/11705969.html
Copyright © 2011-2022 走看看