  • Deploying a highly available Kubernetes cluster on CentOS 7.8 with kubeadm

    Original author: Zhangguanzhang

    Original link: http://zhangguanzhang.github.io/2019/11/24/kubeadm-base-use/

    Part 1: Basic system configuration

    Here we assume your system is up to date and was installed with a minimal installation.

    1. Make sure time is synchronized

    yum install chrony -y
    systemctl enable chronyd && systemctl restart chronyd
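
    To confirm that chrony is actually syncing (a quick check added here, not part of the original article):

    chronyc sources -v
    timedatectl status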

    2. Disable swap
    swapoff -a && sysctl -w vm.swappiness=0
    sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab

    3. Disable the firewall and SELinux
    systemctl stop firewalld && systemctl disable firewalld
    setenforce 0
    sed -ri '/^[^#]*SELINUX=/s#=.+$#=disabled#' /etc/selinux/config

    4. Disable NetworkManager. If your IPs are not managed by NetworkManager it is better to turn it off and use the network service instead; here we still use the network service.
    systemctl disable NetworkManager && systemctl stop NetworkManager
    systemctl restart network

    5. Install the EPEL repository and replace it with the Aliyun EPEL mirror
    yum install epel-release wget -y
    wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

    6. Install dependencies
    yum install -y \
        curl \
        git \
        conntrack-tools \
        psmisc \
        nfs-utils \
        jq \
        socat \
        bash-completion \
        ipset \
        ipvsadm \
        conntrack \
        libseccomp \
        net-tools \
        crontabs \
        sysstat \
        unzip \
        iftop \
        nload \
        strace \
        bind-utils \
        tcpdump \
        telnet \
        lsof \
        htop

    Part 2: Kernel modules that must be loaded at boot for kube-proxy in IPVS mode

    Following convention we load them with systemd-modules-load instead of putting modprobe lines in /etc/rc.local.

    vim /etc/modules-load.d/ipvs.conf
    
    ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    nf_conntrack
    br_netfilter

    systemctl daemon-reload && systemctl enable --now systemd-modules-load.service

    Confirm the kernel modules are loaded:

    [root@k8s-m1 ~]# lsmod | grep ip_v
    ip_vs_sh               12688  0 
    ip_vs_wrr              12697  0 
    ip_vs_rr               12600  0 
    ip_vs                 145497  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
    nf_conntrack          139264  1 ip_vs
    libcrc32c              12644  3 xfs,ip_vs,nf_conntrack

    Part 3: Setting system parameters

    All machines need the system parameters in /etc/sysctl.d/k8s.conf set. IPv6 support is currently not great, so IPv6 is disabled here as well.

    cat <<EOF > /etc/sysctl.d/k8s.conf
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.lo.disable_ipv6 = 1
    net.ipv4.neigh.default.gc_stale_time = 120
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    net.ipv4.conf.default.arp_announce = 2
    net.ipv4.conf.lo.arp_announce = 2
    net.ipv4.conf.all.arp_announce = 2
    net.ipv4.ip_forward = 1
    net.ipv4.tcp_max_tw_buckets = 5000
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_syn_backlog = 1024
    net.ipv4.tcp_synack_retries = 2
    # let iptables see bridged traffic (required by kube-proxy and CNI plugins)
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-arptables = 1
    net.netfilter.nf_conntrack_max = 2310720
    fs.inotify.max_user_watches=89100
    fs.may_detach_mounts = 1
    fs.file-max = 52706963
    fs.nr_open = 52706963
    vm.overcommit_memory=1
    vm.panic_on_oom=0
    EOF

    If kube-proxy uses IPVS, set the following TCP keepalive parameters to avoid timeouts:

    cat <<EOF >> /etc/sysctl.d/k8s.conf
    # https://github.com/moby/moby/issues/31208 
    # ipvsadm -l --timeout
    # fixes long-connection timeouts in IPVS mode; any value below 900 works
    net.ipv4.tcp_keepalive_time = 600
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_keepalive_probes = 10
    EOF
    sysctl --system
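
    After reloading, it is worth spot-checking that the bridge and forwarding parameters took effect (a small verification step added here):

    sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
    # both should print 1; if the bridge keys are missing, the br_netfilter module is not loaded yet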

    Tune journal logging to avoid collecting logs twice and wasting system resources, raise the default open-file limits for services started by systemd, and disable reverse DNS lookups for SSH.

    # the next two lines do not exist on apt-based systems; running them there is harmless
    sed -ri 's/^\$ModLoad imjournal/#&/' /etc/rsyslog.conf
    sed -ri 's/^\$IMJournalStateFile/#&/' /etc/rsyslog.conf
    
    sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
    sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf
    
    sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config

    Maximum open files; per convention this goes in a drop-in config file:

    cat>/etc/security/limits.d/kubernetes.conf<<EOF
    *       soft    nproc   131072
    *       hard    nproc   131072
    *       soft    nofile  131072
    *       hard    nofile  131072
    root    soft    nproc   131072
    root    hard    nproc   131072
    root    soft    nofile  131072
    root    hard    nofile  131072
    EOF

    Docker's official kernel check script recommends (RHEL7/CentOS7: User namespaces disabled; add 'user_namespace.enable=1' to boot command line); on yum-based systems enable it with the command below.

    grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
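
    To verify the kernel argument was added, and after a reboot that it is active, something like the following can be used (this check is my addition, not from the original article):

    grubby --info="$(grubby --default-kernel)" | grep args
    # after a reboot:
    grep -o 'user_namespace.enable=1' /proc/cmdline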

    Part 4: Install Docker

    Check whether the system kernel and modules are suitable for running Docker (Linux only). The script may fail to download because of the GFW; drop the redirection first to see whether the script is reachable at all.

    curl -s https://raw.githubusercontent.com/docker/docker/master/contrib/check-config.sh > check-config.sh
    bash ./check-config.sh

    Nowadays the Docker storage driver of choice is overlay2 (do not use devicemapper, it is full of pitfalls); the key thing to check is whether overlay2 shows up in green in the script output.

    Here we use the year-versioned docker-ce. Assuming we want to install k8s v1.18.5, go to https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG
    
    and search the corresponding CHANGELOG-1.18.md for "The list of validated docker versions remain" to find the Docker versions validated upstream. Your Docker version does not strictly have to be on that list; 19.03 has also been tested and works (19.03+ fixes a runc performance bug). Here we install Docker with the official install script (it supports CentOS and Ubuntu).

    export VERSION=19.03
    curl -fsSL "https://get.docker.com/" | bash -s -- --mirror Aliyun

    On all machines configure registry mirrors and set Docker's cgroup driver to systemd; systemd is the official recommendation, see https://kubernetes.io/docs/setup/cri/

    mkdir -p /etc/docker/
    cat>/etc/docker/daemon.json<<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "bip": "169.254.123.1/24",
      "oom-score-adjust": -1000,
      "registry-mirrors": [
          "https://fz5yth0r.mirror.aliyuncs.com",
          "https://dockerhub.mirrors.nwafu.edu.cn/",
          "https://mirror.ccs.tencentyun.com",
          "https://docker.mirrors.ustc.edu.cn/",
          "https://reg-mirror.qiniu.com",
          "http://hub-mirror.c.163.com/",
          "https://registry.docker-cn.com"
      ],
      "storage-driver": "overlay2",
      "storage-opts": [
        "overlay2.override_kernel_check=true"
      ],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "3"
      }
    }
    EOF

    Do not enable Live Restore: in some extreme situations (for example containers stuck in the Dead state) the only fix is restarting the docker daemon, and with Live Restore enabled the only remaining fix is rebooting the machine.

    Copy the bash completion script:

    cp /usr/share/bash-completion/completions/docker /etc/bash_completion.d/

    Start Docker and check that its info looks sane:

    systemctl enable --now docker
    docker info
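
    Besides eyeballing docker info, the two settings that matter most for kubeadm can be checked directly with docker's standard info format templates (a convenience check added here):

    docker info -f '{{.CgroupDriver}} / {{.Driver}}'
    # expected output: systemd / overlay2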

    Part 5: Deploying kube-nginx

    Here we use nginx as a local proxy. Because the local proxy runs on every machine, no SLB is needed and we are not affected by the fact that a VIP cannot be used inside a cloud VPC, but nginx has to run on every machine.
    Configure hosts on every machine:

    [root@k8s-m1 src]# cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    127.0.0.1 apiserver.k8s.local
    192.168.50.101 apiserver01.k8s.local
    192.168.50.102 apiserver02.k8s.local
    192.168.50.103 apiserver03.k8s.local
    192.168.50.101 k8s-m1
    192.168.50.102 k8s-m2
    192.168.50.103 k8s-m3
    192.168.50.104 k8s-node1
    192.168.50.105 k8s-node2
    192.168.50.106 k8s-node3

    Generate the nginx configuration file on every machine. The three apiserver hosts entries above can be omitted if you put IPs instead of domain names in the config below, but then changing an IP requires reloading nginx. Here I differ from the original author: I compile nginx by hand.

    mkdir -p /etc/kubernetes
    [root@k8s-m1 src]# cat /etc/kubernetes/nginx.conf 
    user nginx nginx;
    worker_processes auto;
    events {
        worker_connections  20240;
        use epoll;
    }
    error_log /var/log/kube_nginx_error.log info;
    
    stream {
        upstream kube-servers {
            hash  consistent;
            server apiserver01.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
            server apiserver02.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
            server apiserver03.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
        }
    
        server {
            listen 8443 reuseport;
            proxy_connect_timeout 3s;
            # increase the timeout
            proxy_timeout 3000s;
            proxy_pass kube-servers;
        }
    }

    Because the local proxy runs on every machine (no SLB needed, and no VPC VIP restriction), we compile and install kube-nginx here; it must be installed on every machine.

    yum install gcc gcc-c++ -y
    groupadd nginx
    useradd -r -g nginx nginx
    wget http://nginx.org/download/nginx-1.16.1.tar.gz -P /usr/local/src/
    cd /usr/local/src/
    tar zxvf nginx-1.16.1.tar.gz
    cd nginx-1.16.1/
    ./configure --with-stream --without-http --prefix=/usr/local/kube-nginx --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
    make && make install
    
    # write the systemd unit
    [root@k8s-m1 src]# cat /usr/lib/systemd/system/kube-nginx.service 
    [Unit]
    Description=kube-apiserver nginx proxy
    After=network.target
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=forking
    ExecStartPre=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx -t
    ExecStart=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx
    ExecReload=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx -s reload
    PrivateTmp=true
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    
    systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx
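
    A quick way to confirm the local proxy is up and listening on 8443 (simple checks added here for convenience; the curl will only succeed once the apiservers exist, and certificate warnings are expected):

    systemctl status kube-nginx --no-pager
    ss -tlnp | grep 8443
    curl -k https://apiserver.k8s.local:8443/healthz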

    Part 6: kubeadm deployment

    1. Configure the Aliyun Kubernetes repository

    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF

    2. Master nodes

    A k8s node is essentially kubelet plus a CRI runtime (usually docker); kubectl is a client that reads a kubeconfig and talks to kube-apiserver to operate the cluster; kubeadm does the deployment. So masters need all three installed, while ordinary nodes generally do not need kubectl.

    Install the packages on the masters:

    yum install -y \
        kubeadm-1.18.5 \
        kubectl-1.18.5 \
        kubelet-1.18.5 \
        --disableexcludes=kubernetes && \
        systemctl enable kubelet

    Install the packages on the worker nodes:

    yum install -y \
        kubeadm-1.18.5 \
        kubelet-1.18.5 \
        --disableexcludes=kubernetes && \
        systemctl enable kubelet

    Configure the cluster information (on the first master only)

    Print the default init configuration:

    kubeadm config print init-defaults > initconfig.yaml
    
    # let's look at the default init cluster parameters
    
    apiVersion: kubeadm.k8s.io/v1beta2
    bootstrapTokens:
    - groups:
      - system:bootstrappers:kubeadm:default-node-token
      token: abcdef.0123456789abcdef
      ttl: 24h0m0s
      usages:
      - signing
      - authentication
    kind: InitConfiguration
    localAPIEndpoint:
      advertiseAddress: 1.2.3.4
      bindPort: 6443
    nodeRegistration:
      criSocket: /var/run/dockershim.sock
      name: k8s-m1
      taints:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
    ---
    apiServer:
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.16.0
    networking:
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
    scheduler: {}

    We mainly care about, and only keep, the ClusterConfiguration section, then modify it. Refer to the v1beta2 docs below; older versions may use v1beta1, where some fields differ from the new API, so check godoc yourself.
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#hdr-Basics
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#pkg-constants
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ClusterConfiguration
    Change the IPs and the like to match your own environment; if you do not know how to compute CIDRs, do not change them blindly. Set controlPlaneEndpoint to a domain name (if the intranet has no DNS, hosts entries on every machine also work), an SLB, or a VIP; for the reasoning and caveats see https://zhangguanzhang.github.io/2019/03/11/k8s-ha/, where HA is explained thoroughly. The final yaml is below.

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    imageRepository: registry.aliyuncs.com/k8sxio
    kubernetesVersion: v1.18.5 # if the image list shows the wrong version, put the correct version number here
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    networking: #https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Networking
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
      podSubnet: 10.244.0.0/16
    controlPlaneEndpoint: apiserver.k8s.local:8443 # for a single master, write the master's IP here or omit this
    apiServer: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#APIServer
      timeoutForControlPlane: 4m0s
      extraArgs:
        authorization-mode: "Node,RBAC"
        enable-admission-plugins: "NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeClaimResize,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Priority,PodPreset"
        runtime-config: api/all=true,settings.k8s.io/v1alpha1=true
        storage-backend: etcd3
        etcd-servers: https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379
      certSANs:
      - 10.96.0.1 # the first IP of the service CIDR
      - 127.0.0.1 # with multiple masters, lets you debug via localhost quickly if the load balancer breaks
      - localhost
      - apiserver.k8s.local # the load balancer's domain name or VIP
      - 192.168.50.101
      - 192.168.50.102
      - 192.168.50.103
      - apiserver01.k8s.local
      - apiserver02.k8s.local
      - apiserver03.k8s.local
      - master
      - kubernetes
      - kubernetes.default 
      - kubernetes.default.svc 
      - kubernetes.default.svc.cluster.local
      extraVolumes:
      - hostPath: /etc/localtime
        mountPath: /etc/localtime
        name: localtime
        readOnly: true
    controllerManager: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ControlPlaneComponent
      extraArgs:
        bind-address: "0.0.0.0"
        experimental-cluster-signing-duration: 867000h
      extraVolumes:
      - hostPath: /etc/localtime
        mountPath: /etc/localtime
        name: localtime
        readOnly: true
    scheduler: 
      extraArgs:
        bind-address: "0.0.0.0"
      extraVolumes:
      - hostPath: /etc/localtime
        mountPath: /etc/localtime
        name: localtime
        readOnly: true
    dns: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#DNS
      type: CoreDNS # or kube-dns
      imageRepository: coredns # azk8s.cn is defunct; use the official coredns image on Docker Hub
      imageTag: 1.6.7  # the Aliyun registry currently only has 1.6.7; see Docker Hub for the latest
    etcd: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
      local:
        imageRepository: quay.io/coreos
        imageTag: v3.4.7
        dataDir: /var/lib/etcd
        serverCertSANs: # localhost, 127.0.0.1 and ::1 are included by default for the server and peer certs, no need to list them
        - master
        - 192.168.50.101
        - 192.168.50.102
        - 192.168.50.103
        - etcd01.k8s.local
        - etcd02.k8s.local
        - etcd03.k8s.local
        peerCertSANs:
        - master
        - 192.168.50.101
        - 192.168.50.102
        - 192.168.50.103
        - etcd01.k8s.local
        - etcd02.k8s.local
        - etcd03.k8s.local
        extraArgs: # there is no extraVolumes option for etcd yet
          auto-compaction-retention: "1h"
          max-request-bytes: "33554432"
          quota-backend-bytes: "8589934592"
          enable-v2: "false" # disable etcd v2 api
      # external: # configure like this when using an external etcd, see https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
        # endpoints:
        # - "https://172.19.0.2:2379"
        # - "https://172.19.0.3:2379"
        # - "https://172.19.0.4:2379"
        # caFile: "/etc/kubernetes/pki/etcd/ca.crt"
        # certFile: "/etc/kubernetes/pki/etcd/etcd.crt"
        # keyFile: "/etc/kubernetes/pki/etcd/etcd.key"
    ---
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration # https://godoc.org/k8s.io/kube-proxy/config/v1alpha1#KubeProxyConfiguration
    mode: ipvs # or iptables
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: "rr" # scheduling algorithm
      syncPeriod: 15s
    iptables:
      masqueradeAll: true
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ---
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration # https://godoc.org/k8s.io/kubelet/config/v1beta1#KubeletConfiguration
    cgroupDriver: systemd
    failSwapOn: true # set this to false if swap is enabled

    Check the file for mistakes and ignore warnings; real problems are reported as errors, and if everything is fine the output ends with something containing the string kubeadm join xxx:

    kubeadm init --config initconfig.yaml --dry-run

    Check that the images are correct; if the version number is wrong, set kubernetesVersion in the yaml to your intended version:

    kubeadm config images list --config initconfig.yaml

    Pre-pull the images:

    kubeadm config images pull --config initconfig.yaml # sample output below
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-apiserver:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-controller-manager:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-scheduler:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-proxy:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/pause:3.1
    [config/images] Pulled quay.azk8s.cn/coreos/etcd:v3.4.7
    [config/images] Pulled coredns/coredns:1.6.3

    Part 7: kubeadm init

    Run the init below only on the first master.

    # --experimental-upload-certs uploads the relevant certificates to etcd, saving us from distributing them manually
    # note: in v1.15+ this became the stable flag --upload-certs; on earlier versions use --experimental-upload-certs
    
    kubeadm init --config initconfig.yaml --upload-certs

    If it times out, check whether kubelet failed to start; for debugging see https://github.com/zhangguanzhang/Kubernetes-ansible/wiki/systemctl-running-debug

    Remember the token printed after init, and copy the kubeconfig for kubectl; kubectl's default kubeconfig path is ~/.kube/config:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
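
    With the kubeconfig in place, a couple of quick checks confirm kubectl can reach the apiserver through the local proxy (added here as a sanity check):

    kubectl cluster-info
    kubectl get node
    kubectl -n kube-system get pod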

    The init yaml is actually stored in a ConfigMap in the cluster and can be inspected at any time; it is used when other nodes and masters join:

    kubectl -n kube-system get cm kubeadm-config -o yaml

    If you run a single master and do not plan to add other nodes, remove the taint from the master node; this is not needed for the multi-master steps below.

    kubectl taint nodes --all node-role.kubernetes.io/master-

    Set up RBAC for the healthz endpoint

    kube-apiserver's health-check web route requires authorization; we need to open it up for monitoring or for SLB health checks. The yaml file is https://github.com/zhangguanzhang/Kubernetes-ansible-base/blob/roles/master/files/healthz-rbac.yml

    kubectl apply -f https://raw.githubusercontent.com/zhangguanzhang/Kubernetes-ansible-base/roles/master/files/healthz-rbac.yml

    Configure the k8s control-plane components on the other masters

    Manual copy (only for older versions that cannot upload certificates; skip this step if you passed the upload-certs option to kubeadm init above)

    Copy the CA certificates from the first master to the other master nodes. Because ssh would prompt for a password interactively, we install sshpass; here zhangguanzhang is the root password.

    yum install sshpass -y
    alias ssh='sshpass -p zhangguanzhang ssh -o StrictHostKeyChecking=no'
    alias scp='sshpass -p zhangguanzhang scp -o StrictHostKeyChecking=no'

    Copy the CA certificates to the other master nodes:

    for node in 172.19.0.3 172.19.0.4;do
        ssh $node 'mkdir -p /etc/kubernetes/pki/etcd'
        scp -r /etc/kubernetes/pki/ca.* $node:/etc/kubernetes/pki/
        scp -r /etc/kubernetes/pki/sa.* $node:/etc/kubernetes/pki/
        scp -r /etc/kubernetes/pki/front-proxy-ca.* $node:/etc/kubernetes/pki/
        scp -r /etc/kubernetes/pki/etcd/ca.* $node:/etc/kubernetes/pki/etcd/
    done
    
    Have the other masters join:
    
    kubeadm join apiserver.k8s.local:8443 --token vo6qyo.4cm47w561q9p830v \
        --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671 \
        --control-plane --certificate-key ba869da2d611e5afba5f9959a5f18891c20fb56d90592225765c0b965e3d8783

    If you forget the token you can list it with kubeadm token list or create a new one with kubeadm token create.
    The sha256 value can be obtained with the following command:

    openssl x509 -pubkey -in \
        /etc/kubernetes/pki/ca.crt | \
        openssl rsa -pubin -outform der 2>/dev/null | \
        openssl dgst -sha256 -hex | sed 's/^.* //'
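
    On recent kubeadm versions the whole join command (token plus hash) can also be printed in one go, which is usually easier than assembling it by hand:

    kubeadm token create --print-join-command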

    Set up the kubectl completion script:

    kubectl completion bash > /etc/bash_completion.d/kubectl

    Configure etcdctl on all masters

    Copy etcdctl out of the container:

    docker cp `docker ps -a | awk '/k8s_etcd/{print $1}'`:/usr/local/bin/etcdctl /usr/local/bin/etcdctl

    Since around v1.13, k8s uses the etcd v3 API by default; here we configure parameters for etcdctl:

    cat >/etc/profile.d/etcd.sh<<'EOF'
    ETCD_CERT_DIR=/etc/kubernetes/pki/etcd/
    ETCD_CA_FILE=ca.crt
    ETCD_KEY_FILE=healthcheck-client.key
    ETCD_CERT_FILE=healthcheck-client.crt
    ETCD_EP=https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379
    
    alias etcd_v2="etcdctl --cert-file ${ETCD_CERT_DIR}/${ETCD_CERT_FILE} \
                  --key-file ${ETCD_CERT_DIR}/${ETCD_KEY_FILE} \
                  --ca-file ${ETCD_CERT_DIR}/${ETCD_CA_FILE} \
                  --endpoints $ETCD_EP"
    
    alias etcd_v3="ETCDCTL_API=3 \
        etcdctl \
        --cert ${ETCD_CERT_DIR}/${ETCD_CERT_FILE} \
        --key ${ETCD_CERT_DIR}/${ETCD_KEY_FILE} \
        --cacert ${ETCD_CERT_DIR}/${ETCD_CA_FILE} \
        --endpoints $ETCD_EP"
    EOF

    Re-ssh, or load the environment manually with: . /etc/profile.d/etcd.sh

    [root@k8s-m1 ~]# etcd_v3 endpoint status --write-out=table
    +-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    |          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
    +-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    | https://192.168.50.101:2379 | 9fdaf6a25119065e |   3.4.7 |  3.1 MB |     false |      false |         5 |     305511 |             305511 |        |
    | https://192.168.50.102:2379 | a3d9d41cf6d05e08 |   3.4.7 |  3.1 MB |      true |      false |         5 |     305511 |             305511 |        |
    | https://192.168.50.103:2379 | 3b34476e501895d4 |   3.4.7 |  3.0 MB |     false |      false |         5 |     305511 |             305511 |        |
    +-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

    Configure the etcd backup script

    mkdir -p /opt/etcd
    cat>/opt/etcd/etcd_cron.sh<<'EOF'
    #!/bin/bash
    set -e
    
    export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
    
    :  ${bak_dir:=/root/} # default backup directory; change it to an existing directory
    :  ${cert_dir:=/etc/kubernetes/pki/etcd/}
    :  ${endpoints:=https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379}
    
    bak_prefix='etcd-'
    cmd_suffix='date +%Y-%m-%d-%H:%M'
    bak_suffix='.db'
    
    # assign the normalized command-line arguments to the positional parameters ($1, $2, ...)
    temp=`getopt -n $0 -o c:d: -u -- "$@"`
    
    [ $? != 0 ] && {
        echo '
    Examples:
      # just save once
      bash $0 /tmp/etcd.db
      # save in contab and  keep 5
      bash $0 -c 5
        '
        exit 1
        }
    set -- $temp
    
    
    # -c  number of backup copies to keep
    # -d  directory in which to store the backups
    while true;do
        case "$1" in
            -c)
                [ -z "$bak_count" ] && bak_count=$2
                printf -v null %d "$bak_count" &>/dev/null || 
                    { echo 'the value of the -c must be number';exit 1; }
                shift 2
                ;;
            -d)
                [ ! -d "$2" ] && mkdir -p $2
                bak_dir=$2
                shift 2
                ;;
             *)
                [[ -z "$1" || "$1" == '--' ]] && { shift;break; }
                echo "Internal error!"
                exit 1
                ;;
        esac
    done
    
    
    function etcd_v2(){
    
        etcdctl --cert-file $cert_dir/healthcheck-client.crt \
                --key-file  $cert_dir/healthcheck-client.key \
                --ca-file   $cert_dir/ca.crt \
                --endpoints $endpoints $@
    }
    
    function etcd_v3(){
    
        ETCDCTL_API=3 etcdctl \
           --cert $cert_dir/healthcheck-client.crt \
           --key  $cert_dir/healthcheck-client.key \
           --cacert $cert_dir/ca.crt \
           --endpoints $endpoints $@
    }
    
    etcd::cron::save(){
        cd $bak_dir/
        etcd_v3 snapshot save  $bak_prefix$($cmd_suffix)$bak_suffix
        rm_files=`ls -t $bak_prefix*$bak_suffix | tail -n +$[bak_count+1]`
        if [ -n "$rm_files" ];then
            rm -f $rm_files
        fi
    }
    
    main(){
        [ -n "$bak_count" ] && etcd::cron::save || etcd_v3 snapshot save $@
    }
    
    main $@
    EOF

    Add the following to crontab -e to automatically keep four backup copies:

    bash /opt/etcd/etcd_cron.sh  -c 4 -d /opt/etcd/ &>/dev/null
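
    A crontab entry needs a schedule field in front of the command; for example, to run the backup hourly (the schedule below is my own choice, adjust it to your needs):

    0 * * * * bash /opt/etcd/etcd_cron.sh -c 4 -d /opt/etcd/ &>/dev/null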

    Nodes

    Do the same as before:

    • Apply the system configuration
    • Set the hostname
    • Install docker-ce
    • Set up hosts and nginx
    • Configure the package repository and install kubeadm and kubelet

    Joining is the same as for masters: prepare the environment and docker in advance, then join without --control-plane. If there is only one master, the endpoint you join against is the controlPlaneEndpoint value.

    kubeadm join apiserver.k8s.local:8443 --token vo6qyo.4cm47w561q9p830v \
        --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671
    
    [root@k8s-m1 ~]# kubectl get node
    NAME        STATUS   ROLES    AGE    VERSION
    k8s-m1      Ready    master   23h    v1.18.5
    k8s-m2      Ready    master   23h    v1.18.5
    k8s-m3      Ready    master   23h    v1.18.5
    k8s-node1   Ready    node     23h    v1.18.5
    k8s-node2   Ready    node     121m   v1.18.5
    k8s-node3   Ready    node     82m    v1.18.5

    Addons (from this chapter to the end, run on any single master)

    The container network is not set up yet, so coredns cannot get an IP and stays Pending. Here I deploy flannel; if you understand BGP you can use calico instead.
    The yaml file comes from flannel's official GitHub: https://github.com/coreos/flannel/tree/master/Documentation

    Modifications

    • If you use PSP on a version before 1.16, policy/v1beta1 has to be changed to extensions/v1beta1; no change is needed here

    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy

    - Change the rbac apiVersion as below; stop using v1beta1. Modify it with the following command:

    sed -ri '/apiVersion: rbac/s#v1.+#v1#' kube-flannel.yml

    - The official yaml ships DaemonSets for four architectures; we delete everything except amd64, roughly line 227 to the end:

    sed -ri '227,$d' kube-flannel.yml

    - If you changed the pod CIDR, change it here as well. If all nodes are in the same layer-2 network, you can switch vxlan to the higher-performance host-gw mode; with vxlan the security group must allow UDP port 8472.

    net-conf.json: |
      {
        "Network": "10.244.0.0/16",
        "Backend": {
          "Type": "vxlan"
        }
      }

    - Adjust the limits; they must not be lower than the requests.

    limits:
      cpu: "200m"
      memory: "100Mi"

    Deploy flannel

    (I did not actually run into the error below.)

    Since 1.15 a node's pod CIDR is an array rather than a single value, so flannel 0.11 and earlier will hit the error below; see:
    https://github.com/kubernetes/kubernetes/blob/v1.15.0/staging/src/k8s.io/api/core/v1/types.go#L3890-L3893
    https://github.com/kubernetes/kubernetes/blob/v1.18.2/staging/src/k8s.io/api/core/v1/types.go#L4206-L4216

    Error registering network: failed to acquire lease: node "xxx" pod cidr not assigned

    Patch it manually; remember to patch nodes added later as well.

    nodes=`kubectl get node --no-headers | awk '{print $1}'`
    for node in $nodes;do
        cidr=`kubectl get node "$node" -o jsonpath='{.spec.podCIDRs[0]}'`
        [ -z "$(kubectl get node $node -o jsonpath='{.spec.podCIDR}')" ] && {
            kubectl patch node "$node" -p '{"spec":{"podCIDR":"'"$cidr"'"}}' 
        }
    done

    The final kube-flannel.yml is as follows:

    [root@k8s-m1 ~]# cat kube-flannel.yml 
    ---
    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy
    metadata:
      name: psp.flannel.unprivileged
      annotations:
        seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
        seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
        apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
        apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
    spec:
      privileged: false
      volumes:
        - configMap
        - secret
        - emptyDir
        - hostPath
      allowedHostPaths:
        - pathPrefix: "/etc/cni/net.d"
        - pathPrefix: "/etc/kube-flannel"
        - pathPrefix: "/run/flannel"
      readOnlyRootFilesystem: false
      # Users and groups
      runAsUser:
        rule: RunAsAny
      supplementalGroups:
        rule: RunAsAny
      fsGroup:
        rule: RunAsAny
      # Privilege Escalation
      allowPrivilegeEscalation: false
      defaultAllowPrivilegeEscalation: false
      # Capabilities
      allowedCapabilities: ['NET_ADMIN']
      defaultAddCapabilities: []
      requiredDropCapabilities: []
      # Host namespaces
      hostPID: false
      hostIPC: false
      hostNetwork: true
      hostPorts:
      - min: 0
        max: 65535
      # SELinux
      seLinux:
        # SELinux is unused in CaaSP
        rule: 'RunAsAny'
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: flannel
    rules:
      - apiGroups: ['extensions']
        resources: ['podsecuritypolicies']
        verbs: ['use']
        resourceNames: ['psp.flannel.unprivileged']
      - apiGroups:
          - ""
        resources:
          - pods
        verbs:
          - get
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          - list
          - watch
      - apiGroups:
          - ""
        resources:
          - nodes/status
        verbs:
          - patch
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: flannel
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: flannel
    subjects:
    - kind: ServiceAccount
      name: flannel
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: flannel
      namespace: kube-system
    ---
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: kube-flannel-cfg
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    data:
      cni-conf.json: |
        {
          "name": "cbr0",
          "cniVersion": "0.3.1",
          "plugins": [
            {
              "type": "flannel",
              "delegate": {
                "hairpinMode": true,
                "isDefaultGateway": true
              }
            },
            {
              "type": "portmap",
              "capabilities": {
                "portMappings": true
              }
            }
          ]
        }
      net-conf.json: |
        {
          "Network": "10.244.0.0/16",
          "Backend": {
            "Type": "host-gw"
          }
        }
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: kube-flannel-ds-amd64
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    spec:
      selector:
        matchLabels:
          app: flannel
      template:
        metadata:
          labels:
            tier: node
            app: flannel
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: kubernetes.io/os
                        operator: In
                        values:
                          - linux
                      - key: kubernetes.io/arch
                        operator: In
                        values:
                          - amd64
          hostNetwork: true
          tolerations:
          - operator: Exists
            effect: NoSchedule
          serviceAccountName: flannel
          initContainers:
          - name: install-cni
            image: quay.io/coreos/flannel:v0.12.0-amd64
            command:
            - cp
            args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
            volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
          containers:
          - name: kube-flannel
            image: quay.io/coreos/flannel:v0.12.0-amd64
            command:
            - /opt/bin/flanneld
            args:
            - --ip-masq
            - --kube-subnet-mgr
            resources:
              requests:
                cpu: "100m"
                memory: "50Mi"
              limits:
                cpu: "200m"
                memory: "100Mi"
            securityContext:
              privileged: false
              capabilities:
                add: ["NET_ADMIN"]
            env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
          volumes:
            - name: run
              hostPath:
                path: /run/flannel
            - name: cni
              hostPath:
                path: /etc/cni/net.d
            - name: flannel-cfg
              configMap:
                name: kube-flannel-cfg

    host-gw mode is used here because I ran into a UDP kernel bug; for details see https://zhangguanzhang.github.io/2020/05/23/k8s-vxlan-63-timeout/

    kubectl apply -f kube-flannel.yml
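
    After applying, it is worth watching the flannel DaemonSet roll out and checking that coredns leaves Pending (verification commands added here; the DaemonSet name matches the yaml above):

    kubectl -n kube-system get ds kube-flannel-ds-amd64
    kubectl -n kube-system get pod -o wide | grep -E 'flannel|coredns'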

    Verify cluster availability

    kubectl -n kube-system get pod -o wide

    Once all pods in the kube-system namespace are Running, test cluster availability:

    cat<<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - image: nginx:alpine
            name: nginx
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      selector:
        app: nginx
      ports:
        - protocol: TCP
          port: 80
          targetPort: 80
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
      namespace: default
    spec:
      containers:
      - name: busybox
        image: zhangguanzhang/centos
        command:
          - sleep
          - "3600"
        imagePullPolicy: IfNotPresent
      restartPolicy: Always
    EOF

    Wait for the pods to be Running.

    Verify cluster DNS:

    $ kubectl exec -ti busybox -- nslookup kubernetes
    Server:    10.96.0.10
    Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
    
    Name:      kubernetes
    Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
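
    Beyond DNS, it is worth checking that the nginx Service actually routes traffic. The busybox pod above runs the zhangguanzhang/centos image, which I assume ships curl; if it does not, substitute any client available in the image:

    kubectl exec -ti busybox -- curl -s nginx
    # should return the default nginx welcome page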

    For more on the kubeadm workflow and its detailed parameter options, see the original author's articles linked at the top.

    Adding a new node

    1. Initialize CentOS 7

    Initialization script:

    #!/bin/bash
    
    #---- Time synchronization ----
    echo "Configuring time synchronization"
    yum install chrony -y
    mv /etc/chrony.conf /etc/chrony.conf.bak
    cat>/etc/chrony.conf<<EOF
    server ntp.aliyun.com iburst
    stratumweight 0
    driftfile /var/lib/chrony/drift
    rtcsync
    makestep 10 3
    bindcmdaddress 127.0.0.1
    bindcmdaddress ::1
    keyfile /etc/chrony.keys
    commandkey 1
    generatecommandkey
    logchange 0.5
    logdir /var/log/chrony
    EOF
    /usr/bin/systemctl enable chronyd
    /usr/bin/systemctl restart chronyd
    
    #--- Disable swap ---
    echo "Disabling swap"
    swapoff -a && sysctl -w vm.swappiness=0
    sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
    
    #--- Disable firewall and SELinux ---
    echo "Disabling firewall and SELinux"
    systemctl stop firewalld
    systemctl disable firewalld
    setenforce 0
    sed -ri '/^[^#]*SELINUX=/s#=.+$#=disabled#' /etc/selinux/config
    
    #--- Disable NetworkManager ---
    echo "Disabling NetworkManager"
    systemctl disable NetworkManager
    systemctl stop NetworkManager
    
    #--- Install the EPEL repo and replace it with the Aliyun EPEL mirror ---
    yum install epel-release wget -y
    wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
    
    #--- Install dependencies ---
    echo "Installing dependencies"
    yum install -y \
        curl \
        git \
        conntrack-tools \
        psmisc \
        nfs-utils \
        jq \
        socat \
        bash-completion \
        ipset \
        ipvsadm \
        conntrack \
        libseccomp \
        net-tools \
        crontabs \
        sysstat \
        unzip \
        iftop \
        nload \
        strace \
        bind-utils \
        tcpdump \
        telnet \
        lsof \
        htop
    
    #--- Modules that must be loaded at boot for IPVS mode ---
    echo "Loading kernel modules required for IPVS mode"
    cat>/etc/modules-load.d/ipvs.conf<<EOF
    ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    nf_conntrack
    br_netfilter
    EOF
    systemctl daemon-reload
    systemctl enable --now systemd-modules-load.service
    
    #--- Set system parameters ---
    cat <<EOF > /etc/sysctl.d/k8s.conf
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.lo.disable_ipv6 = 1
    net.ipv4.neigh.default.gc_stale_time = 120
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    net.ipv4.conf.default.arp_announce = 2
    net.ipv4.conf.lo.arp_announce = 2
    net.ipv4.conf.all.arp_announce = 2
    net.ipv4.ip_forward = 1
    net.ipv4.tcp_max_tw_buckets = 5000
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_syn_backlog = 1024
    net.ipv4.tcp_synack_retries = 2
    # let iptables see bridged traffic (required by kube-proxy and CNI plugins)
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-arptables = 1
    net.netfilter.nf_conntrack_max = 2310720
    fs.inotify.max_user_watches=89100
    fs.may_detach_mounts = 1
    fs.file-max = 52706963
    fs.nr_open = 52706963
    vm.overcommit_memory=1
    vm.panic_on_oom=0
    # https://github.com/moby/moby/issues/31208
    # ipvsadm -l --timeout
    # fixes long-connection timeouts in IPVS mode; any value below 900 works
    net.ipv4.tcp_keepalive_time = 600
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_keepalive_probes = 10
    EOF
    sysctl --system
    
    #--- Tune journal logging ---
    sed -ri 's/^\$ModLoad imjournal/#&/' /etc/rsyslog.conf
    sed -ri 's/^\$IMJournalStateFile/#&/' /etc/rsyslog.conf
    sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
    sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf
    sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config
    
    #--- Raise the maximum number of open files ---
    cat>/etc/security/limits.d/kubernetes.conf<<EOF
    *       soft    nproc   131072
    *       hard    nproc   131072
    *       soft    nofile  131072
    *       hard    nofile  131072
    root    soft    nproc   131072
    root    hard    nproc   131072
    root    soft    nofile  131072
    root    hard    nofile  131072
    EOF
    
    #--- Set user_namespace.enable=1 ---
    grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"

    2. Compile and install nginx (also copy /etc/kubernetes/nginx.conf and the kube-nginx.service unit from an existing node, the same files as in Part 5)

    yum install gcc gcc-c++ -y
    wget http://nginx.org/download/nginx-1.16.1.tar.gz
    tar zxvf nginx-1.16.1.tar.gz
    cd nginx-1.16.1/
    ./configure --with-stream --without-http --prefix=/usr/local/kube-nginx --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
    make && make install
    groupadd nginx
    useradd -r -g nginx nginx
    systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx

    3. Regenerate the token

    kubeadm token create
    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
    kubeadm join apiserver.k8s.local:8443 --token 8ceduc.cy0r23j2hpsw80ff     --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671

    error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "vo6qyo"

    When you hit this error, you need to regenerate the token as shown above.

  • Source: https://www.cnblogs.com/skymyyang/p/13279006.html