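  • k8s Deployment Guide

    k8s Deployment Guide

    I. About this document

    Author: lanjx

    Email: lanheader@163.com

    Blog: https://www.cnblogs.com/lanheader/

    Last updated: 2021-07-09

    II. Deploying with kubeadm

    Note: unless stated otherwise, run every command on all nodes (k8s-master and k8s-node).

    1. Environment preparation

    Prepare three hosts (adjust the addresses to your own environment):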

    192.168.8.158 master

    192.168.8.159 node1

    192.168.8.160 node2
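
The address list above is already in /etc/hosts format. Assuming you want the nodes to resolve one another by name (the original does not state this), a minimal sketch for adding the entries idempotently could look like the following; `add_hosts` is a hypothetical helper name.

```shell
# Append "IP NAME" entries to a hosts file, skipping names already present.
# Usage: add_hosts FILE "IP NAME" ["IP NAME" ...]
add_hosts() {
  local file="$1"; shift
  local entry ip name
  for entry in "$@"; do
    ip="${entry%% *}"     # text before the first space
    name="${entry##* }"   # text after the last space
    # Look for the name as a whole field on a non-comment line.
    if ! grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
      printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
  done
}
```

As root on every node this would be called as `add_hosts /etc/hosts "192.168.8.158 master" "192.168.8.159 node1" "192.168.8.160 node2"`. Running it twice leaves the file unchanged.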

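    1.1 Set the hostname

    # Run the matching command on each host; hostnamectl makes the name persist across reboots
    hostnamectl set-hostname master   # on 192.168.8.158
    hostnamectl set-hostname node1    # on 192.168.8.159
    hostnamectl set-hostname node2    # on 192.168.8.160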
    

    1.2 Disable the firewall

    $ systemctl stop firewalld.service 
    $ systemctl disable firewalld.service 
    $ yum upgrade 
    

    1.3 Disable swap

    Note: since Kubernetes 1.8 the kubelet will not start while swap is enabled.

    $ swapoff -a
    $ cp /etc/fstab /etc/fstab_bak
    $ cat /etc/fstab_bak |grep -v swap > /etc/fstab
    $ cat /etc/fstab
    
    # /etc/fstab
    # Created by anaconda on Tue Jul 21 11:51:16 2020
    #
    # Accessible filesystems, by reference, are maintained under '/dev/disk'
    # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
    #
    /dev/mapper/centos_virtual--machine-root /                       xfs     defaults        0 0
    UUID=1694f89b-5c62-4a4a-9c86-46c3f202e4f6 /boot                   xfs     defaults        0 0
    /dev/mapper/centos_virtual--machine-home /home                   xfs     defaults        0 0
    #/dev/mapper/centos_virtual--machine-swap swap                    swap    defaults        0 0
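
The sample output above shows the swap entry commented out, whereas `grep -v swap` deletes every line containing the word "swap". If you would rather keep the entry as a comment, a sketch along these lines could be used instead; `disable_swap_entries` is a hypothetical helper and GNU sed is assumed.

```shell
# Comment out every active swap entry in an fstab-style file, keeping a .bak copy.
disable_swap_entries() {
  local fstab="${1:-/etc/fstab}"
  cp -p "$fstab" "${fstab}.bak"
  # On non-comment lines, prefix '#' when the third field (fs type) is "swap".
  sed -i -E '/^[[:space:]]*#/! s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab"
}
```

Matching on the third field rather than on the word "swap" anywhere in the line avoids touching unrelated entries whose device name or mount point happens to contain that word.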
    

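    1.4 Adjust the kernel parameters for iptables

    Some users on RHEL / CentOS 7 have reported traffic being routed incorrectly because iptables was bypassed. Create /etc/sysctl.d/k8s.conf with the following content: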

    $ cat <<EOF >  /etc/sysctl.d/k8s.conf
    vm.swappiness = 0
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    EOF
    

    Apply the configuration:

    $ modprobe br_netfilter 
    $ sysctl -p /etc/sysctl.d/k8s.conf 
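
To confirm that the settings took effect, the configured values can be compared with the live ones under /proc/sys. The following is a minimal sketch; `check_sysctl_file` and the `PROC_SYS` override are hypothetical names introduced here for illustration and testing.

```shell
# Print every key in a sysctl-style file whose live value differs from the
# configured one. Returns non-zero if any mismatch is found.
# PROC_SYS may point at another directory for testing; default is /proc/sys.
check_sysctl_file() {
  local conf="$1" root="${PROC_SYS:-/proc/sys}"
  local key want path have rc=0
  while IFS='=' read -r key want; do
    key=$(printf '%s' "$key" | tr -d '[:space:]')
    want=$(printf '%s' "$want" | tr -d '[:space:]')
    case "$key" in ''|'#'*) continue ;; esac
    # net.ipv4.ip_forward lives at /proc/sys/net/ipv4/ip_forward
    path="$root/$(printf '%s' "$key" | tr . /)"
    have=$(cat "$path" 2>/dev/null || echo missing)
    if [ "$have" != "$want" ]; then
      echo "$key: want $want, have $have"
      rc=1
    fi
  done < "$conf"
  return $rc
}
```

Calling `check_sysctl_file /etc/sysctl.d/k8s.conf` prints nothing when all values match. A "missing" result for the net.bridge keys usually means br_netfilter has not been loaded yet.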
    

    1.5 Load the ipvs modules

    $ cat > /etc/sysconfig/modules/ipvs.modules <<EOF
    #!/bin/bash
    modprobe -- ip_vs
    modprobe -- ip_vs_rr
    modprobe -- ip_vs_wrr
    modprobe -- ip_vs_sh
    modprobe -- nf_conntrack_ipv4
    EOF
    

    Make the script executable, run it, and confirm the modules are loaded (one long command):

    $ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
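
Instead of reading the lsmod output by eye, a script can report which of the required modules are still missing. This is a sketch only; `missing_modules` and the `MODULES_FILE` override are hypothetical names, and modules compiled into the kernel do not appear in /proc/modules, so they would be reported as missing.

```shell
# Print each requested kernel module that is not currently loaded.
# MODULES_FILE may point at another file for testing; default is /proc/modules.
missing_modules() {
  local loaded="${MODULES_FILE:-/proc/modules}" m
  for m in "$@"; do
    # The first field of every /proc/modules line is the module name.
    awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$loaded" || echo "$m"
  done
}
```

For the list used above the call would be `missing_modules ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4`; empty output means everything is loaded.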
    

    1.6 Install Docker

    # Remove any old Docker installation (stop and delete containers, then images)
    $ docker stop `docker ps -a -q`
    $ docker rm `docker ps -a -q`
    $ docker rmi -f `docker images -a -q`   # force-deletes every image
    # Remove the old packages
    $ yum -y remove docker docker-common container-selinux
    # Add the Docker repository (yum-config-manager is provided by the yum-utils package)
    $ yum-config-manager \
        --add-repo \
        https://docs.docker.com/v1.13/engine/installation/linux/repo_files/centos/docker.repo
    # Install Docker
    # Refresh the yum cache
    $ yum makecache fast
    # List the available versions and pick one
    $ yum list docker-engine.x86_64  --showduplicates |sort -r
    $ yum -y install docker-engine-<VERSION_STRING>
    $ docker -v 
    # Start Docker and enable it at boot
    $ systemctl start docker
    $ systemctl enable docker
    # To uninstall later, if ever needed
    $ yum -y remove docker-engine docker-engine-selinux
    

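    1.7 Create shared storage

    Follow these steps if you choose to use an NFS server.

    Note:

    The export line must be /data/k8s *(rw,sync,no_root_squash). Without no_root_squash, pods fail to start with a permission error.

    # Install the NFS packages
    $ yum -y install nfs-utils rpcbind
    # Create the export directory
    $ mkdir -p /data/k8s/
    # Set its permissions
    $ chmod 755 /data/k8s/
    # Add the export to /etc/exports
    $ vim /etc/exports
    /data/k8s  *(rw,sync,no_root_squash)
    # Start the services
    $ systemctl start rpcbind.service
    $ systemctl enable rpcbind
    $ systemctl status rpcbind
    $ systemctl start nfs.service
    $ systemctl enable nfs
    $ systemctl status nfs
    # On each node, check that the export is visible (replace 192.168.1.109 with your NFS server address)
    $ showmount -e 192.168.1.109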
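
When the steps above are scripted, editing /etc/exports in vim is not practical. A minimal sketch that appends the export line only if the directory is not exported yet might look like this; `ensure_export` is a hypothetical helper name.

```shell
# Add an export line unless that exact directory is already listed.
# Usage: ensure_export DIR OPTIONS [FILE]   (FILE defaults to /etc/exports)
ensure_export() {
  local dir="$1" opts="$2" file="${3:-/etc/exports}"
  touch "$file"
  # The first field of an exports line is the exported directory.
  if ! awk -v d="$dir" '$1 == d { found = 1 } END { exit !found }' "$file"; then
    printf '%s  %s\n' "$dir" "$opts" >> "$file"
  fi
}
```

On the NFS server this would be run as `ensure_export /data/k8s '*(rw,sync,no_root_squash)'`, followed by `exportfs -ra` to reload the export table.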
    

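    2. Install Helm

    2.1 Installation

    Download the binary from the Helm Releases page (v2.10.0 is used here). After extracting it, copy the helm executable to /usr/local/bin and set its mode to 755. The Helm client is then installed on this machine.

    2.2 Verification

    Run helm version to check the installation. It reports that it cannot reach the server-side component, Tiller: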

    $ helm version
    Client: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
    Error: could not find tiller
    

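    Note: installing Helm's server-side component requires kubectl, so first make sure kubectl can reach the apiserver of your Kubernetes cluster.

    Then run the initialization from the command line:

    helm init
    

    By default Helm pulls the Tiller image from gcr.io. If the machine cannot reach gcr.io, use the following command instead: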

    $ helm init --upgrade --tiller-image cnych/tiller:v2.10.0
    $HELM_HOME has been configured at /root/.helm.
    
    Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
    
    Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
    To prevent this, run `helm init` with the --tiller-tls-verify flag.
    For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
    Happy Helming!
    

    2.3 Tiller (Helm server) image address

    # Change the Tiller image address
    $ kubectl edit deployment tiller-deploy -n kube-system
    

    Replace the image field:

    ...
        spec:
          automountServiceAccountToken: true
          containers:
          - env:
            - name: TILLER_NAMESPACE
              value: kube-system
            - name: TILLER_HISTORY_MAX
              value: "0"
            #############################################  Tiller image address #################################
            image: registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0
            imagePullPolicy: IfNotPresent
            livenessProbe:
              failureThreshold: 3
              httpGet:
                path: /liveness
                port: 44135
                scheme: HTTP
              initialDelaySeconds: 1
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 1
            name: tiller
    ...
    

    Image address: registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0

    3. Deploy Kubernetes with kubeadm

    3.1 Install kubeadm and kubelet

    Note: when running yum install, take note of the Kubernetes version that gets installed; kubeadm init needs it later.

    $ cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    exclude=kube*
    EOF
    

    Note: check the version number that yum prints during installation. The version passed to kubeadm init must not be lower than the installed Kubernetes version.

    $ yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
    

    Note: to install a specific version, use the following command instead:

    $ yum install kubelet-1.19.2 kubeadm-1.19.2 kubectl-1.19.2 --disableexcludes=kubernetes
    

    3.2 Start the kubelet

    $ systemctl enable kubelet.service && systemctl start kubelet.service
    

    After kubelet.service is started, its status shows that it is not running. The cause is that /var/lib/kubelet/config.yaml does not exist yet. Nothing needs to be done at this point: kubeadm init creates the file.

    Initialize Kubernetes with kubeadm init on k8s-master.

    Note: --kubernetes-version must match the version installed above, otherwise kubeadm reports an error.

    kubeadm init \
    --apiserver-advertise-address=192.168.8.158 \
    --image-repository registry.aliyuncs.com/google_containers \
    --kubernetes-version v1.21.2 \
    --pod-network-cidr=10.244.0.0/16
    

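    --apiserver-advertise-address # IP address of k8s-master

    --image-repository # registry to pull the control-plane images from

    --kubernetes-version # pins the version and skips version detection. The default is stable-1, which makes kubeadm download the latest version number from https://storage.googleapis.com/kubernetes-release/release/stable-1.txt; specifying a version avoids that network request. Once more: it must match the installed Kubernetes version.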
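
Since the version passed to kubeadm init has to match the installed packages, one way to avoid typing it by hand is to read it from the kubeadm binary. The sketch below only prints the command; `kubeadm_init_cmd` and the `KUBEADM_VERSION` override are hypothetical names, and the registry and CIDR defaults are copied from the command above.

```shell
# Print (do not run) a kubeadm init command whose --kubernetes-version is
# taken from the installed kubeadm, so the two cannot drift apart.
# Usage: kubeadm_init_cmd ADVERTISE_ADDRESS [POD_CIDR]
kubeadm_init_cmd() {
  local addr="$1" cidr="${2:-10.244.0.0/16}"
  # KUBEADM_VERSION wins if set; otherwise ask kubeadm (prints e.g. v1.21.2).
  local ver="${KUBEADM_VERSION:-$(kubeadm version -o short)}"
  printf 'kubeadm init --apiserver-advertise-address=%s --image-repository registry.aliyuncs.com/google_containers --kubernetes-version %s --pod-network-cidr=%s\n' \
    "$addr" "$ver" "$cidr"
}
```

Running `kubeadm_init_cmd 192.168.8.158` on the master prints the full command for review before it is executed.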

    The kubeadm init output is shown below. During initialization the file "/var/lib/kubelet/config.yaml" is created automatically. (Worker nodes do not run kubeadm init, so copy this file manually to /var/lib/kubelet/config.yaml on each node.)

    [init] Using Kubernetes version: v1.13.1
    [preflight] Running pre-flight checks
    
    ...
    
    certificates in the cluster
    [bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
    Your Kubernetes master has initialized successfully!
    To start using your cluster, you need to run the following as a regular user:
      # ====== Run the following before using the cluster ======
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    You can now join any number of machines by running the following on each node             
    as root:             
     # ===== How to add a node. If the token has expired, see the troubleshooting section =====
      kubeadm join 10.211.55.6:6443 --token sfaff2.iet15233unw5jzql --discovery-token-ca-cert-hash  sha256:f798c5be53416ca3b5c7475ee0a4199eb26f9e31ee7106699729c0660a70f8d7
    

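    Once initialization succeeds, the output explains the configuration needed before the cluster can be used. It also prints a temporary token together with the command for adding nodes.

    A regular user needs to run the following before using k8s:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    

    As root you can simply run:

    export KUBECONFIG=/etc/kubernetes/admin.conf
    

    Either of the two works. I am using root here, so I run:

    export KUBECONFIG=/etc/kubernetes/admin.conf
    

    Check the kubelet status again: it is now running, so startup succeeded.

    Check the component status and confirm that every component is Healthy:

     kubectl get cs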
    
    NAME                 STATUS    MESSAGE              ERROR                    
    scheduler            Healthy   ok
    controller-manager   Healthy   ok
    etcd-0               Healthy   {"health": "true"}
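
In a script, the same check can be done by filtering the STATUS column of the output instead of reading it by eye. This is a sketch only; `unhealthy_components` is a hypothetical helper that expects the tabular format shown above.

```shell
# Read "kubectl get cs" output on stdin and print the name of every component
# whose STATUS column is not "Healthy". Exits non-zero if any is found.
unhealthy_components() {
  awk 'NR > 1 && NF >= 2 && $2 != "Healthy" { print $1; bad = 1 } END { exit bad }'
}
```

Typical use is `kubectl get cs | unhealthy_components`, which prints nothing and returns 0 when every component is healthy.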
    

    Check the node status:

     kubectl get node
    
    NAME     STATUS     ROLES    AGE   VERSION                                                
    centos   NotReady   master   11m   v1.19.2 
    
    After installing the cluster you may well see the following instead:
    $ kubectl get cs
    NAME                 STATUS      MESSAGE                                                                                     ERROR
    scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
    etcd-0               Healthy     {"health":"true"}
    

    This happens because kube-controller-manager.yaml and kube-scheduler.yaml set --port=0 by default. Commenting that line out in both files fixes it. (Do this on every master node.)

    1. Edit kube-scheduler.yaml

    Comment out - --port=0

    vim /etc/kubernetes/manifests/kube-scheduler.yaml
    
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        component: kube-scheduler
        tier: control-plane
      name: kube-scheduler
      namespace: kube-system
    spec:
      containers:
      - command:
        - kube-scheduler
        - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
        - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
        - --bind-address=127.0.0.1
        - --kubeconfig=/etc/kubernetes/scheduler.conf
        - --leader-elect=true
    #    - --port=0                  ## comment out this line
        image: k8s.gcr.io/kube-scheduler:v1.18.6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            host: 127.0.0.1
            path: /healthz
            port: 10259
            scheme: HTTPS
          initialDelaySeconds: 15
          timeoutSeconds: 15
        name: kube-scheduler
        resources:
          requests:
            cpu: 100m
        volumeMounts:
        - mountPath: /etc/kubernetes/scheduler.conf
          name: kubeconfig
          readOnly: true
      hostNetwork: true
      priorityClassName: system-cluster-critical
      volumes:
      - hostPath:
          path: /etc/kubernetes/scheduler.conf
          type: FileOrCreate
        name: kubeconfig
    status: {}
    
    2. Edit kube-controller-manager.yaml
    vim /etc/kubernetes/manifests/kube-controller-manager.yaml
    
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        component: kube-controller-manager
        tier: control-plane
      name: kube-controller-manager
      namespace: kube-system
    spec:
      containers:
      - command:
        - kube-controller-manager
        - --allocate-node-cidrs=true
        - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --bind-address=127.0.0.1
        - --client-ca-file=/etc/kubernetes/pki/ca.crt
        - --cluster-cidr=10.244.0.0/16
        - --cluster-name=kubernetes
        - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
        - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
        - --controllers=*,bootstrapsigner,tokencleaner
        - --kubeconfig=/etc/kubernetes/controller-manager.conf
        - --leader-elect=true
        - --node-cidr-mask-size=24
    #    - --port=0                    ## comment out this line
        - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
        - --root-ca-file=/etc/kubernetes/pki/ca.crt
        - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
        - --service-cluster-ip-range=10.96.0.0/12
        - --use-service-account-credentials=true
        image: k8s.gcr.io/kube-controller-manager:v1.18.6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            host: 127.0.0.1
            path: /healthz
            port: 10257
            scheme: HTTPS
          initialDelaySeconds: 15
          timeoutSeconds: 15
        name: kube-controller-manager
        resources:
          requests:
            cpu: 200m
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: ca-certs
          readOnly: true
        - mountPath: /etc/pki
          name: etc-pki
          readOnly: true
        - mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
          name: flexvolume-dir
        - mountPath: /etc/kubernetes/pki
          name: k8s-certs
          readOnly: true
        - mountPath: /etc/kubernetes/controller-manager.conf
          name: kubeconfig
          readOnly: true
      hostNetwork: true
      priorityClassName: system-cluster-critical
      volumes:
      - hostPath:
          path: /etc/ssl/certs
          type: DirectoryOrCreate
        name: ca-certs
      - hostPath:
          path: /etc/pki
          type: DirectoryOrCreate
        name: etc-pki
      - hostPath:
          path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
          type: DirectoryOrCreate
        name: flexvolume-dir
      - hostPath:
          path: /etc/kubernetes/pki
          type: DirectoryOrCreate
        name: k8s-certs
      - hostPath:
          path: /etc/kubernetes/controller-manager.conf
          type: FileOrCreate
        name: kubeconfig
    status: {}
    
    3. Restart the kubelet on every master
    $ systemctl restart kubelet.service
    
    4. Check the status again
    $ kubectl get cs
    NAME                 STATUS    MESSAGE             ERROR
    scheduler            Healthy   ok
    controller-manager   Healthy   ok
    etcd-0               Healthy   {"health":"true"}
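
The manual edit of the two manifests can also be scripted. The following is a minimal sketch, assuming GNU sed; `comment_out_port_zero` is a hypothetical helper name. It leaves the comment marker at the original indentation, which is equally valid YAML.

```shell
# Comment out the "- --port=0" argument in a static pod manifest, in place.
# Lines that are already commented are left untouched.
comment_out_port_zero() {
  local manifest="$1"
  sed -i -E 's/^([[:space:]]*)(- --port=0[[:space:]]*)$/\1# \2/' "$manifest"
}
```

On each master this would be called once per file, for example `comment_out_port_zero /etc/kubernetes/manifests/kube-scheduler.yaml` and the same for kube-controller-manager.yaml, followed by the kubelet restart shown above.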
    

    3.3 Install a Pod network (flannel)

    $ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    

    If the URL is unreachable, you will have to fetch the manifest some other way.

    3.4 Install a StorageClass

    $ git clone https://github.com/helm/charts.git
    $ cd charts/
    $ helm install stable/nfs-client-provisioner --set nfs.server=192.168.1.109 --set nfs.path=/data/k8s
    

    Note: if the repository cannot be downloaded, you will have to fetch it some other way.

    3.5 kubectl command completion:

    $ yum install -y bash-completion
    $ source /usr/share/bash-completion/bash_completion
    $ source <(kubectl completion bash)
    $ echo "source <(kubectl completion bash)" >> ~/.bashrc
    

    3.6 Problems encountered during deployment

    3.6.1 Failure to pull the cluster DNS (CoreDNS) image:
    Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.13-0
    
    failed to pull image "registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0": output: Error response from daemon: pull access denied for registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
    
    , error: exit status 1
    
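    Cause:

    Installing Kubernetes v1.21.1 pulls images from k8s.gcr.io, which cannot be reached from mainland China, so the installation fails.

    The workaround is to pull the image from Docker Hub, Docker's default registry, and re-tag it, which avoids k8s.gcr.io altogether.

    Solution:

    Pull the image manually (pinned to the version that will be used as the tag below)

    $ docker pull coredns/coredns:1.8.0
    

    List the images kubeadm needs, to find the name the image must be given

    $ kubeadm config images list --config new.yaml
    

    List the local images

    $ docker images
    

    Tag the image with the expected name

    $ docker tag coredns/coredns:1.8.0 registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0
    

    Remove the now-redundant tag

    $ docker rmi coredns/coredns:1.8.0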
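
The pull, tag, and remove sequence is the same for any image that has to be renamed. A small dry-run sketch that prints the three commands for a given source and target name is shown below; `retag_cmds` is a hypothetical helper, and pinning the source tag to 1.8.0 in the usage example is an assumption made here.

```shell
# Print the docker commands that pull SRC, tag it as DST, and drop the SRC tag.
# Nothing is executed; pipe the output to "sh" once it looks right.
retag_cmds() {
  local src="$1" dst="$2"
  printf 'docker pull %s\n' "$src"
  printf 'docker tag %s %s\n' "$src" "$dst"
  printf 'docker rmi %s\n' "$src"
}
```

For the CoreDNS case above: `retag_cmds coredns/coredns:1.8.0 registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0`.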
    
    3.6.2 kubelet fails to start

    Check the kubelet status:

    systemctl status kubelet.service 
    

    The output looks like this:

    ● kubelet.service - kubelet: The Kubernetes Node Agent
       Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset:disabled)
       Drop-In: /usr/lib/systemd/system/kubelet.service.d 
               └─10-kubeadm.conf
       Active: activating (auto-restart) (Result: exit-code) since 日 2019-03-31 16:18:55 CST;7s ago
         Docs: https://kubernetes.io/docs/
      Process: 4564 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
     Main PID: 4564 (code=exited, status=255)
    3月 31 16:18:55 k8s-node systemd[1]: Unit kubelet.service entered failed state.
    3月 31 16:18:55 k8s-node systemd[1]: kubelet.service failed.
    

    Inspect the error log:

    journalctl -xefu kubelet
    
    3月 31 16:19:46 k8s-node systemd[1]: kubelet.service holdoff time over,scheduling restart.
    3月 31 16:19:46 k8s-node systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    -- Subject: Unit kubelet.service has finished shutting down
    -- Defined-By: systemd                              
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- Unit kubelet.service has finished shutting down.
    3月 31 16:19:46 k8s-node systemd[1]: Started kubelet: The Kubernetes Node Agent
    -- Subject: Unit kubelet.service has finished start-up
    -- Defined-By: systemd
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- Unit kubelet.service has finished starting up.
    -- The start-up result is done.
    ###### Note the following error:
    3月 31 16:19:46 k8s-node kubelet[4611]: F0331 16:19:46.989588 4611 server.go:193] failed to load Kubelet config file       /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
    3月 31 16:19:46 k8s-node systemd[1]: kubelet.service: main process exited,code=exited,status=255/n/a        
    #######
    3月 31 16:19:46 k8s-node systemd[1]: Unit kubelet.service entered failed state.
    3月 31 16:19:46 k8s-node systemd[1]: kubelet.service failed.
    

    The error says that /var/lib/kubelet/config.yaml does not exist. Run the initialization described in section 3.2.
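
Before digging through journalctl output, a quick check for the file itself can save time. This is a minimal sketch; `kubelet_config_hint` is a hypothetical helper, and the hint text relies on the fact that both kubeadm init and kubeadm join write this file.

```shell
# Report whether the kubelet config file exists and is non-empty.
# Usage: kubelet_config_hint [PATH]   (PATH defaults to /var/lib/kubelet/config.yaml)
kubelet_config_hint() {
  local cfg="${1:-/var/lib/kubelet/config.yaml}"
  if [ -s "$cfg" ]; then
    echo "ok: $cfg exists"
  else
    echo "missing: $cfg (run kubeadm init on the master or kubeadm join on a node)"
    return 1
  fi
}
```

Run without arguments on the affected host, it checks the default path and returns non-zero when the file is absent.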

    III. Install Rancher

    Note: on cloud servers, expose Rancher through NodePort. Once the svc has been created, change its type to NodePort.

    1. Create the namespace

    $ kubectl create namespace cattle-system
    

    2. Install cert-manager

    # Install the CustomResourceDefinition resources
    kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.0/cert-manager.crds.yaml
    # Important: if you are running Kubernetes v1.15 or earlier,
    # the kubectl apply command above needs the `--validate=false` flag.
    # Otherwise you will get a validation error about the
    # x-kubernetes-preserve-unknown-fields field in cert-manager's CustomResourceDefinition resources.
    # The error is benign and is caused by the way kubectl validates resources.
    
    # Create the namespace for cert-manager
    
    kubectl create namespace cert-manager
    
    # Add the Jetstack Helm repository
    
    helm repo add jetstack https://charts.jetstack.io
    
    # Update the local Helm chart repository cache
    
    helm repo update
    
    # Install the cert-manager Helm chart
    
    helm install \
       cert-manager jetstack/cert-manager \
       --namespace cert-manager \
       --version v0.15.0
    

    3. Install Rancher

    # Add the Rancher chart repository
    $ helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
    # Install the latest Rancher release
    $ helm install rancher rancher-stable/rancher --namespace cattle-system
    
  • Original article: https://www.cnblogs.com/lanheader/p/14153822.html