  • k8s: Building a k8s cluster with kubeadm on Alibaba Cloud servers

    These notes summarize the Kubernetes (k8s) video course by the Bilibili uploader "尚硅谷": https://www.bilibili.com/video/BV1w4411y7Go

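    I am new to k8s myself, so I start with a simple cluster of one master and two workers. Three servers cannot show what k8s is really capable of, but they are enough for a first attempt; I will try more servers once I am comfortable with the basics.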

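    Purchasing Alibaba Cloud servers

    Buy three Alibaba Cloud servers, all on pay-as-you-go billing, so that no fees are charged while they are stopped according to Alibaba Cloud's documented procedure (stopping and restarting does not affect the cluster that has already been built).

    The purchase link is:

    https://ecs-buy.aliyun.com/wizard?spm=5176.ecssimplebuy.header.1.15fd36751sf2fA#/prepay/cn-shanghai

    If you have bought Alibaba Cloud servers before, you can also click "Create Instance" in the console to reach the same page.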

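    Choosing the servers

    • Select pay-as-you-go billing
    • Select a region close to you
    • I chose the burstable t6 instance with 2 vCPUs and 2 GB of memory (ecs.t6-c1m1.large); this is a trial, and I am not sure the configuration is sufficient
    • Quantity: 3
    • System image: 64-bit CentOS 8.0 (Docker requires CentOS 7.0 or later)
    • Disk: 40 GB

    The total cost is ¥0.413 per hour.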

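    Network and security group

    • Keep the default network
    • For bandwidth billing choose "pay by traffic"; the peak can be set freely (for example 40M), since you pay only for what you use
    • Leave everything else at the defaults

    System configuration

    • For login credentials choose "custom password", so that root has the same password on all servers, which is easier to manage
    • Leave everything else at the defaults

    Grouping

    • Defaults are fine

    Confirm the configuration and create the order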

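    Initial server setup (on all three servers)

    To make management easier, rename the instances to k8s-master01-225, k8s-node01-228, and k8s-node02-229 (225/228/229 are the last three digits of each private IP; use whatever naming rule you like).

    Connect to the three servers with Xshell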

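    Check that the three servers can ping each other over the private network. From here on the private network is used instead of the public one, because public traffic costs money.

    Set the hostnames

    # on the k8s-master01-225 machine
    hostnamectl set-hostname k8s-master01-225
    # on the k8s-node01-228 machine
    hostnamectl set-hostname k8s-node01-228
    # on the k8s-node02-229 machine
    hostnamectl set-hostname k8s-node02-229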
    

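    Set up the /etc/hosts file

    A real cluster would bind IPs to names through its own DNS server. To keep things simple, the hosts file is used here to map IPs to hostnames: add the same three lines to /etc/hosts on all three servers.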

    172.19.199.225 k8s-master01-225
    172.19.188.228 k8s-node01-228
    172.19.188.229 k8s-node02-229
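
    Because the same three lines go onto every server, it can help to add them idempotently so that re-running the setup does not create duplicates. The following is a minimal sketch; add_host_entry is a hypothetical helper name, not part of any tool used in this post.

```shell
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless a non-comment line already maps NAME.
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # Look for NAME as a whole field (column 2 or later) on a non-comment line
  if awk -v n="$name" '
       !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit !found }
     ' "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Hypothetical usage on each server (as root):
# add_host_entry /etc/hosts 172.19.199.225 k8s-master01-225
# add_host_entry /etc/hosts 172.19.188.228 k8s-node01-228
# add_host_entry /etc/hosts 172.19.188.229 k8s-node02-229
```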
    

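    Xshell has a handy feature that sends one command to several terminals at once: right-click in one terminal and choose "Send Key Input To All Sessions". That way you do not have to run each command server by server. When a command should run on only one server, remember to turn this setting off first.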

    Install dependency packages

    yum install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git
    

    Disable the firewall

    systemctl stop firewalld && systemctl disable firewalld
    

    Install iptables-services and flush its rules

    yum -y install iptables-services && systemctl start iptables && systemctl enable iptables && iptables -F && service iptables save
    

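    Disable the swap partition

    If swap stays enabled, pod containers may end up running in swap (virtual memory), which hurts performance. By default the kubelet also refuses to start while swap is on.

    swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab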
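
    The fstab edit can also be written so that it is safe to run more than once and can be tried on a copy first. This is only a sketch under the assumption that swap entries have "swap" in the third field; comment_out_swap is a hypothetical helper name.

```shell
# comment_out_swap FILE
# Prefixes every active (not yet commented) line whose third field is "swap" with '#'.
comment_out_swap() {
  local file="$1" tmp status
  tmp="$(mktemp)" || return 1
  awk '!/^[ \t]*#/ && $3 == "swap" { print "#" $0; next } { print }' "$file" > "$tmp" \
    && cat "$tmp" > "$file"   # write back in place, keeping the file's owner and mode
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Hypothetical usage: try it on a copy before touching the real file
# cp /etc/fstab /tmp/fstab.test && comment_out_swap /tmp/fstab.test && diff /etc/fstab /tmp/fstab.test
```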
    

    Disable SELinux

    setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
    

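    Tune kernel parameters for K8S

    Write the configuration file

    cat > kubernetes.conf <<EOF
    # Pass bridged traffic through iptables and ip6tables
    net.bridge.bridge-nf-call-iptables=1
    net.bridge.bridge-nf-call-ip6tables=1
    net.ipv4.ip_forward=1
    # tcp_tw_recycle was removed in kernel 4.12; enable this line only on older kernels (e.g. CentOS 7)
    # net.ipv4.tcp_tw_recycle=0
    # Avoid swap; allow it only when the system is about to run out of memory
    vm.swappiness=0
    # Do not check whether enough physical memory is available
    vm.overcommit_memory=1
    # Do not panic on OOM; let the OOM killer run
    vm.panic_on_oom=0
    fs.inotify.max_user_instances=8192
    fs.inotify.max_user_watches=1048576
    fs.file-max=52706963
    fs.nr_open=52706963
    # Disable IPv6
    net.ipv6.conf.all.disable_ipv6=1
    net.netfilter.nf_conntrack_max=2310720
    EOF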
    

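    Apply the configuration file

    # The net.bridge.* keys exist only after the br_netfilter module is loaded
    modprobe br_netfilter
    cp kubernetes.conf  /etc/sysctl.d/kubernetes.conf
    sysctl -p /etc/sysctl.d/kubernetes.conf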
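
    sysctl -p reports an error for every key that the running kernel does not expose, for example because a module is not loaded yet or the key no longer exists in that kernel version. A small pre-check can list such keys up front. This is a sketch; sysctl_key_path and list_missing_keys are hypothetical helper names, and the dot-to-slash mapping is a simplification that ignores keys containing dots in interface names.

```shell
# sysctl_key_path KEY -> prints the /proc/sys path for a dotted sysctl key
sysctl_key_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# list_missing_keys FILE -> prints the keys in FILE that have no /proc/sys entry
list_missing_keys() {
  # Strip comments (whole-line and trailing), drop blank lines, split on '='
  sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d' "$1" |
    while IFS='=' read -r key rest; do
      key="$(printf '%s' "$key" | tr -d ' \t')"
      [ -e "$(sysctl_key_path "$key")" ] || printf '%s\n' "$key"
    done
}

# Hypothetical usage:
# list_missing_keys kubernetes.conf
```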
    

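    Adjust the system time zone (skip this if the time zone is already correct)

    # Set the system time zone to Asia/Shanghai
    timedatectl set-timezone Asia/Shanghai
    # Keep the hardware clock in UTC
    timedatectl set-local-rtc 0
    # Restart the services that depend on the system time
    systemctl restart rsyslog
    systemctl restart crond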
    

    Stop services the system does not need (if present)

    systemctl stop postfix && systemctl disable postfix
    

    Set up the logging system

    Use systemd journald as the logging system instead of rsyslogd

    Create the log directories

    mkdir -p /var/log/journal # directory where logs are persisted
    mkdir -p /etc/systemd/journald.conf.d
    

    Write the configuration file

    cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
    [Journal]
    # Persist logs to disk
    Storage=persistent
    
    # Compress archived logs
    Compress=yes
    
    SyncIntervalSec=5m
    RateLimitInterval=30s
    RateLimitBurst=1000
    
    # Use at most 10G of disk space
    SystemMaxUse=10G
    
    # Limit a single log file to 200M
    SystemMaxFileSize=200M
    
    # Keep logs for 2 weeks
    MaxRetentionSec=2week
    
    # Do not forward logs to syslog
    ForwardToSyslog=no
    EOF
    

    Restart the logging service

    systemctl restart systemd-journald
    

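    Prerequisites for enabling ipvs in kube-proxy

    # Load the br_netfilter module
    modprobe br_netfilter
    
    # Write the module-loading script
    # Note: on newer kernels nf_conntrack_ipv4 has been merged into nf_conntrack; if loading it fails, use nf_conntrack here and in the grep below
    cat > /etc/sysconfig/modules/ipvs.modules <<EOF
    #!/bin/bash
    modprobe -- ip_vs
    modprobe -- ip_vs_rr
    modprobe -- ip_vs_wrr
    modprobe -- ip_vs_sh
    modprobe -- nf_conntrack_ipv4
    EOF
    
    # Make the script executable, run it, and check that the modules are loaded
    chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4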
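
    Since the name of the conntrack module depends on the kernel, the module script could pick whichever candidate is actually available. The sketch below assumes modinfo is installed; pick_first_available and module_exists are hypothetical helper names.

```shell
# pick_first_available CHECKER NAME...
# Prints the first NAME for which the CHECKER command succeeds; fails if none does.
pick_first_available() {
  local checker="$1" name
  shift
  for name in "$@"; do
    if "$checker" "$name"; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Checker for real use: does the running kernel ship this module?
module_exists() { modinfo "$1" >/dev/null 2>&1; }

# Hypothetical usage (as root):
# modprobe -- "$(pick_first_available module_exists nf_conntrack_ipv4 nf_conntrack)"
```

    Passing the checker as an argument keeps the selection logic separate from modinfo, so it can be tried without root privileges.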
    

    Install Docker

    # Install dependencies
    yum install -y yum-utils device-mapper-persistent-data lvm2
    
    # Add the Alibaba Cloud repository
    yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
    
    # Install containerd.io manually from the el7 package
    dnf install https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
    
    # Install Docker
    yum update -y && yum install -y docker-ce
    
    # Check the Docker version (confirms the installation)
    docker --version
    
    # Create the /etc/docker directory
    mkdir -p /etc/docker
    
    # Configure daemon.json (JSON does not allow comments, so the note stays outside the file)
    # Replace https:xxxxx with your accelerator address: in the Alibaba Cloud console open "Container Registry", then "Image Accelerator" in the sidebar
    cat > /etc/docker/daemon.json <<EOF
    {
      "registry-mirrors": ["https:xxxxx"],
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      }
    }
    EOF
    
    # Create the systemd drop-in directory
    mkdir -p /etc/systemd/system/docker.service.d
    
    # Restart Docker
    systemctl daemon-reload && systemctl restart docker && systemctl enable docker
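
    Docker does not start with a malformed daemon.json, so it is worth checking the file before the restart. A minimal sketch, assuming python3 or jq is available (jq was installed with the dependency packages earlier); json_is_valid is a hypothetical helper name.

```shell
# json_is_valid FILE -> exit status 0 if FILE parses as JSON, non-zero otherwise
json_is_valid() {
  if command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$1" >/dev/null 2>&1
  elif command -v jq >/dev/null 2>&1; then
    jq empty "$1" >/dev/null 2>&1
  else
    echo "no JSON checker (python3 or jq) found" >&2
    return 2
  fi
}

# Hypothetical usage before restarting Docker:
# json_is_valid /etc/docker/daemon.json && systemctl restart docker
```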
    

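    Install kubeadm (master and worker setup)

    Install kubeadm (all three servers)

    # Add the Alibaba Cloud repository
    # Note: the two gpgkey URLs go on one line, separated by a space
    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=0
    repo_gpgcheck=0
    gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
    
    # Install kubelet, kubeadm, and kubectl
    # (this installs the newest packages in the repository; to match the v1.18.6 used below, pin them, e.g. kubelet-1.18.6 kubeadm-1.18.6 kubectl-1.18.6)
    yum install -y kubelet kubeadm kubectl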
    systemctl enable --now kubelet
    

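    Download the required images (all three servers)

    Normally you could run init right away; init also downloads the required component images from k8s.gcr.io. That registry cannot be reached directly from mainland China, so the images have to be fetched in advance some other way. The approach used here is to download them from a different registry.

    # List the images that are needed
    kubeadm config images list
    # Output: these are all required K8S components, but because the registry is blocked they cannot be fetched with a plain docker pull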
    k8s.gcr.io/kube-apiserver:v1.18.6
    k8s.gcr.io/kube-controller-manager:v1.18.6
    k8s.gcr.io/kube-scheduler:v1.18.6
    k8s.gcr.io/kube-proxy:v1.18.6
    k8s.gcr.io/pause:3.2
    k8s.gcr.io/etcd:3.4.3-0
    k8s.gcr.io/coredns:1.6.7
    # Pulling directly fails with a timeout
    [ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.18.5: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    

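    After some searching on Baidu, I found that the second method in this blog post works for me, so I borrow it here

    https://blog.csdn.net/weixin_43168190/article/details/107227626

    The idea is to download the images from the gotok8s repository and then re-tag them with the names kubeadm expects. The blog author's script automates the whole process.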

    # Create the pull script
    vim pull_k8s_images.sh
    
    # with the following content
    #!/bin/bash
    set -o errexit
    set -o nounset
    set -o pipefail
    
    ## Versions to download
    KUBE_VERSION=v1.18.6
    KUBE_PAUSE_VERSION=3.2
    ETCD_VERSION=3.4.3-0
    DNS_VERSION=1.6.7
    
    ## The original registry, which is blocked
    GCR_URL=k8s.gcr.io
    
    ## The repository to pull from instead; gotok8s can be left unchanged
    DOCKERHUB_URL=gotok8s
    
    ## Image list
    images=(
    kube-proxy:${KUBE_VERSION}
    kube-scheduler:${KUBE_VERSION}
    kube-controller-manager:${KUBE_VERSION}
    kube-apiserver:${KUBE_VERSION}
    pause:${KUBE_PAUSE_VERSION}
    etcd:${ETCD_VERSION}
    coredns:${DNS_VERSION}
    )
    
    ## Pull-and-rename loop: download, re-tag to the name kubeadm expects, then remove the downloaded tag
    for imageName in "${images[@]}" ; do
      docker pull "$DOCKERHUB_URL/$imageName"
      docker tag "$DOCKERHUB_URL/$imageName" "$GCR_URL/$imageName"
      docker rmi "$DOCKERHUB_URL/$imageName"
    done
    
    # Make the script executable
    chmod +x ./pull_k8s_images.sh
    
    # Run the script
    ./pull_k8s_images.sh
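
    Before running the script for real, a dry run that only prints the docker commands makes it easy to check the image names and versions. This is a sketch; print_retag_plan is a hypothetical helper that runs nothing itself.

```shell
# print_retag_plan SRC_REPO DST_REPO IMAGE...
# Prints the pull / tag / rmi commands for each image without executing them.
print_retag_plan() {
  local src="$1" dst="$2" image
  shift 2
  for image in "$@"; do
    printf 'docker pull %s/%s\n' "$src" "$image"
    printf 'docker tag %s/%s %s/%s\n' "$src" "$image" "$dst" "$image"
    printf 'docker rmi %s/%s\n' "$src" "$image"
  done
}

print_retag_plan gotok8s k8s.gcr.io kube-proxy:v1.18.6 pause:3.2
```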
    
    # Check the downloaded images
    [root@k8s-master01-225 ~]# docker images
    REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
    k8s.gcr.io/kube-proxy                v1.18.6             c3d62d6fe412        2 weeks ago         117MB
    k8s.gcr.io/kube-controller-manager   v1.18.6             ffce5e64d915        2 weeks ago         162MB
    k8s.gcr.io/kube-apiserver            v1.18.6             56acd67ea15a        2 weeks ago         173MB
    k8s.gcr.io/kube-scheduler            v1.18.6             0e0972b2b5d1        2 weeks ago         95.3MB
    k8s.gcr.io/pause                     3.2                 80d28bedfe5d        5 months ago        683kB
    k8s.gcr.io/coredns                   1.6.7               67da37a9a360        6 months ago        43.8MB
    gotok8s/kube-controller-manager      v1.17.0             5eb3b7486872        7 months ago        161MB
    k8s.gcr.io/etcd                      3.4.3-0             303ce5db0e90        9 months ago        288MB
    

    Initialize the master node (only on the master server)

    Generate the init configuration file

    kubeadm config print init-defaults > kubeadm-config.yaml
    

    Edit the init configuration file

    # Open the file
    vim kubeadm-config.yaml
    
    # The changes are marked below
    apiVersion: kubeadm.k8s.io/v1beta2
    bootstrapTokens:
    - groups:
      - system:bootstrappers:kubeadm:default-node-token
      token: abcdef.0123456789abcdef
      ttl: 24h0m0s
      usages:
      - signing
      - authentication
    kind: InitConfiguration
    localAPIEndpoint:
      advertiseAddress: 172.19.199.225 # 1. Change the IP address; the private IP is fine
      bindPort: 6443
    nodeRegistration:
      criSocket: /var/run/dockershim.sock
      name: k8s-master01-225
      taints:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
    ---
    apiServer:
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.18.6  # 2. Change the version to match the images pulled earlier (kubeadm version also shows it)
    networking:
      dnsDomain: cluster.local
      podSubnet: "10.244.0.0/16" # 3. Add the pod subnet; keep this value, which matches flannel's default network
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
    # 4. Append the section below as is
    ---
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    featureGates:
      SupportIPVSProxyMode: true
    mode: ipvs
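
    Edits 1 and 2 above are plain value replacements, so they can be scripted instead of made by hand in vim. The following is a sketch only: patch_kubeadm_config is a hypothetical helper, and it does not add the podSubnet line or the KubeProxyConfiguration section, which still have to be inserted manually.

```shell
# patch_kubeadm_config FILE ADVERTISE_IP K8S_VERSION
# Rewrites the advertiseAddress and kubernetesVersion values in FILE.
patch_kubeadm_config() {
  local file="$1" ip="$2" version="$3" tmp status
  tmp="$(mktemp)" || return 1
  sed -E \
    -e "s|^([[:space:]]*advertiseAddress:).*|\1 ${ip}|" \
    -e "s|^(kubernetesVersion:).*|\1 ${version}|" \
    "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Hypothetical usage:
# patch_kubeadm_config kubeadm-config.yaml 172.19.199.225 v1.18.6
```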
    

    Run the init command

    kubeadm init --config=kubeadm-config.yaml | tee kubeadm-init.log
    
    # Output of a successful run
    ....
    Your Kubernetes control-plane has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    
    Then you can join any number of worker nodes by running the following on each as root:
    
    kubeadm join 172.19.199.225:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:873f80617875dc39a23eced3464c7069689236d460b60692586e7898bf8a254a
    

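    If init fails

    Troubleshoot from the error message. Most often the cause is a mistake in kubeadm-config.yaml, such as a mismatched version number, an unchanged IP address, or stray whitespace.

    After fixing the file, running init again directly may still fail with errors such as ports already in use or files that already exist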

    [root@k8s-node01-228 ~]# kubeadm init --config=kubeadm-config.yaml | tee kubeadm-init.log
    W0801 18:35:22.768809   44882 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
    [init] Using Kubernetes version: v1.18.6
    [preflight] Running pre-flight checks
    	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    	[WARNING FileExisting-tc]: tc not found in system path
    error execution phase preflight: [preflight] Some fatal errors occurred:
    	[ERROR Port-10259]: Port 10259 is in use
    	[ERROR Port-10257]: Port 10257 is in use
    	[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
    	[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
    	[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
    	[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
    	[ERROR Port-10250]: Port 10250 is in use
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
    To see the stack trace of this error execute with --v=5 or higher
    

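    The likely cause is that the earlier init got partway through and did not roll back after failing. In that case run kubeadm reset first to return to the state before init.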

    [root@k8s-node01-228 ~]# kubeadm reset
    [reset] Reading configuration from the cluster...
    [reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    W0801 18:57:02.630170   52554 reset.go:99] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get https://172.19.188.226:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s: context deadline exceeded
    [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
    [reset] Are you sure you want to proceed? [y/N]: y
    [preflight] Running pre-flight checks
    W0801 18:57:07.534409   52554 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
    [reset] Stopping the kubelet service
    [reset] Unmounting mounted directories in "/var/lib/kubelet"
    [reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
    [reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
    [reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
    
    The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
    
    The reset process does not reset or clean up iptables rules or IPVS tables.
    If you wish to reset iptables, you must do so manually by using the "iptables" command.
    
    If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
    to reset your system's IPVS tables.
    
    The reset process does not clean your kubeconfig files and you must remove them manually.
    Please, check the contents of the $HOME/.kube/config file.
    

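    After the reset, run the init command above again, repeating until init succeeds

    After init succeeds

    Look at the end of the output or at the log file kubeadm-init.log; it says to run the following steps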

    mkdir -p $HOME/.kube
    cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    chown $(id -u):$(id -g) $HOME/.kube/config
    

    Check the current nodes: the status is NotReady

    [root@k8s-master01-225 ~]# kubectl get node
    NAME               STATUS     ROLES    AGE   VERSION
    k8s-master01-225   NotReady   master   40m   v1.18.6
    

    Deploy the flannel network (on the master server)

    First tidy up the current directory

    # Create a directory for the installation files
    [root@k8s-master01-225 ~]# mkdir -p install-k8s/core
    
    # Move the main files into it
    [root@k8s-master01-225 ~]# mv kubeadm-init.log kubeadm-config.yaml install-k8s/core
    
    # Create the flannel directory
    [root@k8s-master01-225 ~]# cd install-k8s
    [root@k8s-master01-225 install-k8s]# mkdir plugin
    [root@k8s-master01-225 install-k8s]# cd plugin/
    [root@k8s-master01-225 plugin]# mkdir flannel
    [root@k8s-master01-225 plugin]# cd flannel/
    
    # Download kube-flannel.yml
    [root@k8s-master01-225 flannel]# wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    # Output of the download command
    --2020-08-01 19:23:44--  https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.108.133
    Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.108.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 14366 (14K) [text/plain]
    Saving to: ‘kube-flannel.yml’
    kube-flannel.yml              100%[================================================>]  14.03K  --.-KB/s    in 0.05s   
    2020-08-01 19:23:44 (286 KB/s) - ‘kube-flannel.yml’ saved [14366/14366]
    
    # Create the flannel resources
    [root@k8s-master01-225 flannel]# kubectl create -f kube-flannel.yml
    # Output of the create command
    podsecuritypolicy.policy/psp.flannel.unprivileged created
    clusterrole.rbac.authorization.k8s.io/flannel created
    clusterrolebinding.rbac.authorization.k8s.io/flannel created
    serviceaccount/flannel created
    configmap/kube-flannel-cfg created
    daemonset.apps/kube-flannel-ds-amd64 created
    daemonset.apps/kube-flannel-ds-arm64 created
    daemonset.apps/kube-flannel-ds-arm created
    daemonset.apps/kube-flannel-ds-ppc64le created
    daemonset.apps/kube-flannel-ds-s390x created
    
    # List the pods: the flannel component is now running. System components are installed in the kube-system namespace by default
    [root@k8s-master01-225 flannel]# kubectl get pod -n kube-system
    NAME                                       READY   STATUS    RESTARTS   AGE
    coredns-66bff467f8-tlqdw                   1/1     Running   0          18m
    coredns-66bff467f8-zpg4q                   1/1     Running   0          18m
    etcd-k8s-master01-225                      1/1     Running   0          18m
    kube-apiserver-k8s-master01-225            1/1     Running   0          18m
    kube-controller-manager-k8s-master01-225   1/1     Running   0          18m
    kube-flannel-ds-amd64-5hpff                1/1     Running   0          32s
    kube-proxy-xh6wh                           1/1     Running   0          18m
    kube-scheduler-k8s-master01-225            1/1     Running   0          18m
    
    # Check the node again: the status has changed to Ready
    [root@k8s-master01-225 flannel]# kubectl get node
    NAME               STATUS   ROLES    AGE   VERSION
    k8s-master01-225   Ready    master   19m   v1.18.6
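
    It can take a little while after flannel is created before the node turns Ready. Instead of re-running kubectl get node by hand, the output can be polled. A minimal sketch; count_not_ready is a hypothetical helper that treats any STATUS other than exactly "Ready" as not ready.

```shell
# count_not_ready: reads `kubectl get node` output on stdin and
# prints how many nodes have a STATUS other than exactly "Ready".
count_not_ready() {
  awk 'NR > 1 && NF >= 2 && $2 != "Ready" { n++ } END { print n + 0 }'
}

# Hypothetical polling loop on the master:
# until [ "$(kubectl get node | count_not_ready)" = "0" ]; do sleep 5; done
```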
    

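    Join the worker nodes to the master (run on the worker servers)

    The log of the init command on the master also contains the join command for worker nodes; run it on both worker servers, using the token and hash from your own init output. If the token has expired (the ttl configured above is 24h), running kubeadm token create --print-join-command on the master prints a fresh join command.

    kubeadm join 172.19.199.225:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:23816230102e09bf09766f14896828f7b377d0b3aa44e619342cbdf47ccd37b5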
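
    The join command can also be pulled out of the saved kubeadm-init.log rather than copied from the terminal. A sketch, assuming the log contains the two-line "kubeadm join ... --discovery-token-ca-cert-hash ..." block shown above; extract_join_command is a hypothetical helper name.

```shell
# extract_join_command LOGFILE
# Prints the kubeadm join command from LOGFILE as a single line.
extract_join_command() {
  awk '
    /^[ \t]*kubeadm join/ { grab = 1 }
    grab {
      line = $0
      sub(/[ \t]*\\[ \t]*$/, "", line)   # drop a trailing line-continuation backslash
      sub(/^[ \t]+/, "", line)           # trim leading whitespace
      sub(/[ \t]+$/, "", line)           # trim trailing whitespace
      cmd = (cmd == "" ? line : cmd " " line)
      if (line ~ /--discovery-token-ca-cert-hash/) { print cmd; exit }
    }
  ' "$1"
}

# Hypothetical usage on the master:
# extract_join_command install-k8s/core/kubeadm-init.log
```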
    

    After a short wait, the join succeeds with output like this:

    W0801 19:27:06.500319   12557 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
    [preflight] Running pre-flight checks
    	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
    	[WARNING FileExisting-tc]: tc not found in system path
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    [kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Starting the kubelet
    [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
    
    This node has joined the cluster:
    * Certificate signing request was sent to apiserver and a response was received.
    * The Kubelet was informed of the new secure connection details.
    
    Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
    

    On the master server, the worker nodes now show as Ready

    [root@k8s-master01-225 flannel]# kubectl get node
    NAME               STATUS   ROLES    AGE   VERSION
    k8s-master01-225   Ready    master   20m   v1.18.6
    k8s-node01-228     Ready    <none>   34s   v1.18.6
    k8s-node02-229     Ready    <none>   29s   v1.18.6
    

    Running kubectl get node on a worker node, however, reports an error:

    (root@k8s-node02-229:~)# kubectl get node
    The connection to the server localhost:8080 was refused - did you specify the right host or port?
    

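    After searching on Baidu, it turns out that repeating the kubeconfig steps from the init success log on the worker nodes fixes this. Note that admin.conf carries cluster administrator credentials, so copy it only to machines you trust.

    # On each worker node, create the .kube directory
    (root@k8s-node02-229:~)# mkdir -p $HOME/.kube
    # On the master node, copy admin.conf to each worker node
    scp /etc/kubernetes/admin.conf root@k8s-node01-228:$HOME/.kube/config
    scp /etc/kubernetes/admin.conf root@k8s-node02-229:$HOME/.kube/config
    # Fix the ownership
    (root@k8s-node02-229:~)# chown $(id -u):$(id -g) $HOME/.kube/config
    # Run the test again: the error is gone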
    (root@k8s-node02-229:~)# kubectl get node
    NAME               STATUS   ROLES    AGE   VERSION
    k8s-master01-225   Ready    master   37h   v1.18.6
    k8s-node01-228     Ready    <none>   36h   v1.18.6
    k8s-node02-229     Ready    <none>   36h   v1.18.6
    

    Fixing pod IPs that cannot be pinged

    With the cluster installed, start a pod

    # Start a pod named nginx-offi that runs a container from the official Nginx image
    (root@k8s-master01-225:~)# kubectl run nginx-offi --image=nginx
    pod/nginx-offi created
    # Show the pod details: the status is "Running", the IP is "10.244.1.7", and it runs on node "k8s-node01-228"
    (root@k8s-master01-225:~)# kubectl get pod -o wide
    NAME                        READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
    nginx-offi                  1/1     Running   0          55s   10.244.1.7   k8s-node01-228   <none>           <none>
    

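    However, accessing this pod from the master k8s-master01-225 or from the other worker k8s-node02-229 fails, and pinging 10.244.1.7 fails too, even though flannel is already installed.

    Searching on Baidu suggests that the cause is the iptables rules. We flushed the iptables rules during the initial server setup, but new rules have appeared since then; I am not sure whether installing flannel or some other step added them.

    # Show the iptables rules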
    (root@k8s-master01-225:~)# iptables -L -n
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination         
    KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           
    
    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination         
    KUBE-FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */
    DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
    DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
    
    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination         
    KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0           
    
    Chain DOCKER (1 references)
    target     prot opt source               destination         
    
    Chain DOCKER-ISOLATION-STAGE-1 (1 references)
    target     prot opt source               destination         
    DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
    RETURN     all  --  0.0.0.0/0            0.0.0.0/0           
    
    Chain DOCKER-ISOLATION-STAGE-2 (1 references)
    target     prot opt source               destination         
    DROP       all  --  0.0.0.0/0            0.0.0.0/0           
    RETURN     all  --  0.0.0.0/0            0.0.0.0/0           
    
    Chain DOCKER-USER (1 references)
    target     prot opt source               destination         
    RETURN     all  --  0.0.0.0/0            0.0.0.0/0           
    
    Chain KUBE-FIREWALL (2 references)
    target     prot opt source               destination         
    DROP       all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
    DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT
    
    Chain KUBE-KUBELET-CANARY (0 references)
    target     prot opt source               destination         
    
    Chain KUBE-FORWARD (1 references)
    target     prot opt source               destination         
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x4000/0x4000
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
    # Warning: iptables-legacy tables present, use iptables-legacy to see them
    

    We need to flush the iptables rules again

    iptables -F &&  iptables -X &&  iptables -F -t nat &&  iptables -X -t nat
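
    Flushing everything also removes the chains that Docker and Kubernetes created, so before doing that it may be useful to see which rules actually drop or reject traffic. The following is a sketch of such a check, not a replacement for the flush above; list_blocking_rules is a hypothetical helper that only filters text.

```shell
# list_blocking_rules: reads `iptables -S` (or iptables-save) output on stdin
# and prints only the rules whose target is DROP or REJECT.
list_blocking_rules() {
  grep -E -- '-j (DROP|REJECT)( |$)' || true
}

# Hypothetical usage (as root):
# iptables -S | list_blocking_rules
```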
    

    Check iptables again

    (root@k8s-master01-225:~)# iptables -L -n
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination         
    
    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination         
    KUBE-FORWARD  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */
    
    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination         
    
    Chain KUBE-FORWARD (1 references)
    target     prot opt source               destination         
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x4000/0x4000
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
    # Warning: iptables-legacy tables present, use iptables-legacy to see them
    

    Ping or access the pod again, and it now works

    (root@k8s-master01-225:~)# curl 10.244.1.7
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
        body {
            width: 35em;
            margin: 0 auto;
            font-family: Tahoma, Verdana, Arial, sans-serif;
        }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>
    

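    Installing the private registry Harbor

    Harbor is an enterprise-grade registry server for storing and distributing Docker images; it can be used to build an internal Docker image registry.

    Harbor extends Docker Registry with enterprise features, which has led to its wide adoption. The additions include:

    • A management user interface
    • Role-based access control
    • AD/LDAP integration
    • Audit logs, and more

    Compared with the plain Docker Registry, it makes managing images at enterprise scale more convenient, and a registry on the internal network also gives high transfer speeds.

    Prerequisites

    • Python 2.7 or later

    • Docker Engine 1.10 or later

    • Docker Compose 1.6.0 or later

    Install Docker Compose

    Official installation guide: https://docs.docker.com/compose/install/

    Download the release binary (1.26.2, the latest at the time of writing) to /usr/local/bin/docker-compose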

    sudo curl -L "https://github.com/docker/compose/releases/download/1.26.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    

    Make it executable

    sudo chmod +x /usr/local/bin/docker-compose
    

    Create a symlink

    sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
    

    Check the installation

    # docker-compose --version
    docker-compose version 1.26.2, build eefe0d31
    

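    Download Harbor

    Official download page: https://github.com/vmware/harbor/releases

    • Pick the latest release (at the time of writing): v1.10.4

    • Download the offline installer of about 600 MB, which makes the later installation easier: harbor-offline-installer-v1.10.4.tgz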

    wget https://github.com/goharbor/harbor/releases/download/v1.10.4/harbor-offline-installer-v1.10.4.tgz
    

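    Extract it to a directory of your choice; /usr/local is used here

    tar xvf harbor-offline-installer-v1.10.4.tgz -C /usr/local/
    # Rename the directory and create a symlink (recommended; a common practice that simplifies later upgrades)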
    cd /usr/local/
    (root@Aliyun-Alex:/usr/local)# mv harbor/ harbor-v1.10.4
    (root@Aliyun-Alex:/usr/local)# ln -s /usr/local/harbor-v1.10.4/ /usr/local/harbor
    (root@Aliyun-Alex:/usr/local)# cd harbor
    (root@Aliyun-Alex:/usr/local/harbor)# ls
    common.sh  harbor.v1.10.4.tar.gz  harbor.yml  install.sh  LICENSE  prepare
    

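    Edit the installation configuration file harbor.yml

    # vim harbor.yml
    # 1. Set the hostname (an IP or a domain name) used to reach the management UI and the registry service
    # I use an arbitrary domain name, alex.gcx.com, and add a matching entry to the hosts file of my local Windows 10 machine: <Alibaba Cloud public IP> alex.gcx.com
    # The hosts file acts like a local DNS: when the domain is entered in the browser, it is resolved to the mapped IP address
    hostname: alex.gcx.com
    
    # 2. Harbor can be accessed over http or https. Older versions defaulted to http; current versions default to https.
    # http: as the upstream comment below says, if https is enabled, requests to the http port are redirected to the https port
    # Change port 80 to 8002 (an arbitrary choice), because port 80 is usually taken by Nginx. Check whether a port is in use with: netstat -anp | grep 8002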
    # http related config                                                           
    http:                                                                           
      # port for http, default is 80. If https enabled, this port will redirect to https port
      port: 8002                                                                    
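    # https: if you do not want https, comment out the settings below. I tried both ways; the awkward part of https is that it needs a certificate
    # Once the certificate exists, configure it below; the steps for creating an https certificate are described later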
    # https related config                                                          
    # https:                                                                        
     # https port for harbor, default is 443                                       
     # port: 443                                                                   
     # The path of cert and key files for nginx                                    
     # certificate: /data/cert/server.crt                                          
     # private_key: /data/cert/server.key
    
    # 3. (Optional) Login password for the admin user of the Harbor management UI
    harbor_admin_password: your_password
    
    # 4. (Optional) Change the data volume directory and the log directory
    data_volume: /data/harbor
    location: /data/harbor/logs
    

    Create an https certificate (optional)

    Create the key: use openssl to generate an RSA private key

    (root@Aliyun-Alex:~)# openssl genrsa -des3 -out server.key 2048
    # Enter a passphrase of your choice twice
    Generating RSA private key, 2048 bit long modulus (2 primes)
    ...........+++++
    ...........................+++++
    e is 65537 (0x010001)
    Enter pass phrase for server.key:
    Verifying - Enter pass phrase for server.key:
    (root@Aliyun-Alex:~)# ls
    server.key
    

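    Generate the CSR (certificate signing request). The values entered can be arbitrary, since this is only a throwaway certificate. For a real certificate, the CSR is sent to a certificate authority (CA), which verifies the requester's identity and then issues a signed certificate, usually for a fee.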

    (root@Aliyun-Alex:~)# openssl req -new -key server.key -out server.csr
    Enter pass phrase for server.key:
    You are about to be asked to enter information that will be incorporated
    into your certificate request.
    What you are about to enter is what is called a Distinguished Name or a DN.
    There are quite a few fields but you can leave some blank
    For some fields there will be a default value,
    If you enter '.', the field will be left blank.
    -----
    Country Name (2 letter code) [XX]:CN
    State or Province Name (full name) []:SH
    Locality Name (eg, city) [Default City]:SH
    Organization Name (eg, company) [Default Company Ltd]:
    Organizational Unit Name (eg, section) []:
    Common Name (eg, your name or your server's hostname) []:alex.gcx.com
    Email Address []:111@163.com
    
    Please enter the following 'extra' attributes
    to be sent with your certificate request
    A challenge password []:
    An optional company name []:
    (root@Aliyun-Alex:~)# ls
    3000  dump.rdb  server.csr  server.key
    

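    Remove the passphrase from the key. If it is left in place, the application prompts for the passphrase every time it loads the key, which gets in the way of automated deployment.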

    # Back up the private key
    (root@Aliyun-Alex:~)# cp server.key server.key.back
    # Remove the passphrase
    (root@Aliyun-Alex:~)# openssl rsa -in server.key -out server.key
    Enter pass phrase for server.key:
    writing RSA key
    

    Generate the self-signed certificate

    (root@Aliyun-Alex:~)# openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
    Signature ok
    subject=C = CN, ST = SH, L = SH, O = Default Company Ltd, CN = alex, emailAddress = 111@163.com
    Getting Private key
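
    The four steps above (key, CSR, passphrase removal, self-signing) can also be collapsed into one non-interactive command, which is handy when the certificate is only for testing. The sketch below is my own shortcut, not part of the original walkthrough: it writes into a scratch directory created with mktemp instead of the home directory, and the -subj values simply repeat the answers typed at the prompts above.

```shell
#!/bin/sh
# Sketch: create an unencrypted 2048-bit RSA key and a 365-day self-signed
# certificate in a single non-interactive step.
set -eu

workdir=$(mktemp -d)      # scratch directory (assumption; the article uses ~ and /data/cert)
cn="alex.gcx.com"         # hostname used for Harbor in this article

# -x509  : emit a self-signed certificate instead of a CSR
# -nodes : leave the private key without a passphrase
# -subj  : supply the Distinguished Name so that no prompts appear
openssl req -x509 -newkey rsa:2048 -nodes \
    -days 365 \
    -subj "/C=CN/ST=SH/L=SH/CN=${cn}" \
    -keyout "${workdir}/server.key" \
    -out "${workdir}/server.crt" 2>/dev/null

# Print the subject and validity window to confirm what was generated.
openssl x509 -in "${workdir}/server.crt" -noout -subject -dates
```

    Because the key is never encrypted here, there is no passphrase to strip afterwards and no backup copy of the key to keep track of.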
    

    Generate a PEM-format certificate (optional). Some services only load certificates in PEM format; the following command produces one:

    openssl x509 -in server.crt -out server.pem -outform PEM
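
    Worth knowing: openssl writes PEM by default, so a certificate produced by openssl x509 -req is normally PEM already and the conversion is mostly a rename. A quick way to check is to look at the first line of the file. The sketch below is only an illustration: is_pem_cert is a helper name I made up, and the throwaway certificate (CN example.test) exists only so the check has something to read.

```shell
#!/bin/sh
# Sketch: tell a PEM certificate from a DER one by its first line.
set -eu

workdir=$(mktemp -d)

# Throwaway self-signed certificate, valid for one day.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=example.test" \
    -keyout "${workdir}/server.key" \
    -out "${workdir}/server.crt" 2>/dev/null

# A DER-encoded copy for comparison (binary, no text header).
openssl x509 -in "${workdir}/server.crt" -outform DER -out "${workdir}/server.der"

# is_pem_cert (hypothetical helper): succeeds when the file starts with the PEM header.
is_pem_cert() {
    head -n 1 "$1" | grep -q '^-----BEGIN CERTIFICATE-----$'
}

for f in "${workdir}/server.crt" "${workdir}/server.der"; do
    if is_pem_cert "$f"; then
        echo "$(basename "$f"): PEM"
    else
        echo "$(basename "$f"): not PEM"
    fi
done
```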
    

    Create the certificate directory

    # Create the directory
    (root@Aliyun-Alex:~)# mkdir -p /data/cert
    
    # Move the certificate files into the certificate directory
    (root@Aliyun-Alex:~)# mv server.* /data/cert/
    (root@Aliyun-Alex:~)# cd /data/cert/
    (root@Aliyun-Alex:/data/cert)# ls
    server.crt  server.csr  server.key  server.key.back
    
    # Grant permissions (777 is wide open; acceptable on a test machine only)
    chmod -R 777 /data/cert
    

    Update the certificate paths in harbor.yml

    # vim /usr/local/harbor-v1.10.4/harbor.yml
    certificate: /data/cert/server.crt
    private_key: /data/cert/server.key
    

    Run the install script to install Harbor

    (root@Aliyun-Alex:~)# sh /usr/local/harbor/install.sh
    [Step 0]: checking if docker is installed ...
    
    Note: docker version: 19.03.12
    
    [Step 1]: checking docker-compose is installed ...
    
    Note: docker-compose version: 1.26.2
    
    [Step 2]: loading Harbor images ...
    ...
    [Step 5]: starting Harbor ...
    Creating network "harbor-v1104_harbor" with the default driver
    Creating harbor-log ... done
    Creating registry      ... done
    Creating harbor-portal ... done
    Creating redis         ... done
    Creating registryctl   ... done
    Creating harbor-db     ... done
    Creating harbor-core   ... done
    Creating nginx             ... done
    Creating harbor-jobservice ... done
    ✔ ----Harbor has been installed and started successfully.----
    

    Open the Harbor management page in a browser

    http://alex.gcx.com:8002

    Log in to Harbor from the terminal

    (root@Aliyun-Alex:/usr/local/harbor)# docker login alex.gcx.com
    Username: admin
    Password: 
    Error response from daemon: Get https://alex.gcx.com/v2/: x509: certificate signed by unknown authority
    

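    The login fails for the same reason as above: the request is redirected to the https address, which requires certificate verification. Our certificate is self-signed rather than issued by a trusted CA, so docker treats it as untrusted and reports an error. To get around this, edit the docker configuration file /etc/docker/daemon.json.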

    vim /etc/docker/daemon.json
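    # Add the following entry (the double quotes may not render on this page)
    # It tells docker that this registry may be used without certificate verification
    # (host name only; docker expects no http:// or https:// prefix here)
    "insecure-registries": ["alex.gcx.com"]
    
    # Restart docker
    systemctl restart docker
    
    # Log in again; this time it succeeds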
    (root@Aliyun-Alex:/usr/local)# docker login alex.gcx.com
    Username: admin      
    Password: 
    WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
    Configure a credential helper to remove this warning. See
    https://docs.docker.com/engine/reference/commandline/login/#credentials-store
    
    Login Succeeded
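
    The entry above is only a fragment: it has to sit inside the top-level JSON object of daemon.json, separated by commas from any other keys. As a rough sketch of the complete file shape, the snippet below writes a minimal daemon.json to a scratch path. write_daemon_json is a helper name I made up, and it overwrites its target, so on a machine whose /etc/docker/daemon.json already carries other settings the key should be merged in by hand instead.

```shell
#!/bin/sh
# Sketch: write a minimal daemon.json that marks one registry as insecure.
set -eu

# write_daemon_json (hypothetical helper)
#   $1 = target file, $2 = registry host name without a scheme
# WARNING: overwrites the target file; existing keys are not preserved.
write_daemon_json() {
    cat > "$1" <<EOF
{
  "insecure-registries": ["$2"]
}
EOF
}

# Demonstrate against a scratch file rather than the live /etc/docker/daemon.json.
target="$(mktemp -d)/daemon.json"
write_daemon_json "${target}" "alex.gcx.com"
cat "${target}"
```

    After changing the real file, docker still has to be restarted with systemctl restart docker for the setting to take effect.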
    

    Changes needed on other servers that access Harbor

    # 1. Add a hosts entry
    echo "172.19.67.12 alex.gcx.com" >> /etc/hosts
    
    # 2. Add this entry to /etc/docker/daemon.json
    "insecure-registries": ["alex.gcx.com"]
    # 3. Restart docker
    systemctl restart docker
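
    One thing to watch when these commands are pushed to several servers at once: echo ... >> /etc/hosts appends a new line every time it runs, so repeating the step leaves duplicate entries. A minimal sketch of an idempotent version follows. add_host_entry is a helper name I made up, and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# Sketch: append a hosts entry only when the host name is not there yet.
set -eu

# add_host_entry (hypothetical helper)
#   $1 = hosts file, $2 = IP address, $3 = host name
add_host_entry() {
    # -F treats the name literally, -w requires word boundaries around it.
    if grep -qwF -- "$3" "$1"; then
        echo "entry for $3 already present in $1"
    else
        echo "$2 $3" >> "$1"
        echo "added $2 $3 to $1"
    fi
}

hosts_file=$(mktemp)      # scratch file standing in for /etc/hosts

# Running the step twice still leaves a single line in the file.
add_host_entry "${hosts_file}" 172.19.67.12 alex.gcx.com
add_host_entry "${hosts_file}" 172.19.67.12 alex.gcx.com
```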
    

    Operations: stopping and starting Harbor

    To change the Harbor configuration, for example to enable https as done here, the steps are:

    # Enter the harbor directory
    (root@Aliyun-Alex:~)# cd /usr/local/harbor
    (root@Aliyun-Alex:/usr/local/harbor)# ls
    common  common.sh  docker-compose.yml  harbor.v1.10.4.tar.gz  harbor.yml  install.sh  LICENSE  prepare
    
    # Stop the harbor services (docker-compose)
    (root@Aliyun-Alex:/usr/local/harbor)# docker-compose down -v
    Stopping harbor-jobservice ... done
    Stopping nginx             ... done
    Stopping harbor-core       ... done
    Stopping harbor-portal     ... done
    Stopping harbor-db         ... done
    Stopping redis             ... done
    Stopping registryctl       ... done
    Stopping registry          ... done
    Stopping harbor-log        ... done
    Removing harbor-jobservice ... done
    Removing nginx             ... done
    Removing harbor-core       ... done
    Removing harbor-portal     ... done
    Removing harbor-db         ... done
    Removing redis             ... done
    Removing registryctl       ... done
    Removing registry          ... done
    Removing harbor-log        ... done
    Removing network harbor-v1104_harbor
    
    # Edit harbor.yml and change the https settings
    (root@Aliyun-Alex:/usr/local/harbor)# vim harbor.yml
    # https related config                                                          
    https:
      # https port for harbor, default is 443                                       
      port: 443                                                                     
      # The path of cert and key files for nginx                                    
      certificate: /data/cert/server.crt                                            
      private_key: /data/cert/server.key
      
    # Run the pre-start preparation script
    (root@Aliyun-Alex:/usr/local/harbor)# ./prepare
    prepare base dir is set to /usr/local/harbor-v1.10.4
    Clearing the configuration file: /config/log/logrotate.conf
    Clearing the configuration file: /config/log/rsyslog_docker.conf
    Clearing the configuration file: /config/nginx/nginx.conf
    Clearing the configuration file: /config/core/env
    Clearing the configuration file: /config/core/app.conf
    Clearing the configuration file: /config/registry/config.yml
    Clearing the configuration file: /config/registry/root.crt
    Clearing the configuration file: /config/registryctl/env
    Clearing the configuration file: /config/registryctl/config.yml
    Clearing the configuration file: /config/db/env
    Clearing the configuration file: /config/jobservice/env
    Clearing the configuration file: /config/jobservice/config.yml
    Generated configuration file: /config/log/logrotate.conf
    Generated configuration file: /config/log/rsyslog_docker.conf
    Generated configuration file: /config/nginx/nginx.conf
    Generated configuration file: /config/core/env
    Generated configuration file: /config/core/app.conf
    Generated configuration file: /config/registry/config.yml
    Generated configuration file: /config/registryctl/env
    Generated configuration file: /config/db/env
    Generated configuration file: /config/jobservice/env
    Generated configuration file: /config/jobservice/config.yml
    loaded secret from file: /secret/keys/secretkey
    Generated configuration file: /compose_location/docker-compose.yml
    Clean up the input dir
    
    # Start the services with docker-compose
    (root@Aliyun-Alex:/usr/local/harbor)# docker-compose up -d
    Creating network "harbor-v1104_harbor" with the default driver
    Creating harbor-log ... done
    Creating redis         ... done
    Creating registry      ... done
    Creating harbor-db     ... done
    Creating registryctl   ... done
    Creating harbor-portal ... done
    Creating harbor-core   ... done
    Creating harbor-jobservice ... done
    Creating nginx             ... done
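
    Since every configuration change repeats the same stop, prepare, start cycle, it can be wrapped in a small script. This is only a sketch under my own assumptions: run and reconfigure_harbor are names I made up, DRY_RUN is my own convention rather than a Harbor feature, and harbor.yml is expected to have been edited before the script is called. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: wrap the stop / prepare / start cycle used after editing harbor.yml.
set -eu

# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reconfigure_harbor (hypothetical helper)
#   $1 = Harbor install directory (the article uses /usr/local/harbor)
reconfigure_harbor() {
    run cd "$1"
    run docker-compose down -v    # stop and remove the containers
    run ./prepare                 # regenerate config files from harbor.yml
    run docker-compose up -d      # start everything again in the background
}

reconfigure_harbor /usr/local/harbor
```

    Setting DRY_RUN=0 in the environment makes the script execute the commands for real; set -e then stops it at the first command that fails, so a broken harbor.yml does not lead to a half-started stack.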
    

    Visit the http URL http://alex.gcx.com:8002 in the browser again; it now redirects to the https URL.

  • Original article: https://www.cnblogs.com/gcxblogs/p/13710023.html