This guide mainly follows https://github.com/opsnull/follow-me-install-kubernetes-cluster, using Flannel and Docker.
System information
Role | OS | CPU Cores | Memory | Hostname | IP | Installed Components |
---|---|---|---|---|---|---|
master | 18.04.1-Ubuntu | 4 | 8G | master | 192.168.0.107 | kubectl,kube-apiserver,kube-controller-manager,kube-scheduler,etcd,flanneld |
slave | 18.04.1-Ubuntu | 4 | 4G | slave | 192.168.0.114 | docker,flanneld,kubelet,kube-proxy,coredns |
k8s & Docker versions
Software | Version |
---|---|
k8s | 1.17.2 |
etcd | v3.3.18 |
coredns | 1.6.6 (docker image) |
Flannel | v0.11.0 |
docker | 18.09 |
Pre-installation preparation (run on both the master and slave nodes)
-
Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
-
Configure the apt package sources
Add a system.list file under /etc/apt/sources.list.d/ with the following content:
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted
deb http://mirrors.aliyun.com/ubuntu/ bionic universe
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates universe
deb http://mirrors.aliyun.com/ubuntu/ bionic multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
Then run
sudo apt-get update
-
Create working directories
mkdir -p /opt/k8s/{bin,work} /etc/{kubernetes,etcd}/cert
-
Append /opt/k8s/bin to $PATH
echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc
source /root/.bashrc
-
Install the SSH service and allow root login
apt install openssh-server
# edit /etc/ssh/sshd_config, add "PermitRootLogin yes" below the "#PermitRootLogin prohibit-password" line, then restart the SSH service
systemctl restart ssh.service
-
Install dependency tools
apt install -y ipvsadm ipset curl jq
-
Set up host name resolution
cat >> /etc/hosts <<EOF
192.168.0.107 master
192.168.0.114 slave
EOF
-
Set up the node trust relationship (only needs to be run on the master node)
ssh-keygen -t rsa
ssh-copy-id root@192.168.0.114
Create the CA root certificate and key (run on the master node)
-
Install the cfssl toolset
cd /opt/k8s/work wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl_1.4.1_linux_amd64 cp cfssl_1.4.1_linux_amd64 /opt/k8s/bin/cfssl wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssljson_1.4.1_linux_amd64 cp cfssljson_1.4.1_linux_amd64 /opt/k8s/bin/cfssljson wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl-certinfo_1.4.1_linux_amd64 cp cfssl-certinfo_1.4.1_linux_amd64 /opt/k8s/bin/cfssl-certinfo chmod +x /opt/k8s/bin/*
-
Create the CA config file
cd /opt/k8s/work cat > ca-config.json <<EOF { "signing": { "default": { "expiry": "87600h" }, "profiles": { "kubernetes": { "usages": [ "signing", "key encipherment", "server auth", "client auth" ], "expiry": "87600h" } } } } EOF
- signing: indicates the certificate can be used to sign other certificates (the generated ca.pem has CA=TRUE);
- server auth: indicates a client can use this certificate to verify the certificate presented by a server;
- client auth: indicates a server can use this certificate to verify the certificate presented by a client;
- expiry: "87600h" sets the certificate validity to 10 years;
-
Create the certificate signing request (CSR) file
cd /opt/k8s/work cat > ca-csr.json <<EOF { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "k8s", "OU": "system" } ], "ca": { "expiry": "87600h" } } EOF
-
Generate the certificate
cd /opt/k8s/work
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
ls ca*
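As an optional sanity check (not part of the original steps), the generated CA certificate can be inspected to confirm the CA flag and the 10-year validity, using the tools installed above:
# print the parsed certificate (issuer, subject, expiry) as JSON
cfssl-certinfo -cert ca.pem
# or check the CA basic constraints with openssl
openssl x509 -in ca.pem -noout -text | grep -A 1 'Basic Constraints'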
-
Install the certificate
cd /opt/k8s/work
cp ca*.pem ca-config.json /etc/kubernetes/cert
# distribute to the slave node
export node_ip=192.168.0.114
scp ca*.pem ca-config.json root@${node_ip}:/etc/kubernetes/cert/
Deploy etcd (run on the master node)
-
Download etcd
cd /opt/k8s/work wget https://github.com/etcd-io/etcd/releases/download/v3.3.18/etcd-v3.3.18-linux-amd64.tar.gz tar -xvf etcd-v3.3.18-linux-amd64.tar.gz
-
Install the etcd binaries
cd /opt/k8s/work cp etcd-v3.3.18-linux-amd64/etcd* /opt/k8s/bin/ chmod +x /opt/k8s/bin/*
-
Create the etcd certificate and private key
-
Create the certificate signing request file
cd /opt/k8s/work cat > etcd-csr.json <<EOF { "CN": "etcd", "hosts": [ "127.0.0.1", "192.168.0.107" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "k8s", "OU": "system" } ] } EOF
- hosts: the list of IPs of the etcd nodes authorized to use this certificate
-
Generate the certificate and private key
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd ls etcd*pem
-
Install the certificate
cd /opt/k8s/work cp etcd*.pem /etc/etcd/cert/
-
-
Create the etcd systemd unit file
cat> /etc/systemd/system/etcd.service<< EOF [Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target Documentation=https://github.com/coreos [Service] Type=notify WorkingDirectory=/data/k8s/etcd/data ExecStart=/opt/k8s/bin/etcd \ --data-dir=/etc/etcd/cfg/etcd \ --name=etcd-chengf \ --cert-file=/etc/etcd/cert/etcd.pem \ --key-file=/etc/etcd/cert/etcd-key.pem \ --trusted-ca-file=/etc/kubernetes/cert/ca.pem \ --peer-cert-file=/etc/etcd/cert/etcd.pem \ --peer-key-file=/etc/etcd/cert/etcd-key.pem \ --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \ --peer-client-cert-auth \ --client-cert-auth \ --listen-peer-urls=https://192.168.0.107:2380 \ --initial-advertise-peer-urls=https://192.168.0.107:2380 \ --listen-client-urls=https://192.168.0.107:2379,http://127.0.0.1:2379 \ --advertise-client-urls=https://192.168.0.107:2379 \ --initial-cluster-token=etcd-cluster-0\ --initial-cluster=etcd-chengf=https://192.168.0.107:2380 \ --initial-cluster-state=new \ --auto-compaction-mode=periodic \ --auto-compaction-retention=1 \ --max-request-bytes=33554432 \ --quota-backend-bytes=6442450944 \ --heartbeat-interval=250 \ --election-timeout=2000 Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
- WorkingDirectory, --data-dir: specify the working directory and data directory, which must be created before starting the service;
- --name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
- --cert-file, --key-file: certificate and private key used by etcd for communication between server and clients;
- --trusted-ca-file: the CA certificate that signed the client certificates, used to verify client certificates;
- --peer-cert-file, --peer-key-file: certificate and private key used by etcd for peer communication;
- --peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify peer certificates;
-
Create the etcd data directory
mkdir -p /data/k8s/etcd/data
-
Start the etcd service
systemctl enable etcd && systemctl start etcd
-
Check the startup result
systemctl status etcd|grep Active
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u etcd
-
-
Verify the service status
export ETCD_ENDPOINTS=https://192.168.0.107:2379 etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem member list
Output
```
root@master:/opt/k8s/work# etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
member c0d3b56a9878e38f is healthy: got healthy result from https://192.168.0.107:2379
cluster is healthy
root@master:/opt/k8s/work# etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem member list
c0d3b56a9878e38f: name=etcd-chengf peerURLs=https://192.168.0.107:2380 clientURLs=https://192.168.0.107:2379 isLeader=true
```
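Note: the commands above use etcdctl's v2 API, which is the default in etcd v3.3. If you prefer the v3 API, an equivalent (optional) health check with the same certificates looks like this:
export ETCD_ENDPOINTS=https://192.168.0.107:2379
ETCDCTL_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem endpoint health
ETCDCTL_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem member list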
Deploy the flannel network (run on the master node)
The kubelet component of kubernetes depends on the docker service, and the docker network needs flannel to configure the IP address of the docker0 bridge, so the flannel network component must be installed first.
flannel uses vxlan to create a Pod network that is routable between nodes, using UDP port 8472 (this port must be opened, e.g. on public clouds such as AWS).
When flanneld starts for the first time, it reads the configured Pod network range from etcd, allocates an unused subnet for the local node, and then creates the flannel.1 network interface (the name may differ, e.g. flannel1).
flannel writes the Pod subnet assigned to the node into the /run/flannel/docker file; docker later uses the environment variables in this file to configure the docker0 bridge, and allocates IPs for all Pod containers on this node from that subnet.
-
Download and install the flanneld binaries
cd /opt/k8s/work mkdir flannel wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz tar -xzvf flannel-v0.11.0-linux-amd64.tar.gz -C flannel cp flannel/{flanneld,mk-docker-opts.sh} /opt/k8s/bin/ export node_ip=192.168.0.114 scp flannel/{flanneld,mk-docker-opts.sh} root@${node_ip}:/opt/k8s/bin/
-
Create the flanneld certificate and private key
flanneld reads and writes subnet allocation information in the etcd cluster, and the etcd cluster has mutual x509 certificate authentication enabled, so a certificate and private key must be generated for flanneld.
-
Create the certificate signing request
cd /opt/k8s/work cat > flanneld-csr.json <<EOF { "CN": "flanneld", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "k8s", "OU": "system" } ] } EOF
-
Generate the certificate and private key
cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld ls flanneld*pem
-
Distribute the generated certificate and private key to all nodes
cd /opt/k8s/work mkdir -p /etc/flanneld/cert cp flanneld*.pem /etc/flanneld/cert export node_ip=192.168.0.114 ssh root@${node_ip} "mkdir -p /etc/flanneld/cert" scp flanneld*.pem root@${node_ip}:/etc/flanneld/cert
-
-
Write the cluster Pod network configuration into etcd
cd /opt/k8s/work export FLANNEL_ETCD_PREFIX="/kubernetes/network" export ETCD_ENDPOINTS="https://192.168.0.107:2379" etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/opt/k8s/work/ca.pem --cert-file=/opt/k8s/work/flanneld.pem --key-file=/opt/k8s/work/flanneld-key.pem mk ${FLANNEL_ETCD_PREFIX}/config '{"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
- the prefix length of the written Pod Network (e.g. /16) must be smaller than the SubnetLen value (e.g. 24)
-
Create the flanneld systemd unit file
cd /opt/k8s/work export FLANNEL_ETCD_PREFIX="/kubernetes/network" export ETCD_ENDPOINTS="https://192.168.0.107:2379" cat > flanneld.service << EOF [Unit] Description=Flanneld overlay address etcd agent After=network.target After=network-online.target Wants=network-online.target After=etcd.service Before=docker.service [Service] Type=notify ExecStart=/opt/k8s/bin/flanneld \ -etcd-cafile=/etc/kubernetes/cert/ca.pem \ -etcd-certfile=/etc/flanneld/cert/flanneld.pem \ -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \ -etcd-endpoints=${ETCD_ENDPOINTS} \ -etcd-prefix=${FLANNEL_ETCD_PREFIX} \ -ip-masq ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker Restart=always RestartSec=5 StartLimitInterval=0 [Install] WantedBy=multi-user.target RequiredBy=docker.service EOF
- the mk-docker-opts.sh script writes the Pod subnet assigned to flanneld into the /run/flannel/docker file via the -d parameter; docker later uses the environment variables in this file to configure the docker0 bridge. The -k parameter controls the name of the variable in the generated file, and docker uses this variable when it starts (see below);
- flanneld communicates with other nodes via the interface of the system default route; on nodes with multiple network interfaces (e.g. internal and public), the -iface parameter can specify which interface to use;
- -ip-masq: flanneld sets up SNAT rules for traffic leaving the Pod network, and at the same time sets the --ip-masq variable passed to Docker (in /run/flannel/docker) to false, so Docker no longer creates its own SNAT rules. When Docker's --ip-masq is true, the SNAT rules it creates are rather blunt: every request from a local Pod to anything other than the docker0 interface is SNATed, so requests to Pods on other nodes appear to come from the flannel.1 interface IP and the destination Pod cannot see the real source Pod IP. The SNAT rules created by flanneld are gentler and only apply to traffic leaving the Pod network.
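If you want to see the effect once flanneld is running, the NAT table can be inspected on a node; this is only an optional check and the exact rules may differ by environment:
# with -ip-masq enabled, flanneld only masquerades traffic that leaves the 172.30.0.0/16 Pod network
iptables -t nat -S POSTROUTING | grep MASQUERADE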
-
Distribute the flanneld service file
cd /opt/k8s/work cp flanneld.service /etc/systemd/system/ export node_ip=192.168.0.114 scp flanneld.service root@${node_ip}:/etc/systemd/system/
-
Start the flanneld service
systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld ssh root@${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
-
Check the startup result
systemctl status flanneld|grep Active export node_ip=192.168.0.114 ssh root@${node_ip} "systemctl status flanneld|grep Active"
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u flanneld
-
-
Check the Pod network configuration stored for flanneld
export FLANNEL_ETCD_PREFIX="/kubernetes/network" export ETCD_ENDPOINTS="https://192.168.0.107:2379" etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem get ${FLANNEL_ETCD_PREFIX}/config
Output
{"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}
-
List the allocated Pod subnets
export FLANNEL_ETCD_PREFIX="/kubernetes/network" export ETCD_ENDPOINTS="https://192.168.0.107:2379" etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem ls ${FLANNEL_ETCD_PREFIX}/subnets
Output
/kubernetes/network/subnets/172.30.22.0-24 /kubernetes/network/subnets/172.30.78.0-24
-
Check the flannel network information on the node
root@master:/opt/k8s/work# ip addr show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000 link/ether 04:92:26:13:92:2b brd ff:ff:ff:ff:ff:ff 3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether d0:c5:d3:57:73:01 brd ff:ff:ff:ff:ff:ff inet 192.168.0.107/24 brd 192.168.0.255 scope global dynamic noprefixroute wlp3s0 valid_lft 6385sec preferred_lft 6385sec inet6 fe80::1fda:e90a:207a:67e4/64 scope link noprefixroute valid_lft forever preferred_lft forever 4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default link/ether 12:cb:66:43:de:36 brd ff:ff:ff:ff:ff:ff inet 172.30.22.0/32 scope global flannel.1 valid_lft forever preferred_lft forever inet6 fe80::10cb:66ff:fe43:de36/64 scope link valid_lft forever preferred_lft forever root@master:/opt/k8s/work# ip route show |grep flannel.1 172.30.78.0/24 via 172.30.78.0 dev flannel.1 onlink
-
Verify that the nodes can reach each other over the Pod network
root@master:/opt/k8s/work# ip addr show flannel.1 |grep -w inet inet 172.30.22.0/32 scope global flannel.1 root@master:/opt/k8s/work# ssh 192.168.0.114 "/sbin/ip addr show flannel.1|grep -w inet" inet 172.30.78.0/32 scope global flannel.1 root@master:/opt/k8s/work# ping -c 1 172.30.78.0 PING 172.30.78.0 (172.30.78.0) 56(84) bytes of data. 64 bytes from 172.30.78.0: icmp_seq=1 ttl=64 time=80.7 ms --- 172.30.78.0 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 80.707/80.707/80.707/0.000 ms root@master:/opt/k8s/work# ssh 192.168.0.114 "ping -c 1 172.30.22.0" PING 172.30.22.0 (172.30.22.0) 56(84) bytes of data. 64 bytes from 172.30.22.0: icmp_seq=1 ttl=64 time=4.09 ms --- 172.30.22.0 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 4.094/4.094/4.094/0.000 ms
-
Generated files
root@master:/opt/k8s/work# cat /run/flannel/subnet.env FLANNEL_NETWORK=172.30.0.0/16 FLANNEL_SUBNET=172.30.22.1/24 FLANNEL_MTU=1450 FLANNEL_IPMASQ=true root@master:/opt/k8s/work# cat /run/flannel/docker DOCKER_OPT_BIP="--bip=172.30.22.1/24" DOCKER_OPT_IPMASQ="--ip-masq=false" DOCKER_OPT_MTU="--mtu=1450" DOCKER_NETWORK_OPTIONS=" --bip=172.30.22.1/24 --ip-masq=false --mtu=1450"
Deploy the docker service (run on the master node)
-
Download the docker binaries
cd /opt/k8s/work wget https://download.docker.com/linux/static/stable/x86_64/docker-18.09.6.tgz tar -xvf docker-18.09.6.tgz
-
Distribute the binaries to all worker nodes
cd /opt/k8s/work export node_ip=192.168.0.114 scp docker/* root@${node_ip}:/opt/k8s/bin/ ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
-
Create the docker systemd unit file
cd /opt/k8s/work cat > docker.service <<"EOF" [Unit] Description=Docker Application Container Engine Documentation=http://docs.docker.io [Service] WorkingDirectory=/data/k8s/docker Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin" EnvironmentFile=-/run/flannel/docker ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS ExecReload=/bin/kill -s HUP $MAINPID Restart=on-failure RestartSec=5 LimitNOFILE=infinity LimitNPROC=infinity LimitCORE=infinity Delegate=yes KillMode=process [Install] WantedBy=multi-user.target EOF
-
The EOF delimiter is quoted (<<"EOF"), so bash does not substitute variables inside the here-document, such as $DOCKER_NETWORK_OPTIONS (systemd is responsible for substituting these environment variables);
-
dockerd invokes other docker commands at runtime, such as docker-proxy, so the directory containing the docker commands must be added to the PATH environment variable;
-
When flanneld starts it writes the network configuration into /run/flannel/docker; before dockerd starts it reads the DOCKER_NETWORK_OPTIONS environment variable from that file and uses it to configure the docker0 bridge subnet;
-
Starting with version 1.13, docker may set the default policy of the iptables FORWARD chain to DROP, which makes pinging Pod IPs on other Nodes fail; if this happens, manually set the policy to ACCEPT:
export node_ip=192.168.0.114 ssh root@${node_ip} "/sbin/iptables -P FORWARD ACCEPT"
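An optional way to confirm the current default policy on the worker node (not part of the original steps):
# the first line of output shows the chain's default policy, e.g. "-P FORWARD ACCEPT"
ssh root@${node_ip} "/sbin/iptables -S FORWARD | head -1"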
-
-
Distribute the docker.service file to all worker machines:
cd /opt/k8s/work export node_ip=192.168.0.114 scp docker.service root@${node_ip}:/etc/systemd/system/
-
Create and distribute the docker daemon configuration
Use registry mirrors located inside China to speed up image pulls, and increase the download concurrency (dockerd must be restarted for this to take effect):
cd /opt/k8s/work cat > docker-daemon.json <<EOF { "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","https://hub-mirror.c.163.com"], "max-concurrent-downloads": 20, "live-restore": true, "max-concurrent-uploads": 10, "data-root": "/data/k8s/docker/data", "log-opts": { "max-size": "100m", "max-file": "5" } } EOF
-
Distribute the docker configuration file to all worker nodes:
cd /opt/k8s/work export node_ip=192.168.0.114 ssh root@${node_ip} "mkdir -p /etc/docker/ /data/k8s/docker/data" scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
-
Start the docker service
export node_ip=192.168.0.114 ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
-
Check the service status
export node_ip=192.168.0.114 ssh root@${node_ip} "systemctl status docker|grep Active"
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u docker
-
-
Check the docker0 bridge
export node_ip=192.168.0.114 ssh root@${node_ip} "/sbin/ip addr show flannel.1 && /sbin/ip addr show docker0"
-
Confirm that on each worker node the docker0 bridge and the flannel.1 interface have IPs in the same subnet
Output
export node_ip=192.168.0.114 root@master:/opt/k8s/work# ssh root@${node_ip} "/sbin/ip addr show flannel.1 && /sbin/ip addr show docker0" 4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default link/ether f2:fc:0f:7e:98:e4 brd ff:ff:ff:ff:ff:ff inet 172.30.78.0/32 scope global flannel.1 valid_lft forever preferred_lft forever inet6 fe80::f0fc:fff:fe7e:98e4/64 scope link valid_lft forever preferred_lft forever 5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default link/ether 02:42:fd:1f:8f:d8 brd ff:ff:ff:ff:ff:ff inet 172.30.78.1/24 brd 172.30.78.255 scope global docker0 valid_lft forever preferred_lft forever
-
Note: if the services were installed in the wrong order or the machine environment is complex and docker was installed before flanneld, the docker0 bridge and the flannel.1 interface on a worker node may not end up in the same subnet. In that case, stop the docker service, delete the docker0 interface manually, and restart docker to fix it
systemctl stop docker ip link delete docker0 systemctl start docker
-
-
Check docker status information
root@slave:/opt/k8s/work# docker info Containers: 0 Running: 0 Paused: 0 Stopped: 0 Images: 0 Server Version: 18.09.6 Storage Driver: overlay2 Backing Filesystem: extfs Supports d_type: true Native Overlay Diff: true Logging Driver: json-file Cgroup Driver: cgroupfs Plugins: Volume: local Network: bridge host macvlan null overlay Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog Swarm: inactive Runtimes: runc Default Runtime: runc Init Binary: docker-init containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84 runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30 init version: fec3683 Security Options: apparmor seccomp Profile: default Kernel Version: 5.0.0-23-generic Operating System: Ubuntu 18.04.3 LTS OSType: linux Architecture: x86_64 CPUs: 4 Total Memory: 3.741GiB Name: slave ID: IDMG:7A6F:UNTP:IWVM:ZBK5:VHJ4:STC5:UXZX:HQT6:UUNE:YDOC:I27L Docker Root Dir: /data/k8s/docker/data Debug Mode (client): false Debug Mode (server): false Registry: https://index.docker.io/v1/ Labels: Experimental: false Insecure Registries: 127.0.0.0/8 Registry Mirrors: https://docker.mirrors.ustc.edu.cn/ https://hub-mirror.c.163.com/ Live Restore Enabled: true Product License: Community Engine WARNING: No swap limit support
Deploy the master node components (run on the master node)
-
Download the latest server binaries
cd /opt/k8s/work
# currently these cannot be downloaded directly from inside China; a proxy is required
wget https://dl.k8s.io/v1.17.2/kubernetes-server-linux-amd64.tar.gz
tar -xzvf kubernetes-server-linux-amd64.tar.gz
-
Install the corresponding k8s binaries
cd /opt/k8s/work
cp kubernetes/server/bin/{apiextensions-apiserver,kubeadm,kube-apiserver,kube-controller-manager,kubectl,kubelet,kube-proxy,kube-scheduler,mounter} /opt/k8s/bin/
# distribute kubelet and kube-proxy to the worker node
export node_ip=192.168.0.114
scp kubernetes/server/bin/{kubelet,kube-proxy} root@${node_ip}:/opt/k8s/bin/
Configure kubectl
kubectl communicates securely with kube-apiserver over https, and kube-apiserver authenticates and authorizes the certificate contained in kubectl requests.
kubectl will be used for cluster administration later, so an admin certificate with the highest privileges is created here.
-
Create the admin certificate and private key
-
Create the certificate signing request file
cd /opt/k8s/work cat > admin-csr.json <<EOF { "CN": "admin", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "system:masters", "OU": "system" } ] } EOF
- O: system:masters: when kube-apiserver receives a client request using this certificate, it adds the group identity system:masters to the request;
- the predefined ClusterRoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants the highest privileges needed to operate the cluster;
- this certificate is only used by kubectl as a client certificate, so the hosts field is empty;
-
Generate the certificate and private key
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin ls admin*
-
Install the certificate
cd /opt/k8s/work cp admin*.pem /etc/kubernetes/cert
-
-
Create the kubeconfig file
cd /opt/k8s/work
export KUBE_APISERVER=https://192.168.0.107:6443
# set cluster parameters
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kubectl.kubeconfig
# set client authentication parameters
kubectl config set-credentials admin --client-certificate=/etc/kubernetes/cert/admin.pem --client-key=/etc/kubernetes/cert/admin-key.pem --embed-certs=true --kubeconfig=kubectl.kubeconfig
# set context parameters
kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=kubectl.kubeconfig
# set the default context
kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig
- --certificate-authority: the root certificate used to verify the kube-apiserver certificate;
- --client-certificate, --client-key: the admin certificate and private key just generated, used for https communication with kube-apiserver;
- --embed-certs=true: embeds the contents of ca.pem and admin.pem into the generated kubectl.kubeconfig file;
- --server: the address of kube-apiserver;
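Optionally, the generated file can be inspected before distributing it (certificate data is shown as REDACTED/DATA+OMITTED):
# confirm that the cluster, user and context entries were written and that the certificates are embedded
kubectl config view --kubeconfig=kubectl.kubeconfig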
-
Distribute the kubeconfig file (any other user who needs to access kubernetes also needs this file copied into their own home directory)
cd /opt/k8s/work mkdir -p ~/.kube cp kubectl.kubeconfig ~/.kube/config
-
Configure kubectl auto-completion
root@master:/opt/k8s/work# apt install -y bash-completion root@master:/opt/k8s/work# locate bash_completion /usr/share/bash-completion/bash_completion root@master:/opt/k8s/work# source /usr/share/bash-completion/bash_completion root@master:/opt/k8s/work# source <(kubectl completion bash) root@master:/opt/k8s/work# echo 'source <(kubectl completion bash)' >>~/.bashrc
Configure kube-apiserver
-
Create the kubernetes-api certificate and private key
-
Create the certificate signing request file
cd /opt/k8s/work cat > kubernetes-csr.json <<EOF { "CN": "kubernetes-api", "hosts": [ "127.0.0.1", "192.168.0.107", "10.254.0.1", "kubernetes", "kubernetes.default", "kubernetes.default.svc", "kubernetes.default.svc.cluster", "kubernetes.default.svc.cluster.local." ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "k8s", "OU": "system" } ] } EOF
-
Generate the certificate and private key
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes ls kubernetes*
-
Install the certificate
cd /opt/k8s/work cp kubernetes*.pem /etc/kubernetes/cert/
-
-
Create the kube-apiserver systemd unit file
export ETCD_ENDPOINTS="https://192.168.0.107:2379" export SERVICE_CIDR="10.254.0.0/16" export NODE_PORT_RANGE=80-60000 cat > /etc/systemd/system/kube-apiserver.service <<EOF [Unit] Description=Kubernetes API Server Documentation=https://github.com/GoogleCloudPlatform/kubernetes After=network.target [Service] WorkingDirectory=/data/k8s/k8s/kube-apiserver ExecStart=/opt/k8s/bin/kube-apiserver \ --advertise-address=192.168.0.107 \ --etcd-cafile=/etc/kubernetes/cert/ca.pem \ --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \ --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \ --etcd-servers=${ETCD_ENDPOINTS} \ --bind-address=192.168.0.107 \ --secure-port=6443 \ --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \ --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \ --audit-log-maxage=15 \ --audit-log-maxbackup=3 \ --audit-log-maxsize=100 \ --audit-log-truncate-enabled \ --audit-log-path=/data/k8s/k8s/kube-apiserver/audit.log \ --profiling \ --anonymous-auth=false \ --client-ca-file=/etc/kubernetes/cert/ca.pem \ --enable-bootstrap-token-auth \ --service-account-key-file=/etc/kubernetes/cert/ca-key.pem \ --authorization-mode=Node,RBAC \ --runtime-config=api/all=true \ --allow-privileged=true \ --event-ttl=168h \ --kubelet-certificate-authority=/etc/kubernetes/cert/ca.pem \ --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \ --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \ --kubelet-https=true \ --kubelet-timeout=10s \ --service-cluster-ip-range=${SERVICE_CIDR} \ --service-node-port-range=${NODE_PORT_RANGE} \ --logtostderr=true \ --v=2 Restart=on-failure RestartSec=10 Type=notify LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
-
Create the kube-apiserver working directory
mkdir -p /data/k8s/k8s/kube-apiserver
-
Start the kube-apiserver service
systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver
-
Check the startup result
systemctl status kube-apiserver |grep Active
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u kube-apiserver
-
-
Check that kube-apiserver is running
root@master:/opt/k8s/work# kubectl cluster-info Kubernetes master is running at https://192.168.0.107:6443 To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. root@master:/opt/k8s/work# kubectl get all --all-namespaces NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default service/kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 2m30s root@master:/opt/k8s/work# kubectl get componentstatuses NAME STATUS MESSAGE ERROR scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused etcd-0 Healthy {"health":"true"}
Configure kube-controller-manager
-
Create the kube-controller-manager certificate and private key
-
Create the certificate signing request file
cd /opt/k8s/work cat > kube-controller-manager-csr.json <<EOF { "CN": "system:kube-controller-manager", "key": { "algo": "rsa", "size": 2048 }, "hosts": [ "127.0.0.1", "192.168.0.107" ], "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "system:kube-controller-manager", "OU": "system" } ] } EOF
- CN and O are both system:kube-controller-manager; the built-in kubernetes ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs to work.
-
Generate the certificate and private key
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager ls kube-controller-manager*pem
-
Install the certificate
cd /opt/k8s/work cp kube-controller-manager*.pem /etc/kubernetes/cert/
-
-
Create the kubeconfig file
- kube-controller-manager uses this file to access the apiserver; it contains the apiserver address, the embedded CA certificate, the kube-controller-manager certificate, and related information
cd /opt/k8s/work export KUBE_APISERVER=https://192.168.0.107:6443 kubectl config set-cluster kubernetes --certificate-authority=/opt/k8s/work/ca.pem --embed-certs=true --server="${KUBE_APISERVER}" --kubeconfig=kube-controller-manager.kubeconfig kubectl config set-credentials system:kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
-
Distribute the kubeconfig
cd /opt/k8s/work cp kube-controller-manager.kubeconfig /etc/kubernetes/kube-controller-manager.kubeconfig
-
Create the kube-controller-manager systemd unit file
export SERVICE_CIDR="10.254.0.0/16" cat > /etc/systemd/system/kube-controller-manager.service <<EOF [Unit] Description=Kubernetes Controller Manager Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] WorkingDirectory=/data/k8s/k8s/kube-controller-manager ExecStart=/opt/k8s/bin/kube-controller-manager \ --profiling \ --cluster-name=kubernetes \ --kube-api-qps=1000 \ --kube-api-burst=2000 \ --leader-elect \ --use-service-account-credentials\ --concurrent-service-syncs=2 \ --bind-address=192.168.0.107 \ --secure-port=10252 \ --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \ --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \ --port=0 \ --authentication-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \ --client-ca-file=/etc/kubernetes/cert/ca.pem \ --authorization-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \ --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \ --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \ --experimental-cluster-signing-duration=87600h \ --horizontal-pod-autoscaler-sync-period=10s \ --concurrent-deployment-syncs=10 \ --concurrent-gc-syncs=30 \ --node-cidr-mask-size=24 \ --service-cluster-ip-range=${SERVICE_CIDR} \ --pod-eviction-timeout=6m \ --terminated-pod-gc-threshold=10000 \ --root-ca-file=/etc/kubernetes/cert/ca.pem \ --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \ --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \ --logtostderr=true \ --v=2 Restart=on-failure RestartSec=5 [Install] WantedBy=multi-user.target EOF
-
Create the kube-controller-manager working directory
mkdir -p /data/k8s/k8s/kube-controller-manager
-
Start the kube-controller-manager service
systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager
-
Check the startup result
systemctl status kube-controller-manager |grep Active
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u kube-controller-manager
-
-
Check that kube-controller-manager is running
root@master:/opt/k8s/work# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml apiVersion: v1 kind: Endpoints metadata: annotations: control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master_6e2dfb91-8eaa-42d0-ba83-be669b99801f","leaseDurationSeconds":15,"acquireTime":"2020-02-09T13:37:08Z","renewTime":"2020-02-09T13:38:02Z","leaderTransitions":0}' creationTimestamp: "2020-02-09T13:37:08Z" name: kube-controller-manager namespace: kube-system resourceVersion: "888" selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager uid: 5aa2c4a1-5ded-4870-900e-63dfd212c912 root@master:/opt/k8s/work# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.0.107:10252/healthz ok
Configure kube-scheduler
-
Create the kube-scheduler certificate and private key
-
Create the certificate signing request file
cd /opt/k8s/work cat > kube-scheduler-csr.json <<EOF { "CN": "system:kube-scheduler", "key": { "algo": "rsa", "size": 2048 }, "hosts": [ "127.0.0.1", "192.168.0.107" ], "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "system:kube-scheduler", "OU": "system" } ] } EOF
- CN and O are both system:kube-scheduler; the built-in kubernetes ClusterRoleBinding system:kube-scheduler grants kube-scheduler the permissions it needs to work.
-
Generate the certificate and private key
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler ls kube-scheduler*pem
-
Install the certificate
cd /opt/k8s/work cp kube-scheduler*.pem /etc/kubernetes/cert/
-
-
Create the kubeconfig file
- kube-scheduler uses this file to access the apiserver; it contains the apiserver address, the embedded CA certificate, the kube-scheduler certificate, and related information
cd /opt/k8s/work export KUBE_APISERVER=https://192.168.0.107:6443 kubectl config set-cluster kubernetes --certificate-authority=/opt/k8s/work/ca.pem --embed-certs=true --server="${KUBE_APISERVER}" --kubeconfig=kube-scheduler.kubeconfig kubectl config set-credentials system:kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
-
Distribute the kubeconfig
cd /opt/k8s/work cp kube-scheduler.kubeconfig /etc/kubernetes/kube-scheduler.kubeconfig
-
Create the kube-scheduler configuration file
cd /opt/k8s/work cat >kube-scheduler.yaml <<EOF apiVersion: kubescheduler.config.k8s.io/v1alpha1 kind: KubeSchedulerConfiguration bindTimeoutSeconds: 600 clientConnection: burst: 200 kubeconfig: "/etc/kubernetes/kube-scheduler.kubeconfig" qps: 100 enableContentionProfiling: false enableProfiling: true hardPodAffinitySymmetricWeight: 1 healthzBindAddress: 192.168.0.107:10251 leaderElection: leaderElect: true metricsBindAddress: 192.168.0.107:10251 EOF cp kube-scheduler.yaml /etc/kubernetes/kube-scheduler.yaml
-
Create the kube-scheduler systemd unit file
cat > /etc/systemd/system/kube-scheduler.service <<EOF [Unit] Description=Kubernetes Scheduler Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] WorkingDirectory=/data/k8s/k8s/kube-scheduler ExecStart=/opt/k8s/bin/kube-scheduler \ --config=/etc/kubernetes/kube-scheduler.yaml \ --bind-address=192.168.0.107 \ --secure-port=10259 \ --port=0 \ --tls-cert-file=/etc/kubernetes/cert/kube-scheduler.pem \ --tls-private-key-file=/etc/kubernetes/cert/kube-scheduler-key.pem \ --authentication-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \ --client-ca-file=/etc/kubernetes/cert/ca.pem \ --authorization-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \ --logtostderr=true \ --v=2 Restart=always RestartSec=5 StartLimitInterval=0 [Install] WantedBy=multi-user.target EOF
-
Create the kube-scheduler working directory
mkdir -p /data/k8s/k8s/kube-scheduler
-
Start the kube-scheduler service
systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler
-
Check the startup result
systemctl status kube-scheduler |grep Active
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u kube-scheduler
-
-
Check that kube-scheduler is running
root@master:/opt/k8s/work# kubectl get endpoints kube-scheduler --namespace=kube-system -o yaml apiVersion: v1 kind: Endpoints metadata: annotations: control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master_383054c4-58d8-4c24-a766-551a92492219","leaseDurationSeconds":15,"acquireTime":"2020-02-10T02:17:40Z","renewTime":"2020-02-10T02:18:09Z","leaderTransitions":0}' creationTimestamp: "2020-02-10T02:17:41Z" name: kube-scheduler namespace: kube-system resourceVersion: "50203" selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler uid: 39821272-40a1-4b3a-95bd-a4f09af09231 root@master:/opt/k8s/work# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.0.107:10259/healthz ok root@master:/opt/k8s/work# curl http://192.168.0.107:10251/healthz ok
Deploy the worker node (run on the master node)
Configure kubelet
kubelet runs on every worker node; it receives requests from kube-apiserver, manages Pod containers, and executes interactive commands such as exec, run and logs.
When kubelet starts, it automatically registers node information with kube-apiserver; the built-in cadvisor collects and monitors the node's resource usage.
For security, this deployment disables kubelet's insecure http port and authenticates and authorizes every request, rejecting unauthorized access (e.g. requests from apiserver or heapster).
-
Create the kubelet bootstrap kubeconfig file
cd /opt/k8s/work
export KUBE_APISERVER=https://192.168.0.107:6443
export node_name=slave
export BOOTSTRAP_TOKEN=$(kubeadm token create --description kubelet-bootstrap-token --groups system:bootstrappers:${node_name} --kubeconfig ~/.kube/config)
# set cluster parameters
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kubelet-bootstrap.kubeconfig
# set client authentication parameters
kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=kubelet-bootstrap.kubeconfig
# set context parameters
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
# set the default context
kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
- what is written into the kubeconfig is a token; after the bootstrap finishes, kube-controller-manager creates the client and server certificates for kubelet
- after kube-apiserver accepts kubelet's bootstrap token, it sets the requesting user to system:bootstrap:<token-id> and the group to system:bootstrappers; a ClusterRoleBinding will be created for this group later
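The token just created can be inspected from the master if needed (an optional check; the same kubeadm command is used again later in this guide):
# list bootstrap tokens with their TTL and bound groups
kubeadm token list --kubeconfig ~/.kube/config
# the token is stored as a secret in the kube-system namespace
kubectl get secrets -n kube-system | grep bootstrap-token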
-
Distribute the bootstrap kubeconfig file to all worker nodes
cd /opt/k8s/work export node_ip=192.168.0.114 scp kubelet-bootstrap.kubeconfig root@${node_ip}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
-
Create and distribute the kubelet configuration file
Since v1.10, some kubelet parameters must be set in a configuration file; kubelet --help points this out
cd /opt/k8s/work export CLUSTER_CIDR="172.30.0.0/16" export NODE_IP=192.168.0.114 export CLUSTER_DNS_SVC_IP="10.254.0.2" cat > kubelet-config.yaml <<EOF kind: KubeletConfiguration apiVersion: kubelet.config.k8s.io/v1beta1 address: ${NODE_IP} staticPodPath: "/etc/kubernetes/manifests" syncFrequency: 1m fileCheckFrequency: 20s httpCheckFrequency: 20s staticPodURL: "" port: 10250 readOnlyPort: 0 rotateCertificates: true serverTLSBootstrap: true authentication: anonymous: enabled: false webhook: enabled: true x509: clientCAFile: "/etc/kubernetes/cert/ca.pem" authorization: mode: Webhook registryPullQPS: 0 registryBurst: 20 eventRecordQPS: 0 eventBurst: 20 enableDebuggingHandlers: true enableContentionProfiling: true healthzPort: 10248 healthzBindAddress: ${NODE_IP} clusterDomain: "cluster.local" clusterDNS: - "${CLUSTER_DNS_SVC_IP}" nodeStatusUpdateFrequency: 10s nodeStatusReportFrequency: 1m imageMinimumGCAge: 2m imageGCHighThresholdPercent: 85 imageGCLowThresholdPercent: 80 volumeStatsAggPeriod: 1m kubeletCgroups: "" systemCgroups: "" cgroupRoot: "" cgroupsPerQOS: true cgroupDriver: cgroupfs runtimeRequestTimeout: 10m hairpinMode: promiscuous-bridge maxPods: 220 podCIDR: "${CLUSTER_CIDR}" podPidsLimit: -1 resolvConf: /run/systemd/resolve/resolv.conf maxOpenFiles: 1000000 kubeAPIQPS: 1000 kubeAPIBurst: 2000 serializeImagePulls: false evictionHard: memory.available: "100Mi" nodefs.available: "10%" nodefs.inodesFree: "5%" imagefs.available: "15%" evictionSoft: {} enableControllerAttachDetach: true failSwapOn: true containerLogMaxSize: 20Mi containerLogMaxFiles: 10 systemReserved: {} kubeReserved: {} systemReservedCgroup: "" kubeReservedCgroup: "" enforceNodeAllocatable: ["pods"] EOF
- address: the address the kubelet secure port (https, 10250) listens on; it must not be 127.0.0.1, otherwise kube-apiserver, heapster and others cannot call the kubelet API;
- readOnlyPort=0: disables the read-only port (default 10255), equivalent to leaving it unspecified;
- authentication.anonymous.enabled: set to false so that anonymous access to port 10250 is not allowed;
- authentication.x509.clientCAFile: the CA certificate that signed the client certificates, enabling x509 client certificate authentication;
- authentication.webhook.enabled=true: enables HTTPS bearer token authentication;
Requests that pass neither x509 certificate nor webhook authentication (whether from kube-apiserver or any other client) are rejected with Unauthorized;
- authorization.mode=Webhook: kubelet uses the SubjectAccessReview API to ask kube-apiserver whether a given user or group has permission to operate on a resource (RBAC);
- featureGates.RotateKubeletClientCertificate, featureGates.RotateKubeletServerCertificate: rotate certificates automatically; the certificate lifetime is determined by the --experimental-cluster-signing-duration parameter of kube-controller-manager
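Once the kubelet is running (a later step), the healthz endpoint configured above can serve as a quick check; the address below assumes the slave node IP used throughout this guide:
# should return "ok" (healthzBindAddress/healthzPort from kubelet-config.yaml)
curl http://192.168.0.114:10248/healthz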
-
Distribute the kubelet configuration file to each node
cd /opt/k8s/work export node_ip=192.168.0.114 scp kubelet-config.yaml root@${node_ip}:/etc/kubernetes/kubelet-config.yaml
-
Create and distribute the kubelet systemd unit file
cd /opt/k8s/work export K8S_DIR=/data/k8s/k8s export NODE_NAME=slave cat > kubelet.service <<EOF [Unit] Description=Kubernetes Kubelet Documentation=https://github.com/GoogleCloudPlatform/kubernetes After=docker.service Requires=docker.service [Service] WorkingDirectory=${K8S_DIR}/kubelet ExecStart=/opt/k8s/bin/kubelet \ --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \ --cert-dir=/etc/kubernetes/cert \ --root-dir=${K8S_DIR}/kubelet \ --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \ --config=/etc/kubernetes/kubelet-config.yaml \ --hostname-override=${NODE_NAME} \ --image-pull-progress-deadline=15m \ --volume-plugin-dir=${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/ \ --logtostderr=true \ --v=2 Restart=always RestartSec=5 StartLimitInterval=0 [Install] WantedBy=multi-user.target EOF
- if --hostname-override is set, kube-proxy must be given the same value, otherwise the Node may not be found;
- --bootstrap-kubeconfig: points to the bootstrap kubeconfig file; kubelet uses the user name and token in this file to send a TLS Bootstrapping request to kube-apiserver;
- after K8S approves the kubelet CSR, the certificate and private key are created in the --cert-dir directory and then written into the file referenced by --kubeconfig
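After the CSR has been approved (later in this section), the generated files can be verified on the worker node; a small optional check, with paths taken from the options above:
# the rotated client certificate and the generated kubeconfig should both exist
ssh root@192.168.0.114 "ls -l /etc/kubernetes/cert/kubelet-client-* /etc/kubernetes/kubelet.kubeconfig"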
-
Install and distribute the kubelet service file
cd /opt/k8s/work export node_ip=192.168.0.114 scp kubelet.service root@${node_ip}:/etc/systemd/system/kubelet.service
-
Grant kube-apiserver permission to access the kubelet API
When commands such as kubectl exec, run and logs are executed, the apiserver forwards the request to kubelet's https port. The RBAC rule below authorizes the user name of the certificate used by the apiserver (kubernetes.pem, CN: kubernetes-api) to access the kubelet API:
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes-api
-
Bootstrap Token Auth and granting permissions
When kubelet starts, it checks whether the file specified by --kubeconfig exists; if not, it uses the kubeconfig specified by --bootstrap-kubeconfig to send a certificate signing request (CSR) to kube-apiserver. When kube-apiserver receives the CSR it authenticates the token it carries; on success it sets the requesting user to system:bootstrap:<token-id> and the group to system:bootstrappers. This process is called Bootstrap Token Auth. By default this user and group do not have permission to create CSRs, so a clusterrolebinding must be created that binds the group system:bootstrappers to the clusterrole system:node-bootstrapper:
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers
-
Start the kubelet service
export K8S_DIR=/data/k8s/k8s export node_ip=192.168.0.114 ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
-
After kubelet starts, it uses --bootstrap-kubeconfig to send a CSR to kube-apiserver; once this CSR is approved, kube-controller-manager creates a TLS client certificate and private key for kubelet and the file referenced by --kubeconfig is written.
-
Note: kube-controller-manager must be configured with the --cluster-signing-cert-file and --cluster-signing-key-file parameters, otherwise it will not create certificates and private keys for TLS Bootstrap.
-
-
Problems encountered
-
After starting kubelet, kubectl get csr returned nothing, and the kubelet logs showed the following error
journalctl -u kubelet -a |grep -A 2 'certificate_manager.go' Failed while requesting a signed certificate from the master: cannot create certificate signing request: Unauthorized
The kube-apiserver logs showed
root@master:/opt/k8s/work# journalctl -eu kube-apiserver Unable to authenticate the request due to an error: invalid bearer token
Cause: the following option was missing from the kube-apiserver unit file
--enable-bootstrap-token-auth \
Adding it back and restarting kube-apiserver resolved the problem
-
After kubelet started, it kept generating new CSRs, and continued to do so even after they were approved manually
The cause was that the kube-controller-manager service had stopped; restarting it resolved the problem.
- If the kubelet service runs into problems, delete /etc/kubernetes/kubelet.kubeconfig and /etc/kubernetes/cert/kubelet-client-current*.pem on the affected node, then restart kubelet
-
-
Check the kubelet CSRs
root@master:/opt/k8s/work# kubectl get csr NAME AGE REQUESTOR CONDITION csr-kl5mg 49s system:bootstrap:5t989l Pending csr-mrmkf 2m1s system:bootstrap:5t989l Pending csr-ql68g 13s system:bootstrap:5t989l Pending csr-rvl2v 84s system:bootstrap:5t989l Pending
- while this runs, new CSRs keep being added until they are approved manually
-
Manually approve the CSRs
root@master:/opt/k8s/work# kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve certificatesigningrequest.certificates.k8s.io/csr-kl5mg approved certificatesigningrequest.certificates.k8s.io/csr-mrmkf approved certificatesigningrequest.certificates.k8s.io/csr-ql68g approved certificatesigningrequest.certificates.k8s.io/csr-rvl2v approved root@master:/opt/k8s/work# kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve certificatesigningrequest.certificates.k8s.io/csr-f4smx approved
-
Check the node information
root@master:/opt/k8s/work# kubectl get nodes NAME STATUS ROLES AGE VERSION slave Ready <none> 10m v1.17.2
-
Check the kubelet service status
export node_ip=192.168.0.114 root@master:/opt/k8s/work# ssh root@${node_ip} "systemctl status kubelet.service" ● kubelet.service - Kubernetes Kubelet Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled) Active: active (running) since Mon 2020-02-10 22:48:41 CST; 12min ago Docs: https://github.com/GoogleCloudPlatform/kubernetes Main PID: 15529 (kubelet) Tasks: 19 (limit: 4541) CGroup: /system.slice/kubelet.service └─15529 /opt/k8s/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig --cert-dir=/etc/kubernetes/cert --root-dir=/data/k8s/k8s/kubelet --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --config=/etc/kubernetes/kubelet-config.yaml --hostname-override=slave --image-pull-progress-deadline=15m --volume-plugin-dir=/data/k8s/k8s/kubelet/kubelet-plugins/volume/exec/ --logtostderr=true --v=2 2月 10 22:49:04 slave kubelet[15529]: I0210 22:49:04.846285 15529 kubelet_node_status.go:73] Successfully registered node slave 2月 10 22:49:04 slave kubelet[15529]: I0210 22:49:04.930745 15529 certificate_manager.go:402] Rotating certificates 2月 10 22:49:14 slave kubelet[15529]: I0210 22:49:14.966351 15529 kubelet_node_status.go:486] Recording NodeReady event message for node slave 2月 10 22:49:29 slave kubelet[15529]: I0210 22:49:29.580410 15529 certificate_manager.go:531] Certificate expiration is 2030-02-06 04:19:00 +0000 UTC, rotation deadline is 2029-01-21 13:08:18.850930128 +0000 UTC 2月 10 22:49:29 slave kubelet[15529]: I0210 22:49:29.580484 15529 certificate_manager.go:281] Waiting 78430h18m49.270459727s for next certificate rotation 2月 10 22:49:30 slave kubelet[15529]: I0210 22:49:30.580981 15529 certificate_manager.go:531] Certificate expiration is 2030-02-06 04:19:00 +0000 UTC, rotation deadline is 2027-07-14 16:09:26.990162158 +0000 UTC 2月 10 22:49:30 slave kubelet[15529]: I0210 22:49:30.581096 15529 certificate_manager.go:281] Waiting 65065h19m56.409078053s for next certificate rotation 2月 10 22:53:44 slave kubelet[15529]: I0210 22:53:44.911705 15529 kubelet.go:1312] Image garbage collection succeeded 2月 10 22:53:45 slave kubelet[15529]: I0210 22:53:45.053792 15529 container_manager_linux.go:469] [ContainerManager]: Discovered runtime cgroups name: /system.slice/docker.service 2月 10 22:58:45 slave kubelet[15529]: I0210 22:58:45.054225 15529 container_manager_linux.go:469] [ContainerManager]: Discovered runtime cgroups name: /system.slice/docker.servic
Configure the kube-proxy component
-
Create the kube-proxy certificate and private key
-
Create the certificate signing request file
cd /opt/k8s/work cat > kube-proxy-csr.json <<EOF { "CN": "system:kube-proxy", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "NanJing", "L": "NanJing", "O": "system:kube-proxy", "OU": "system" } ] } EOF
- CN: specifies that the User of this certificate is system:kube-proxy;
- the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the Proxy-related APIs of kube-apiserver.
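To confirm this built-in binding exists in the cluster (an optional check):
# the subjects list should contain the user system:kube-proxy
kubectl get clusterrolebinding system:node-proxier -o yaml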
-
Generate the certificate and private key
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem -ca-key=/opt/k8s/work/ca-key.pem -config=/opt/k8s/work/ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy ls kube-proxy*pem
-
Install the certificate
cd /opt/k8s/work export node_ip=192.168.0.114 scp kube-proxy*.pem root@${node_ip}:/etc/kubernetes/cert/
-
-
Create the kubeconfig file
- kube-proxy uses this file to access the apiserver; it contains the apiserver address, the embedded CA certificate, the kube-proxy certificate, and related information
cd /opt/k8s/work export KUBE_APISERVER=https://192.168.0.107:6443 kubectl config set-cluster kubernetes --certificate-authority=/opt/k8s/work/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kube-proxy.kubeconfig kubectl config set-credentials kube-proxy --client-certificate=kube-proxy.pem --client-key=kube-proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
-
Distribute the kubeconfig
cd /opt/k8s/work export node_ip=192.168.0.114 scp kube-proxy.kubeconfig root@${node_ip}:/etc/kubernetes/kube-proxy.kubeconfig
-
Create the kube-proxy configuration file
cd /opt/k8s/work export CLUSTER_CIDR="172.30.0.0/16" export NODE_IP=192.168.0.114 export NODE_NAME=slave cat > kube-proxy-config.yaml <<EOF kind: KubeProxyConfiguration apiVersion: kubeproxy.config.k8s.io/v1alpha1 clientConnection: burst: 200 kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig" qps: 100 bindAddress: ${NODE_IP} healthzBindAddress: ${NODE_IP}:10256 metricsBindAddress: ${NODE_IP}:10249 enableProfiling: true clusterCIDR: ${CLUSTER_CIDR} hostnameOverride: ${NODE_NAME} mode: "ipvs" portRange: "" iptables: masqueradeAll: false ipvs: scheduler: rr excludeCIDRs: [] EOF
- bindAddress: the listen address;
- clientConnection.kubeconfig: the kubeconfig file used to connect to the apiserver;
- clusterCIDR: kube-proxy uses --cluster-cidr to distinguish traffic inside and outside the cluster; only when --cluster-cidr or --masquerade-all is specified does kube-proxy SNAT requests to Service IPs;
- hostnameOverride: must match the value used by kubelet, otherwise kube-proxy will not find this Node after it starts and will not create any ipvs rules;
- mode: use ipvs mode;
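ipvs mode relies on the ip_vs kernel modules; the start step further below loads ip_vs_rr explicitly, and the modules currently loaded on the worker node can be checked beforehand if desired:
# list the ip_vs related modules loaded on the worker node
ssh root@192.168.0.114 "lsmod | grep ip_vs"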
-
Distribute the kube-proxy configuration file
cd /opt/k8s/work export node_ip=192.168.0.114 scp kube-proxy-config.yaml root@${node_ip}:/etc/kubernetes/kube-proxy-config.yaml
-
Create the kube-proxy systemd unit file
cd /opt/k8s/work export K8S_DIR=/data/k8s/k8s cat > kube-proxy.service <<EOF [Unit] Description=Kubernetes Kube-Proxy Server Documentation=https://github.com/GoogleCloudPlatform/kubernetes After=network.target [Service] WorkingDirectory=${K8S_DIR}/kube-proxy ExecStart=/opt/k8s/bin/kube-proxy \ --config=/etc/kubernetes/kube-proxy-config.yaml \ --logtostderr=true \ --v=2 Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
-
Distribute the kube-proxy service file:
export node_ip=192.168.0.114 scp kube-proxy.service root@${node_ip}:/etc/systemd/system/
-
Start the kube-proxy service
export node_ip=192.168.0.114 export K8S_DIR=/data/k8s/k8s ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-proxy" ssh root@${node_ip} "modprobe ip_vs_rr" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
-
Check the startup result
export node_ip=192.168.0.114 ssh root@${node_ip} "systemctl status kube-proxy |grep Active"
-
Make sure the status is active (running); otherwise check the logs to find the cause
-
If anything goes wrong, inspect the logs with
journalctl -u kube-proxy
-
-
Check the status
root@slave:~# netstat -lnpt|grep kube-prox tcp 0 0 192.168.0.114:10256 0.0.0.0:* LISTEN 23078/kube-proxy tcp 0 0 192.168.0.114:10249 0.0.0.0:* LISTEN 23078/kube-proxy root@slave:~# ipvsadm -ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.254.0.1:443 rr -> 192.168.0.107:6443 Masq 1 0 0
Verify cluster functionality (run on the master node)
Use an nginx Service and Deployment to verify that the cluster works
-
Create the manifest
mkdir /opt/k8s/yml cd /opt/k8s/yml cat > nginx.yml << EOF apiVersion: v1 kind: Service metadata: name: nginx labels: app: nginx spec: type: NodePort selector: app: nginx ports: - name: http port: 80 targetPort: 80 nodePort: 8080 --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment spec: selector: matchLabels: app: nginx replicas: 1 template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.9.1 ports: - containerPort: 80 EOF
-
Start the service
kubectl create -f nginx.yml
-
On first start the k8s.gcr.io/pause:3.1 image needs to be downloaded; it cannot be pulled directly from inside China, which keeps the Pod from starting. Work around it as follows
docker pull kubeimage/pause:3.1 docker tag kubeimage/pause:3.1 k8s.gcr.io/pause:3.1
-
-
Observe the service
root@master:/opt/k8s/yml# kubectl get service -o wide NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 41h <none> nginx NodePort 10.254.8.25 <none> 80:8080/TCP 30m app=nginx root@master:/opt/k8s/yml# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-deployment-56f8998dbc-955gf 1/1 Running 0 30m 172.30.78.2 slave <none> <none> root@master:/opt/k8s/yml# curl http://192.168.0.114:8080 <!DOCTYPE html> <html> <head> <title>Welcome to nginx!</title> <style> body { 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; } </style> </head> <body> <h1>Welcome to nginx!</h1> <p>If you see this page, the nginx web server is successfully installed and working. Further configuration is required.</p> <p>For online documentation and support please refer to <a href="http://nginx.org/">nginx.org</a>.<br/> Commercial support is available at <a href="http://nginx.com/">nginx.com</a>.</p> <p><em>Thank you for using nginx.</em></p> </body> </html>
Deploy the coredns add-on (run on the master node)
-
Download and configure coredns
cd /opt/k8s/work git clone https://github.com/coredns/deployment.git mv deployment coredns
-
Start coredns
cd /opt/k8s/work/coredns/kubernetes export CLUSTER_DNS_SVC_IP="10.254.0.2" export CLUSTER_DNS_DOMAIN="cluster.local" ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
-
Problems encountered
After starting coredns, the pod status was CrashLoopBackOff
```
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-76b74f549-99bxd 0/1 CrashLoopBackOff 5 4m45s
```
The coredns pod logs showed the following error
```
root@master:/opt/k8s/work/coredns/kubernetes# kubectl -n kube-system logs coredns-76b74f549-99bxd
.:53
[INFO] plugin/reload: Running configuration MD5 = 8b19e11d5b2a72fb8e63383b064116a1
CoreDNS-1.6.6
linux/amd64, go1.13.5, 6a7a75e
[FATAL] plugin/loop: Loop (127.0.0.1:60429 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 6292641803451309721.7599235642583168995."
```
Following the hint, the page https://coredns.io/plugins/loop#troubleshooting explains:
> When a CoreDNS Pod deployed in Kubernetes detects a loop, the CoreDNS Pod will start to “CrashLoopBackOff”. This is because Kubernetes will try to restart the Pod every time CoreDNS detects the loop and exits.
> A common cause of forwarding loops in Kubernetes clusters is an interaction with a local DNS cache on the host node (e.g. systemd-resolved). For example, in certain configurations systemd-resolved will put the loopback address 127.0.0.53 as a nameserver into /etc/resolv.conf. Kubernetes (via kubelet) by default will pass this /etc/resolv.conf file to all Pods using the default dnsPolicy rendering them unable to make DNS lookups (this includes CoreDNS Pods). CoreDNS uses this /etc/resolv.conf as a list of upstreams to forward requests to. Since it contains a loopback address, CoreDNS ends up forwarding requests to itself.
> There are many ways to work around this issue, some are listed here:
> * Add the following to your kubelet config yaml: resolvConf: <path-to-your-real-resolv-conf-file> (or via command line flag --resolv-conf deprecated in 1.10). Your “real” resolv.conf is the one that contains the actual IPs of your upstream servers, and no local/loopback address. This flag tells kubelet to pass an alternate resolv.conf to Pods. For systems using systemd-resolved, /run/systemd/resolve/resolv.conf is typically the location of the “real” resolv.conf, although this can be different depending on your distribution.
> * Disable the local DNS cache on host nodes, and restore /etc/resolv.conf to the original.
> * A quick and dirty fix is to edit your Corefile, replacing forward . /etc/resolv.conf with the IP address of your upstream DNS, for example forward . 8.8.8.8. But this only fixes the issue for CoreDNS, kubelet will continue to forward the invalid resolv.conf to all default dnsPolicy Pods, leaving them unable to resolve DNS.
Following the first suggested workaround, set resolvConf in the kubelet configuration file kubelet-config.yaml to /run/systemd/resolve/resolv.conf; the relevant fragment looks like this
```
...
podPidsLimit: -1
resolvConf: /run/systemd/resolve/resolv.conf
maxOpenFiles: 1000000
...
```
Restart the kubelet service
```
systemctl daemon-reload
systemctl restart kubelet
```
Then redeploy coredns
```
root@master:/opt/k8s/work/coredns/kubernetes# ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-76b74f549-j5t9c 1/1 Running 0 12s
root@master:/opt/k8s/work/coredns/kubernetes# kubectl get all -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
pod/coredns-76b74f549-j5t9c 1/1 Running 0 2m8s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.254.0.2 <none> 53/UDP,53/TCP,9153/TCP 2m8s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 1/1 1 1 2m8s
NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-76b74f549 1 1 1 2m8s
```
-
Start a busybox pod, start the nginx service used in the previous cluster-verification section, and then access the nginx service from busybox by its service name
cd /opt/k8s/yml cat > busybox.yml << EOF apiVersion: v1 kind: Pod metadata: name: busybox spec: containers: - name: busybox image: busybox command: - sleep - "3600" EOF kubectl create -f busybox.yml kubectl create -f nginx.yml
-
Enter the busybox pod and access nginx
root@master:/opt/k8s/yml# kubectl exec -it busybox sh / # cat /etc/resolv.conf nameserver 10.254.0.2 search default.svc.cluster.local svc.cluster.local cluster.local options ndots:5 / # nslookup www.baidu.com Server: 10.254.0.2 Address: 10.254.0.2:53 Non-authoritative answer: www.baidu.com canonical name = www.a.shifen.com Name: www.a.shifen.com Address: 183.232.231.174 Name: www.a.shifen.com Address: 183.232.231.172 / # nslookup kubernetes Server: 10.254.0.2 Address: 10.254.0.2:53 Name: kubernetes.default.svc.cluster.local Address: 10.254.0.1 / # nslookup nginx Server: 10.254.0.2 Address: 10.254.0.2:53 Name: nginx.default.svc.cluster.local Address: 10.254.19.32 / # ping -c 1 nginx PING nginx (10.254.19.32): 56 data bytes 64 bytes from 10.254.19.32: seq=0 ttl=64 time=0.155 ms --- nginx ping statistics --- 1 packets transmitted, 1 packets received, 0% packet loss round-trip min/avg/max = 0.155/0.155/0.155 ms
Add a node (run on the master)
Adding a node
Resources are limited, so here we try adding the master node itself to the cluster. For a brand-new machine you would need to run the pre-installation preparation from this document, distribute the CA certificates to that machine, and go through the flannel network deployment steps.
-
Pre-installation preparation (already done on the master node)
-
Distribute the CA certificates to this machine (already done on the master node)
-
Deploy the flannel network (already done on the master node)
-
Install the docker service
-
Install the kubelet service
Follow the earlier steps for adding the slave node. If the previous kubelet-bootstrap.kubeconfig is reused directly, the node cannot join, because the token in it is only valid for one day. With an expired token, kube-apiserver reports errors such as 2月 12 11:01:01 master kube-apiserver[5018]: E0212 11:01:01.640497 5018 authentication.go:104] Unable to authenticate the request due to an error: invalid bearer token
Check the token
root@master:/opt/k8s/work# kubeadm token list --kubeconfig ~/.kube/config TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS 5t989l.rweut7kedj7ifl1a <invalid> 2020-02-11T18:19:41+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:slave
In this case, follow the kubelet installation steps used for the slave node and regenerate the kubelet-bootstrap.kubeconfig
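A minimal sketch of that regeneration for the master node, mirroring the earlier kubelet section with only node_name changed (and a local copy instead of scp, since the new worker is the master itself):
cd /opt/k8s/work
export KUBE_APISERVER=https://192.168.0.107:6443
export node_name=master
# create a fresh bootstrap token bound to the master node's bootstrappers group
export BOOTSTRAP_TOKEN=$(kubeadm token create --description kubelet-bootstrap-token --groups system:bootstrappers:${node_name} --kubeconfig ~/.kube/config)
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
# install the file locally instead of copying it to a remote node
cp kubelet-bootstrap.kubeconfig /etc/kubernetes/kubelet-bootstrap.kubeconfig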
After approving the CSR, check the nodes
root@master:/opt/k8s/work# kubectl get nodes NAME STATUS ROLES AGE VERSION master Ready <none> 21s v1.17.2 slave Ready <none> 36h v1.17.2
-
Install the kube-proxy service
Re-verify the cluster
root@master:/opt/k8s/yml# kubectl create -f nginx.yml
service/nginx created
deployment.apps/nginx-deployment created
root@master:/opt/k8s/yml# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-56f8998dbc-6b6qm 1/1 Running 0 87s 172.30.22.2 master <none> <none>
root@master:/opt/k8s/yml# kubectl create -f busybox.yml
pod/busybox created
root@master:/opt/k8s/yml# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 102s 172.30.22.3 master <none> <none>
nginx-deployment-56f8998dbc-6b6qm 1/1 Running 0 3m20s 172.30.22.2 master <none> <none>
root@master:/opt/k8s/yml# curl http://192.168.0.107:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@master:/opt/k8s/yml# curl http://192.168.0.114:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
As shown above, accessing port 8080 on any node in the cluster correctly reaches the backing nginx service.