k8s Deployment Guide
I. About This Document
Author: lanjx
Email: lanheader@163.com
Blog: https://www.cnblogs.com/lanheader/
Last updated: 2021-07-09
II. Deploying with kubeadm
Note: unless otherwise stated, run every step on all nodes (k8s-master and the k8s-node machines).
1. Environment Preparation
Prepare three hosts (adjust the addresses to your own environment):
192.168.8.158 master
192.168.8.159 node1
192.168.8.160 node2
1.1 Set the hostnames
hostname master
hostname node1
hostname node2
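The hostname command above only takes effect for the current session. A minimal sketch for making the names persistent and resolvable from every machine (run the matching hostnamectl line on the matching host; the /etc/hosts entries use the IPs listed above):
$ hostnamectl set-hostname master   # run on 192.168.8.158
$ hostnamectl set-hostname node1    # run on 192.168.8.159
$ hostnamectl set-hostname node2    # run on 192.168.8.160
# On all three hosts, add name resolution for the cluster machines
$ cat >> /etc/hosts <<EOF
192.168.8.158 master
192.168.8.159 node1
192.168.8.160 node2
EOF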
1.2 Disable the firewall
$ systemctl stop firewalld.service
$ systemctl disable firewalld.service
$ yum upgrade
1.3 Disable swap
Note: since Kubernetes 1.8, the kubelet will not start if swap is enabled.
$ swapoff -a
$ cp /etc/fstab /etc/fstab_bak
$ cat /etc/fstab_bak |grep -v swap > /etc/fstab
$ cat /etc/fstab
# /etc/fstab
# Created by anaconda on Tue Jul 21 11:51:16 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos_virtual--machine-root / xfs defaults 0 0
UUID=1694f89b-5c62-4a4a-9c86-46c3f202e4f6 /boot xfs defaults 0 0
/dev/mapper/centos_virtual--machine-home /home xfs defaults 0 0
#/dev/mapper/centos_virtual--machine-swap swap swap defaults 0 0
1.4 Adjust iptables-related kernel parameters
Some users on RHEL/CentOS 7 have reported traffic being routed incorrectly because iptables is bypassed. Create /etc/sysctl.d/k8s.conf with the following content:
$ cat <<EOF > /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
Apply the configuration:
$ modprobe br_netfilter
$ sysctl -p /etc/sysctl.d/k8s.conf
1.5 Load the IPVS modules
$ cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
The following command is a bit long:
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
1.6 Install Docker
# Remove old Docker containers and images
$ docker stop `docker ps -a -q`
$ docker rm `docker ps -a -q`
$ docker rmi -f `docker images -a -q` # this force-deletes all images
# Remove old Docker package metadata
$ yum -y remove docker docker-common container-selinux
# Add the repository for the latest stable Docker release
$ yum-config-manager \
    --add-repo \
    https://docs.docker.com/v1.13/engine/installation/linux/repo_files/centos/docker.repo
# Install Docker
# Refresh the yum metadata cache
$ yum makecache fast
# List the available Docker versions and pick the one you want
$ yum list docker-engine.x86_64 --showduplicates |sort -r
$ yum -y install docker-engine-<VERSION_STRING>
$ docker -v
# Start and enable Docker
$ systemctl start docker
$ systemctl enable docker
# Uninstall (only if you ever need to remove Docker)
$ yum -y remove docker-engine docker-engine-selinux
1.7 Create shared storage (NFS)
If you choose to use an NFS server, perform the following steps.
Note: the export line must be /data/k8s *(rw,sync,no_root_squash). Without no_root_squash, pods will fail to start with permission errors.
# Install the NFS packages
$ yum -y install nfs-utils rpcbind
# Create the NFS export path
$ mkdir -p /data/k8s/
# Set permissions on the path
$ chmod 755 /data/k8s/
# Configure the NFS export
$ vim /etc/exports
/data/k8s *(rw,sync,no_root_squash)
# Start the services
$ systemctl start rpcbind.service
$ systemctl enable rpcbind
$ systemctl status rpcbind
$ systemctl start nfs.service
$ systemctl enable nfs
$ systemctl status nfs
# On each node, check the export and mount it (see the sketch below)
$ showmount -e 192.168.1.109
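If the export shows up, a sketch of mounting it on each node follows (the local mount point /data/k8s is an assumption; pick whatever path you need):
$ mkdir -p /data/k8s
$ mount -t nfs 192.168.1.109:/data/k8s /data/k8s
$ df -h | grep /data/k8s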
2. Install Helm
2.1 Installation
Download the binary from the Helm releases page (v2.10.0 is used here), extract it, copy the helm executable to /usr/local/bin, and set its permissions to 755. That completes the Helm client installation on this machine.
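A sketch of those steps; the download URL follows Helm's release naming for v2.10.0 and is an assumption, so verify it against the releases page first:
$ wget https://get.helm.sh/helm-v2.10.0-linux-amd64.tar.gz   # URL assumed from Helm's release naming; verify before use
$ tar -zxvf helm-v2.10.0-linux-amd64.tar.gz
$ cp linux-amd64/helm /usr/local/bin/helm
$ chmod 755 /usr/local/bin/helm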
2.2 Verify
Check the version with the helm command; it will report that it cannot connect to the server-side component, Tiller:
$ helm version
Client: &version.Version{SemVer:"v2.10.0", GitCommit:"9ad53aac42165a5fadc6c87be0dea6b115f93090", GitTreeState:"clean"}
Error: could not find tiller
Note: installing Helm's server-side component (Tiller) requires the kubectl tool, so first make sure kubectl can access the Kubernetes cluster's apiserver.
Then run the initialization from the command line:
helm init
Helm pulls the Tiller image from gcr.io by default, so if the current machine cannot reach it, you can use the following command instead:
$ helm init --upgrade --tiller-image cnych/tiller:v2.10.0
$HELM_HOME has been configured at /root/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!
2.3 Tiller (Helm server) image address
Change the Tiller server image address:
$ kubectl edit deployment tiller-deploy -n kube-system
Replace as follows:
...
spec:
automountServiceAccountToken: true
containers:
- env:
- name: TILLER_NAMESPACE
value: kube-system
- name: TILLER_HISTORY_MAX
value: "0"
############################################# Tiller (server) image address #################################
image: registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /liveness
port: 44135
scheme: HTTP
initialDelaySeconds: 1
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: tiller
...
Image address: registry.cn-hangzhou.aliyuncs.com/hlc-k8s-gcr-io/tiller:v2.16.0
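Optionally, a quick check (a sketch; the app=helm label is the default set by helm init) that the new image has rolled out and the client can now reach Tiller:
$ kubectl -n kube-system rollout status deployment tiller-deploy
$ kubectl -n kube-system get pods -l app=helm
$ helm version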
3. Deploy Kubernetes with kubeadm
3.1 Install kubeadm and kubelet
Note: when running yum install, make a note of the Kubernetes version number; you will need it later for kubeadm init.
$ cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF
Note: double-check the version number here, because the version you pass to kubeadm init must not be lower than the installed Kubernetes version (it is shown during installation).
$ yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
Note: if you need a specific version, append it to the package names as below:
$ yum install kubelet-1.19.2 kubeadm-1.19.2 kubectl-1.19.2 --disableexcludes=kubernetes
3.2 Start kubelet
$ systemctl enable kubelet.service && systemctl start kubelet.service
After starting kubelet.service, checking its status shows that it has not actually started; the cause is that "/var/lib/kubelet/config.yaml" does not exist. This can be left alone for now, because kubeadm init creates the file.
On k8s-master, initialize Kubernetes with kubeadm init.
Note: the kubernetes-version below must match the version installed above, otherwise init will fail.
kubeadm init \
  --apiserver-advertise-address=192.168.8.158 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.21.2 \
  --pod-network-cidr=10.244.0.0/16
--apiserver-advertise-address # the k8s-master IP
--image-repository # the registry to pull control-plane images from
--kubernetes-version # disables version detection; the default value stable-1 would fetch the latest version number from https://storage.googleapis.com/kubernetes-release/release/stable-1.txt, so pinning the version skips that network request. Again, it must match the installed Kubernetes version.
The kubeadm init output is shown below. Note that the process automatically creates "/var/lib/kubelet/config.yaml" (worker nodes do not run kubeadm init, so copy this file manually to /var/lib/kubelet/config.yaml on each node).
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
...
certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
#====== These are the commands to run before using the cluster ------
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
#===== This is how to add nodes; if the token has expired, see the troubleshooting section ------
kubeadm join 10.211.55.6:6443 --token sfaff2.iet15233unw5jzql --discovery-token-ca-cert-hash sha256:f798c5be53416ca3b5c7475ee0a4199eb26f9e31ee7106699729c0660a70f8d7
After a successful init, the output reminds you of the configuration needed before using the cluster (the method is shown above) and prints a temporary token together with the command for adding nodes.
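The join token printed by kubeadm init expires (after 24 hours by default). A sketch of generating a fresh join command on the master later on:
# Print a new join command with a freshly created token
$ kubeadm token create --print-join-command
# List existing tokens and their expiry
$ kubeadm token list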
For a regular user to use the cluster, run the following:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
If you are root, you can simply run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Either of the two works; since I am using root here, I just run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Checking kubelet again now shows it in the running state, so it started successfully.
Check the component status and confirm every component is Healthy:
kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health": "true"}
Check the node status:
kubectl get node
NAME STATUS ROLES AGE VERSION
centos NotReady master 11m v1.19.2
After installing the cluster, you may well run into the following:
$ kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0 Healthy {"health":"true"}
This happens because kube-controller-manager.yaml and kube-scheduler.yaml set the default port to 0; commenting that line out fixes it. (Do this on every master node.)
1. Edit kube-scheduler.yaml
Comment out "- --port=0":
vim /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-scheduler
tier: control-plane
name: kube-scheduler
namespace: kube-system
spec:
containers:
- command:
- kube-scheduler
- --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
- --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
- --bind-address=127.0.0.1
- --kubeconfig=/etc/kubernetes/scheduler.conf
- --leader-elect=true
# - --port=0 ## comment out this line
image: k8s.gcr.io/kube-scheduler:v1.18.6
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10259
scheme: HTTPS
initialDelaySeconds: 15
timeoutSeconds: 15
name: kube-scheduler
resources:
requests:
cpu: 100m
volumeMounts:
- mountPath: /etc/kubernetes/scheduler.conf
name: kubeconfig
readOnly: true
hostNetwork: true
priorityClassName: system-cluster-critical
volumes:
- hostPath:
path: /etc/kubernetes/scheduler.conf
type: FileOrCreate
name: kubeconfig
status: {}
2. Edit kube-controller-manager.yaml
vim /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-controller-manager
tier: control-plane
name: kube-controller-manager
namespace: kube-system
spec:
containers:
- command:
- kube-controller-manager
- --allocate-node-cidrs=true
- --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- --bind-address=127.0.0.1
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --cluster-cidr=10.244.0.0/16
- --cluster-name=kubernetes
- --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
- --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
- --controllers=*,bootstrapsigner,tokencleaner
- --kubeconfig=/etc/kubernetes/controller-manager.conf
- --leader-elect=true
- --node-cidr-mask-size=24
# - --port=0 ## comment out this line
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
- --root-ca-file=/etc/kubernetes/pki/ca.crt
- --service-account-private-key-file=/etc/kubernetes/pki/sa.key
- --service-cluster-ip-range=10.96.0.0/12
- --use-service-account-credentials=true
image: k8s.gcr.io/kube-controller-manager:v1.18.6
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 127.0.0.1
path: /healthz
port: 10257
scheme: HTTPS
initialDelaySeconds: 15
timeoutSeconds: 15
name: kube-controller-manager
resources:
requests:
cpu: 200m
volumeMounts:
- mountPath: /etc/ssl/certs
name: ca-certs
readOnly: true
- mountPath: /etc/pki
name: etc-pki
readOnly: true
- mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
name: flexvolume-dir
- mountPath: /etc/kubernetes/pki
name: k8s-certs
readOnly: true
- mountPath: /etc/kubernetes/controller-manager.conf
name: kubeconfig
readOnly: true
hostNetwork: true
priorityClassName: system-cluster-critical
volumes:
- hostPath:
path: /etc/ssl/certs
type: DirectoryOrCreate
name: ca-certs
- hostPath:
path: /etc/pki
type: DirectoryOrCreate
name: etc-pki
- hostPath:
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
type: DirectoryOrCreate
name: flexvolume-dir
- hostPath:
path: /etc/kubernetes/pki
type: DirectoryOrCreate
name: k8s-certs
- hostPath:
path: /etc/kubernetes/controller-manager.conf
type: FileOrCreate
name: kubeconfig
status: {}
3. Restart kubelet on every master node
$ systemctl restart kubelet.service
4. Check the status again
$ kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
3.3 Install the Pod network (flannel)
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
If the URL is unreachable, you will have to find your own way to fetch it...
3.4 Install a StorageClass
$ git clone https://github.com/helm/charts.git
$ cd charts/
$ helm install stable/nfs-client-provisioner --set nfs.server=192.168.1.109 --set nfs.path=/data/k8s
Note: if the chart repository cannot be downloaded, find a mirror yourself...
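To check that dynamic provisioning works, a minimal PVC sketch follows; the storage class name nfs-client is the chart's default and is an assumption here (confirm with kubectl get storageclass):
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-nfs-pvc
spec:
  storageClassName: nfs-client   # assumed default class name created by the chart
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
$ kubectl get pvc test-nfs-pvc   # should reach the Bound state if provisioning works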
3.5 kubectl command completion
$ yum install -y bash-completion
$ source /usr/share/bash-completion/bash_completion
$ source <(kubectl completion bash)
$ echo "source <(kubectl completion bash)" >> ~/.bashrc
3.6 Problems encountered during installation
3.6.1 Cluster DNS (CoreDNS) image pull failure
Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.13-0
failed to pull image "registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0": output: Error response from daemon: pull access denied for registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
, error: exit status 1
Cause:
Installing Kubernetes v1.21.1 needs to pull images from k8s.gcr.io, which is blocked in mainland China, so the installation cannot complete normally.
The workaround below pulls the image from Docker Hub and retags it, bypassing access to k8s.gcr.io.
Workaround:
Pull the image manually:
$ docker pull coredns/coredns
List the images kubeadm needs, then rename accordingly:
$ kubeadm config images list --config new.yaml
List the local images:
$ docker images
Retag the image to the expected name:
$ docker tag coredns/coredns:latest registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0
Remove the now-redundant image:
$ docker rmi coredns/coredns:latest
3.6.2 kubelet fails to start
Check the kubelet status:
systemctl status kubelet.service
The output is as follows:
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset:disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since 日 2019-03-31 16:18:55 CST;7s ago
Docs: https://kubernetes.io/docs/
Process: 4564 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 4564 (code=exited, status=255)
3月 31 16:18:55 k8s-node systemd[1]: Unit kubelet.service entered failed state.
3月 31 16:18:55 k8s-node systemd[1]: kubelet.service failed.
Check the error details:
journalctl -xefu kubelet
3月 31 16:19:46 k8s-node systemd[1]: kubelet.service holdoff time over,scheduling restart.
3月 31 16:19:46 k8s-node systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit kubelet.service has finished shutting down.
3月 31 16:19:46 k8s-node systemd[1]: Started kubelet: The Kubernetes Node Agent
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit kubelet.service has finished starting up.
-- The start-up result is done.
###### Note the following error lines:
3月 31 16:19:46 k8s-node kubelet[4611]: F0331 16:19:46.989588 4611 server.go:193] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
3月 31 16:19:46 k8s-node systemd[1]: kubelet.service: main process exited,code=exited,status=255/n/a
#######
3月 31 16:19:46 k8s-node systemd[1]: Unit kubelet.service entered failed state.
3月 31 16:19:46 k8s-node systemd[1]: kubelet.service failed.
The error says /var/lib/kubelet/config.yaml does not exist; run the initialization from section 3.2 (kubeadm init).
III. Install Rancher
Note: on cloud servers, use NodePort access; after the svc is created, change its type to NodePort (see the sketch at the end of this section).
1. Create the namespace
$ kubectl create namespace cattle-system
2. Install cert-manager
# Install the CustomResourceDefinition resources
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.0/cert-manager.crds.yaml
# **Important:** if you are running Kubernetes v1.15 or lower,
# you need to add the --validate=false flag to the kubectl apply command above;
# otherwise you will get validation errors about the x-kubernetes-preserve-unknown-fields
# field in cert-manager's CustomResourceDefinition resources.
# This is a benign error caused by the way kubectl performs resource validation.
# Create the namespace for cert-manager
kubectl create namespace cert-manager
# Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io
# Update the local Helm chart repository cache
helm repo update
# Install the cert-manager Helm chart
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.15.0
3. Install Rancher
# Add the Rancher stable chart repository
$ helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
# Install the latest Rancher
$ helm install rancher rancher-stable/rancher --namespace cattle-system
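As noted at the start of this chapter, on cloud servers you can switch the rancher service to NodePort once it exists. A sketch follows; the service name rancher in namespace cattle-system is assumed to be the chart's default, so confirm it with the first command:
$ kubectl -n cattle-system get svc
$ kubectl -n cattle-system patch svc rancher -p '{"spec":{"type":"NodePort"}}'
$ kubectl -n cattle-system get svc rancher   # note the assigned NodePort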