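Summary: etcd is the most critical component of a k8s cluster. It stores all of the cluster's state, so if etcd goes down, the cluster goes down. Here etcd is deployed on the three master nodes for high availability. The etcd cluster uses the Raft algorithm to elect a leader. Because Raft needs votes from a majority of nodes to reach a decision, an etcd cluster is normally deployed with an odd number of nodes; 3, 5 or 7 nodes are the recommended sizes.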
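The majority rule behind the odd-number recommendation can be checked with a little arithmetic. The following is a minimal sketch; the function names quorum and tolerated_failures are illustrative and not part of etcd.

```shell
# Quorum (majority) size for a cluster of $1 members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still available.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5 6 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A 4-node cluster tolerates one failure, the same as a 3-node cluster, and 6 nodes tolerate two, the same as 5. An even member count therefore adds a machine without adding fault tolerance.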
1) Download the etcd binaries
etcd is distributed as prebuilt binaries: download the archive, extract it, and copy the binaries to the target directory.
[root@k8s-master01 ~]# cd k8s/
[root@k8s-master01 k8s]# wget https://github.com/etcd-io/etcd/releases/download/v3.3.12/etcd-v3.3.12-linux-amd64.tar.gz
[root@k8s-master01 k8s]# tar -xf etcd-v3.3.12-linux-amd64.tar.gz
[root@k8s-master01 k8s]# cd etcd-v3.3.12-linux-amd64
## Two binaries are needed: etcd and etcdctl (etcdctl is the command for operating etcd)
## Copy the etcd binaries to the three master nodes
[root@k8s-master01 ~]# ansible k8s-master -m copy -a 'src=/root/k8s/etcd-v3.3.12-linux-amd64/etcd dest=/usr/local/bin/ mode=0755'
[root@k8s-master01 ~]# ansible k8s-master -m copy -a 'src=/root/k8s/etcd-v3.3.12-linux-amd64/etcdctl dest=/usr/local/bin/ mode=0755'
Note: if you are not using ansible, you can copy the two files straight into /usr/local/bin/ on the three master nodes with scp.
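For readers taking the scp route, the copy can be wrapped in a small loop. This is only a sketch: the MASTERS list reuses the three node IPs that appear later in this article, root SSH access to the nodes is an assumption, and copy_etcd_binaries is an illustrative name. Passing "echo" as the argument prints the commands instead of running them.

```shell
# Node IPs are the three etcd/master addresses used in this article;
# root SSH access to them is an assumption.
MASTERS="10.10.0.18 10.10.0.19 10.10.0.20"
SRC=/root/k8s/etcd-v3.3.12-linux-amd64

# $1 is an optional command prefix: pass "echo" to print the commands
# instead of executing them (dry run).
copy_etcd_binaries() {
  run=${1:-}
  for host in $MASTERS; do
    $run scp "$SRC/etcd" "$SRC/etcdctl" "root@${host}:/usr/local/bin/" || return 1
    # Mirror the mode=0755 that the ansible copy sets explicitly.
    $run ssh "root@${host}" chmod 0755 /usr/local/bin/etcd /usr/local/bin/etcdctl || return 1
  done
}

# Dry run: prints two commands per node without touching any host.
copy_etcd_binaries echo
```

Calling copy_etcd_binaries without an argument performs the actual copy.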
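2) Create the etcd certificate signing request template
[root@k8s-master01 ~]# vim /opt/k8s/certs/etcd-csr.json    ## certificate signing request file
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "10.10.0.18",
    "10.10.0.19",
    "10.10.0.20"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "ShangHai",
      "L": "ShangHai",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
Note: the IPs in hosts are the IPs of the etcd nodes plus the local 127.0.0.1 address. The etcd certificate must include the IP of every node. In production it is best to reserve a few extra IPs in the hosts list, so that the certificate does not have to be regenerated when nodes are added later or a node has to be migrated after a failure. (My production environment uses an Alibaba Cloud VPC network, so I reserve IPs from a designated range.)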
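When extra IPs are reserved, typing the hosts list by hand becomes error-prone. The sketch below generates the same request file from a list of IPs. gen_etcd_csr is an illustrative helper, and the two reserved addresses 10.10.0.21 and 10.10.0.22 are hypothetical examples, not values from this deployment.

```shell
# Print an etcd CSR JSON whose "hosts" list holds 127.0.0.1
# followed by every IP passed as an argument.
gen_etcd_csr() {
  hosts='    "127.0.0.1"'
  for ip in "$@"; do
    hosts="${hosts},
    \"${ip}\""
  done
  cat <<EOF
{
  "CN": "etcd",
  "hosts": [
${hosts}
  ],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "ShangHai", "L": "ShangHai", "O": "k8s", "OU": "System" }
  ]
}
EOF
}

# Three current nodes plus two reserved addresses (the last two are hypothetical).
gen_etcd_csr 10.10.0.18 10.10.0.19 10.10.0.20 10.10.0.21 10.10.0.22
```

Redirecting the output to /opt/k8s/certs/etcd-csr.json would produce a file usable in the next step.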
3) Generate the certificate and private key
Pay attention to the actual paths of the certificates used in the command.
[root@k8s-master01 ~]# cd /opt/k8s/certs/
[root@k8s-master01 certs]# cfssl gencert -ca=/opt/k8s/certs/ca.pem \
    -ca-key=/opt/k8s/certs/ca-key.pem \
    -config=/opt/k8s/certs/ca-config.json \
    -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
2019/04/22 17:17:51 [INFO] generate received request
2019/04/22 17:17:51 [INFO] received CSR
2019/04/22 17:17:51 [INFO] generating key: rsa-2048
2019/04/22 17:17:51 [INFO] encoded CSR
2019/04/22 17:17:51 [INFO] signed certificate with serial number 335217685822754469090490767964903486042452749906
2019/04/22 17:17:51 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
4) Inspect the certificate files
etcd.csr is an intermediate file used during signing. If you do not plan to sign the certificate yourself and want a third-party CA to sign it instead, you only need to submit etcd.csr to that CA.
[root@k8s-master01 certs]# ll etcd*
-rw-r--r--. 1 root root 1066 Apr 22 17:17 etcd.csr
-rw-r--r--. 1 root root  293 Apr 22 17:10 etcd-csr.json
-rw-------. 1 root root 1679 Apr 22 17:17 etcd-key.pem
-rw-r--r--. 1 root root 1444 Apr 22 17:17 etcd.pem
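Besides listing the files, it can be worth confirming that every node IP from the request actually ended up in the signed certificate. The helper below is a sketch that reads the text dump produced by openssl x509 -noout -text on standard input; san_has_ip is an illustrative name, and the availability of openssl on the host is an assumption.

```shell
# Succeed if the certificate text on stdin lists IP $1
# in its Subject Alternative Name extension.
san_has_ip() {
  # -A1 keeps the line after the extension header, which holds the SAN entries;
  # -w and -F make 10.10.0.1 not match 10.10.0.18.
  grep -A1 'X509v3 Subject Alternative Name' | grep -qwF "IP Address:$1"
}

# Usage on the master (assumes openssl is installed):
#   for ip in 127.0.0.1 10.10.0.18 10.10.0.19 10.10.0.20; do
#     openssl x509 -in /opt/k8s/certs/etcd.pem -noout -text | san_has_ip "$ip" \
#       || echo "missing SAN: $ip"
#   done
```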
5) Distribute the certificates
Copy the generated etcd certificates into the certificate directory created earlier, including on the other two etcd nodes. Normally only three files are needed: ca.pem (already in place), etcd-key.pem and etcd.pem.
[root@k8s-master01 certs]# ansible k8s-master -m copy -a 'src=/opt/k8s/certs/etcd.pem dest=/etc/kubernetes/ssl/'
[root@k8s-master01 certs]# ansible k8s-master -m copy -a 'src=/opt/k8s/certs/etcd-key.pem dest=/etc/kubernetes/ssl/'
6) Adjust the etcd runtime settings
For security, I run etcd under a dedicated user here.
## Create the etcd user and group
[root@k8s-master01 ~]# ansible k8s-master -m group -a 'name=etcd'
[root@k8s-master01 ~]# ansible k8s-master -m user -a 'name=etcd group=etcd comment="etcd user" shell=/sbin/nologin home=/var/lib/etcd createhome=no'
## Create the etcd data directory and set its ownership
[root@k8s-master01 ~]# ansible k8s-master -m file -a 'path=/var/lib/etcd state=directory owner=etcd group=etcd'
Note: if the steps above seem cumbersome, you can run the following commands directly on each of the three master hosts instead:
mkdir /etc/kubernetes/config
groupadd -r etcd
useradd -r -g etcd -d /var/lib/etcd -s /sbin/nologin -c "etcd user" etcd
mkdir /var/lib/etcd/
chown -R etcd:etcd /var/lib/etcd/
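7) Write the etcd configuration file
This is the etcd.conf configuration file. It references the certificates, so the etcd user must be able to read them; otherwise etcd reports that it cannot load the certificates. Mode 644 is sufficient.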
[root@k8s-master01 ~]# vim /opt/k8s/cfg/etcd.conf
#[member]
ETCD_NAME="etcd01"
ETCD_DATA_DIR="/var/lib/etcd"
#ETCD_SNAPSHOT_COUNTER="10000"
#ETCD_HEARTBEAT_INTERVAL="100"
#ETCD_ELECTION_TIMEOUT="1000"
ETCD_LISTEN_PEER_URLS="https://10.10.0.18:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.10.0.18:2379,https://127.0.0.1:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
#ETCD_CORS=""
ETCD_AUTO_COMPACTION_RETENTION="1"
ETCD_QUOTA_BACKEND_BYTES="8589934592"
ETCD_MAX_REQUEST_BYTES="5242880"
#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.0.18:2380"
# if you use different ETCD_NAME (e.g. test),
# set ETCD_INITIAL_CLUSTER value for this name, i.e. "test=http://..."
ETCD_INITIAL_CLUSTER="etcd01=https://10.10.0.18:2380,etcd02=https://10.10.0.19:2380,etcd03=https://10.10.0.20:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="k8s-etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.0.18:2379"
#[security]
# etcd only reads environment variables that carry the ETCD_ prefix.
# ETCD_CA_FILE and ETCD_PEER_CA_FILE are deprecated in etcd 3.3; the TRUSTED_CA_FILE
# variables together with the CLIENT_CERT_AUTH switches are their replacements.
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/kubernetes/ssl/ca.pem"
ETCD_CERT_FILE="/etc/kubernetes/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/kubernetes/ssl/etcd-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/etc/kubernetes/ssl/ca.pem"
ETCD_PEER_CERT_FILE="/etc/kubernetes/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/kubernetes/ssl/etcd-key.pem"
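The ETCD_INITIAL_CLUSTER value in this file is the one setting that must be identical on every node, so it is convenient to build it once from the member list. A minimal sketch follows; build_initial_cluster is an illustrative helper, and the peer port 2380 is taken from the configuration above.

```shell
# Build an ETCD_INITIAL_CLUSTER value from "name=ip" arguments.
build_initial_cluster() {
  result=""
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    ip=${pair#*=}      # text after the first "="
    # Prepend a comma separator only when result is already non-empty.
    result="${result:+${result},}${name}=https://${ip}:2380"
  done
  echo "$result"
}

build_initial_cluster etcd01=10.10.0.18 etcd02=10.10.0.19 etcd03=10.10.0.20
# prints etcd01=https://10.10.0.18:2380,etcd02=https://10.10.0.19:2380,etcd03=https://10.10.0.20:2380
```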
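Parameter explanations:
- ETCD_NAME: the member name of this etcd node. It must be unique within an etcd cluster; the hostname or machine-id can be used.
- ETCD_LISTEN_PEER_URLS: the address this node listens on for traffic from the other members. It differs on every node and must be an IP address; a domain name does not work.
- ETCD_LISTEN_CLIENT_URLS: the address this node listens on for client traffic, usually the local node's own address. A domain name does not work.
- ETCD_INITIAL_ADVERTISE_PEER_URLS: the peer address of this node that is advertised to the other members of the cluster.
- ETCD_INITIAL_CLUSTER: all members of the cluster, as a comma-separated list of entries in the form ETCD_NAME=ETCD_INITIAL_ADVERTISE_PEER_URLS, for example etcd01=https://10.10.0.18:2380.
- ETCD_ADVERTISE_CLIENT_URLS: the list of client URLs of this member, which advertises the node's client address to the outside; a domain name may be used here.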
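- ETCD_AUTO_COMPACTION_RETENTION: the auto-compaction retention of the mvcc key-value store, in hours. The value 1 used here keeps one hour of history; 0 disables auto-compaction.
- ETCD_QUOTA_BACKEND_BYTES: the size limit of the etcd backend database. The default is 2G; 8G is recommended.
- ETCD_MAX_REQUEST_BYTES: the maximum size, in bytes, of a client request that the server accepts (this is a size limit, not the number of operations allowed in a transaction). The default is 1.5M and the official recommendation is 10M; I set 5M here. Choose a value that suits your own workload.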
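The two size settings are written in bytes, which makes the values in etcd.conf hard to read at a glance. The sketch below shows where 8589934592 and 5242880 come from; gib_to_bytes and mib_to_bytes are illustrative helper names.

```shell
# Convert binary size units to the byte counts etcd expects.
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }

echo "ETCD_QUOTA_BACKEND_BYTES=\"$(gib_to_bytes 8)\""   # 8G -> 8589934592
echo "ETCD_MAX_REQUEST_BYTES=\"$(mib_to_bytes 5)\""     # 5M -> 5242880
```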
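Because this is a three-node etcd cluster, etcd.conf must be copied to the other two nodes, and the node-specific parameters explained above must be changed to match each host: ETCD_NAME and the URLs that carry the node's own IP (ETCD_LISTEN_PEER_URLS, ETCD_LISTEN_CLIENT_URLS, ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_ADVERTISE_CLIENT_URLS).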
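Distribute etcd.conf. If you are not using ansible, you can of course copy the file to the same location on the three machines with scp, then edit the IP, ETCD_NAME and other parameters on each machine.
[root@k8s-master01 config]# ansible k8s-master -m copy -a 'src=/opt/k8s/cfg/etcd.conf dest=/etc/kubernetes/config/etcd.conf'
## Log in to each host and edit the configuration file, changing the IPs to that host's own IP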
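The per-host edits can also be scripted rather than done by hand. The sketch below assumes the template was written for etcd01 at 10.10.0.18, as in this article; render_etcd_conf is an illustrative helper. It rewrites ETCD_NAME and the node's own IP while leaving the shared ETCD_INITIAL_CLUSTER line untouched.

```shell
# Read an etcd.conf written for etcd01 / 10.10.0.18 on stdin and print
# a copy for member name $1 with node IP $2.
render_etcd_conf() {
  sed -e "s/^ETCD_NAME=.*/ETCD_NAME=\"$1\"/" \
      -e '/^ETCD_INITIAL_CLUSTER=/b' \
      -e "s/10\.10\.0\.18/$2/g"
  # The "b" command skips the IP substitution on the ETCD_INITIAL_CLUSTER
  # line, which must stay identical on all members.
}

# Small demonstration on three representative lines.
printf '%s\n' \
  'ETCD_NAME="etcd01"' \
  'ETCD_LISTEN_PEER_URLS="https://10.10.0.18:2380"' \
  'ETCD_INITIAL_CLUSTER="etcd01=https://10.10.0.18:2380,etcd02=https://10.10.0.19:2380"' \
  | render_etcd_conf etcd02 10.10.0.19
```

On the master this could be used as render_etcd_conf etcd02 10.10.0.19 < /opt/k8s/cfg/etcd.conf, with the output saved to a per-node file before copying it over.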
Edit the etcd.service unit file
[root@k8s-master01 ~]# vim /opt/k8s/unit/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/kubernetes/config/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/local/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
[root@k8s-master01 ~]# ansible k8s-master -m copy -a 'src=/opt/k8s/unit/etcd.service dest=/usr/lib/systemd/system/etcd.service'
[root@k8s-master01 ~]# ansible k8s-master -m shell -a 'systemctl daemon-reload'
[root@k8s-master01 ~]# ansible k8s-master -m shell -a 'systemctl enable etcd'
[root@k8s-master01 ~]# ansible k8s-master -m shell -a 'systemctl start etcd'
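Note:
The etcd services on the three machines need to be started at about the same time, so run the start command on all three together. If you start only one node, the service hangs there until enough of the other members are up for the cluster to elect a leader. I did not hit this problem because I ran the command remotely through ansible, which starts all three together.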
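When the nodes are started by hand, a small retry loop can tell you when the cluster has come up. This is a generic sketch, not something from the original setup: wait_until is an illustrative helper, and using it with etcdctl assumes that cluster-health exits with a non-zero status while the cluster is not yet healthy.

```shell
# Run a command once per second until it succeeds or $1 attempts are used up.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
wait_until() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}

# Example (run on a master after starting etcd on all nodes):
#   wait_until 60 etcdctl --endpoints=https://10.10.0.18:2379 \
#     --cert-file=/etc/kubernetes/ssl/etcd.pem \
#     --ca-file=/etc/kubernetes/ssl/ca.pem \
#     --key-file=/etc/kubernetes/ssl/etcd-key.pem cluster-health \
#     && echo "etcd cluster is up"
```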
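8) Verify the cluster
With etcd 3.x, the certificate locations must be specified when checking the cluster status. The --cert-file, --ca-file and --key-file flags and the cluster-health command used below belong to the etcdctl v2 API, which is the default in etcd 3.3; with ETCDCTL_API=3 the corresponding flags are --cert, --cacert and --key, and the health check is endpoint health.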
[root@k8s-master01 ~]# etcdctl --endpoints=https://10.10.0.18:2379,https://10.10.0.19:2379,https://10.10.0.20:2379 \
    --cert-file=/etc/kubernetes/ssl/etcd.pem \
    --ca-file=/etc/kubernetes/ssl/ca.pem \
    --key-file=/etc/kubernetes/ssl/etcd-key.pem cluster-health
member 804ed05b4beec304 is healthy: got healthy result from https://10.10.0.20:2379
member 8a5b84381bee52dd is healthy: got healthy result from https://10.10.0.19:2379
member caba783185460428 is healthy: got healthy result from https://10.10.0.18:2379
cluster is healthy
[root@k8s-master01 ~]# etcdctl --endpoints=https://10.10.0.18:2379,https://10.10.0.19:2379,https://10.10.0.20:2379 \
    --cert-file=/etc/kubernetes/ssl/etcd.pem \
    --ca-file=/etc/kubernetes/ssl/ca.pem \
    --key-file=/etc/kubernetes/ssl/etcd-key.pem member list
804ed05b4beec304: name=etcd03 peerURLs=https://10.10.0.20:2380 clientURLs=https://10.10.0.20:2379 isLeader=false
8a5b84381bee52dd: name=etcd02 peerURLs=https://10.10.0.19:2380 clientURLs=https://10.10.0.19:2379 isLeader=true
caba783185460428: name=etcd01 peerURLs=https://10.10.0.18:2380 clientURLs=https://10.10.0.18:2379 isLeader=false
## The cluster reports healthy, and the node with isLeader=true is the current leader
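For scripts that need to know which member is currently the leader, the member list output shown above can be parsed directly. This is a sketch against the v2 output format shown in this article; leader_name is an illustrative helper.

```shell
# Read `etcdctl member list` (v2 API format) on stdin and print
# the name of the member marked isLeader=true.
leader_name() {
  awk '/isLeader=true/ {
    for (i = 1; i <= NF; i++)
      if ($i ~ /^name=/) { sub(/^name=/, "", $i); print $i }
  }'
}

# Demonstration with the output captured above.
leader_name <<'EOF'
804ed05b4beec304: name=etcd03 peerURLs=https://10.10.0.20:2380 clientURLs=https://10.10.0.20:2379 isLeader=false
8a5b84381bee52dd: name=etcd02 peerURLs=https://10.10.0.19:2380 clientURLs=https://10.10.0.19:2379 isLeader=true
caba783185460428: name=etcd01 peerURLs=https://10.10.0.18:2380 clientURLs=https://10.10.0.18:2379 isLeader=false
EOF
# prints etcd02
```

In practice the here-document would be replaced by piping the real etcdctl ... member list command into leader_name.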