5. Deploy PCS
5.1 Install pcs + pacemaker + corosync (controller1, controller2, and controller3)
Install pcs, pacemaker, and corosync on all controller nodes: pacemaker is the cluster resource manager, and corosync provides the cluster messaging/heartbeat layer.
[root@controller1:/root]# yum install -y lvm2 cifs-utils quota psmisc pcs pacemaker corosync fence-agents-all resource-agents crmsh
[root@controller2:/root]# yum install -y lvm2 cifs-utils quota psmisc pcs pacemaker corosync fence-agents-all resource-agents crmsh
[root@controller3:/root]# yum install -y lvm2 cifs-utils quota psmisc pcs pacemaker corosync fence-agents-all resource-agents crmsh
[root@controller1:/root]# systemctl enable pcsd corosync
[root@controller2:/root]# systemctl enable pcsd corosync
[root@controller3:/root]# systemctl enable pcsd corosync
[root@controller1:/root]# systemctl start pcsd && systemctl status pcsd
[root@controller2:/root]# systemctl start pcsd && systemctl status pcsd
[root@controller3:/root]# systemctl start pcsd && systemctl status pcsd
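As an optional sanity check, pcsd listens on TCP port 2224 by default, so you can confirm the daemon is up and both services are enabled on each node (shown for controller1; repeat on the others):
[root@controller1:/root]# ss -tnlp | grep 2224          # pcsd should be listening on port 2224
[root@controller1:/root]# systemctl is-enabled pcsd corosync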
5.2 Set the cluster password; it must be identical on all three nodes: pcs123456
[root@controller1:/root]# echo "pcs123456" | passwd --stdin hacluster
[root@controller2:/root]# echo "pcs123456" | passwd --stdin hacluster
[root@controller3:/root]# echo "pcs123456" | passwd --stdin hacluster
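The hacluster account is created automatically when the pcs package is installed; if passwd complains that the user does not exist, re-check step 5.1. A quick way to confirm the account and its password status:
[root@controller1:/root]# id hacluster
[root@controller1:/root]# passwd -S hacluster           # should show that a password is set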
5.3 Create the corosync.conf configuration file on the control nodes
[root@controller2:/root]# cat <<EOF >/etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: openstack-cluster
    transport: udpu
}
nodelist {
    node {
        ring0_addr: controller1
        nodeid: 1
    }
    node {
        ring0_addr: controller2
        nodeid: 2
    }
    node {
        ring0_addr: controller3
        nodeid: 3
    }
}
logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: off
}
quorum {
    provider: corosync_votequorum
}
EOF
[root@controller2:/root]# scp /etc/corosync/corosync.conf controller1:/etc/corosync/
[root@controller2:/root]# scp /etc/corosync/corosync.conf controller3:/etc/corosync/
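To make sure the same file really reached all three nodes, a simple checksum comparison is enough (assuming SSH access to the other controllers, see 5.4); all three sums should match:
[root@controller2:/root]# md5sum /etc/corosync/corosync.conf
[root@controller2:/root]# ssh controller1 md5sum /etc/corosync/corosync.conf
[root@controller2:/root]# ssh controller3 md5sum /etc/corosync/corosync.conf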
5.4 Configure the cluster: set up mutual SSH authentication between the nodes
ssh-keygen
ssh-copy-id controller1
ssh-copy-id controller2
ssh-copy-id controller3
5.5 Configure node authentication
[root@controller2:/root]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p "pcs123456"
controller2: Authorized
controller3: Authorized
controller1: Authorized
The general form is pcs cluster auth <node1> <node2> ... -u hacluster -p {password}, where {password} is the hacluster password set in 5.2.
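Note: the pcs cluster auth syntax above is for pcs 0.9.x as shipped with CentOS/RHEL 7, which this guide assumes. On pcs 0.10 and later the equivalent command is pcs host auth, roughly:
[root@controller2:/root]# pcs host auth controller1 controller2 controller3 -u hacluster -p "pcs123456"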
5.6 Create the cluster
[root@controller2:/root]# pcs cluster setup --force --name openstack-cluster controller1 controller2 controller3
Destroying cluster on nodes: controller1, controller2, controller3...
controller2: Stopping Cluster (pacemaker)...
controller3: Stopping Cluster (pacemaker)...
controller1: Stopping Cluster (pacemaker)...
controller1: Successfully destroyed cluster
controller2: Successfully destroyed cluster
controller3: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'controller1', 'controller2', 'controller3'
controller1: successful distribution of the file 'pacemaker_remote authkey'
controller3: successful distribution of the file 'pacemaker_remote authkey'
controller2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
controller1: Succeeded
controller2: Succeeded
controller3: Succeeded
Synchronizing pcsd certificates on nodes controller1, controller2, controller3...
controller2: Success
controller3: Success
controller1: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller2: Success
controller3: Success
controller1: Success
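pcs cluster setup regenerates /etc/corosync/corosync.conf and pushes it to every node, so it replaces the file created manually in 5.3. The configuration actually in use can be inspected with:
[root@controller2:/root]# cat /etc/corosync/corosync.conf
[root@controller2:/root]# pcs cluster corosync          # prints the corosync config known to pcs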
5.7 Start the cluster and check its status
[root@controller2:/root]# pcs cluster enable --all
controller1: Cluster Enabled
controller2: Cluster Enabled
controller3: Cluster Enabled
[root@controller2:/root]# pcs cluster start --all
controller1: Starting Cluster (corosync)...
controller2: Starting Cluster (corosync)...
controller3: Starting Cluster (corosync)...
controller1: Starting Cluster (pacemaker)...
controller3: Starting Cluster (pacemaker)...
controller2: Starting Cluster (pacemaker)...
[root@controller2:/root]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: controller3 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
 Last updated: Wed Aug  5 15:21:16 2020
 Last change: Wed Aug  5 15:20:59 2020 by hacluster via crmd on controller3
 3 nodes configured
 0 resources configured

PCSD Status:
  controller2: Online
  controller3: Online
  controller1: Online
[root@controller2:/root]# ps aux | grep pacemaker
root      15586  0.0  0.0 132972  8700 ?   Ss   15:20   0:00 /usr/sbin/pacemakerd -f
haclust+  15587  0.1  0.0 136244 14620 ?   Ss   15:20   0:00 /usr/libexec/pacemaker/cib
root      15588  0.0  0.0 136064  7664 ?   Ss   15:20   0:00 /usr/libexec/pacemaker/stonithd
root      15589  0.0  0.0  98836  4372 ?   Ss   15:20   0:00 /usr/libexec/pacemaker/lrmd
haclust+  15590  0.0  0.0 128068  6620 ?   Ss   15:20   0:00 /usr/libexec/pacemaker/attrd
haclust+  15591  0.0  0.0  80508  3500 ?   Ss   15:20   0:00 /usr/libexec/pacemaker/pengine
haclust+  15592  0.0  0.0 140380  8260 ?   Ss   15:20   0:00 /usr/libexec/pacemaker/crmd
root      15632  0.0  0.0 112712   960 pts/0  S+  15:21   0:00 grep --color=auto pacemaker
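Corosync membership can also be checked directly, independently of pcs; all three node IDs should appear in the member list and each ring should report "no faults":
[root@controller2:/root]# corosync-cmapctl | grep members
[root@controller2:/root]# corosync-cfgtool -s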
5.8 Configure the cluster
All three nodes are online.
The default quorum rules recommend an odd number of cluster nodes, and no fewer than 3. In a 2-node cluster, if one node fails, quorum is lost under the default rules: resources do not fail over and the cluster as a whole remains unavailable. Setting no-quorum-policy="ignore" works around this two-node problem, but it should not be used in production. In other words, a production cluster still needs at least 3 nodes.
pe-warn-series-max, pe-input-series-max, and pe-error-series-max control how many Policy Engine warning, input, and error files are kept (the log history depth).
cluster-recheck-interval is the interval at which the cluster re-checks its state.
[root@controller1 ~]# pcs property set pe-warn-series-max=1000 pe-input-series-max=1000 pe-error-series-max=1000 cluster-recheck-interval=5min
Disable STONITH:
STONITH refers to a fencing device that can power off a node on command. This environment has no such device, and if the option is not disabled, pcs commands will keep reporting errors about it.
[root@controller1 ~]# pcs property set stonith-enabled=false
When the cluster has only two nodes, ignore quorum:
[root@controller1 ~]# pcs property set no-quorum-policy=ignore
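The properties set above (and any other non-default cluster properties) can be reviewed at any time with:
[root@controller1 ~]# pcs property list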
Verify the cluster configuration (the command should report no errors):
[root@controller1 ~]# crm_verify -L -V
Configure a virtual IP for the cluster:
[root@controller1 ~]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip="192.168.110.120" cidr_netmask=32 nic=ens160 op monitor interval=30s
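To confirm the resource is running and the address has actually been added to the interface (ens160 and 192.168.110.120 are the values used above), you can check:
[root@controller1 ~]# pcs resource show VirtualIP
[root@controller1 ~]# ip addr show ens160 | grep 192.168.110.120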
At this point, Pacemaker + Corosync are there to serve HAProxy; add a haproxy resource to the Pacemaker cluster:
[root@controller1 ~]# pcs resource create lb-haproxy systemd:haproxy --clone
Note: this creates a clone resource; a cloned resource is started on every node, so haproxy is started automatically on all three nodes here.
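This assumes haproxy itself has already been installed and configured on all three controllers in an earlier step. Because Pacemaker now manages the service, haproxy does not need to be enabled in systemd; a quick check on any node:
[root@controller1 ~]# systemctl is-active haproxy       # should report "active" once the clone has started
[root@controller1 ~]# systemctl is-enabled haproxy      # "disabled" is fine here, Pacemaker starts the service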
Check the Pacemaker resources:
[root@controller1 ~]# pcs resource
 VirtualIP      (ocf::heartbeat:IPaddr2):       Started controller1    # the VIP (heartbeat) resource
 Clone Set: lb-haproxy-clone [lb-haproxy]                               # the haproxy clone resource
     Started: [ controller1 controller2 controller3 ]
Note: the resource binding below is required; otherwise every node runs haproxy independently of the VIP, which leads to confused access.
Bind the two resources to the same node:
[root@controller1 ~]# pcs constraint colocation add lb-haproxy-clone VirtualIP INFINITY
The binding succeeded:
[root@controller1 ~]# pcs resource
 VirtualIP      (ocf::heartbeat:IPaddr2):       Started controller3
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
Configure the resource start order: the VIP must start first and haproxy after it, because haproxy listens on the VIP.
[root@controller1 ~]# pcs constraint order VirtualIP then lb-haproxy-clone
Alternatively, instead of the clone, haproxy can be managed as a single (non-clone) resource colocated with and ordered after the VIP:
pcs resource create haproxy systemd:haproxy op monitor interval="5s"
pcs constraint colocation add VirtualIP haproxy INFINITY
pcs constraint order VirtualIP then haproxy
pcs resource restart haproxy
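Whichever variant is used, the ordering, colocation, and (later) location constraints can be reviewed at any time:
[root@controller1 ~]# pcs constraint                    # add --full to show constraint IDs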
Manually pin the resource to a preferred node; because the two resources are bound together, moving one automatically moves the other.
[root@controller1 ~]# pcs constraint location VirtualIP prefers controller1
[root@controller1 ~]# pcs resource
 VirtualIP      (ocf::heartbeat:IPaddr2):       Started controller1
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
[root@controller1 ~]# pcs resource defaults resource-stickiness=100    # set resource stickiness so resources do not fail back automatically and destabilize the cluster
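The current resource defaults, including the stickiness value just set, can be listed with:
[root@controller1 ~]# pcs resource defaults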
The VIP is now bound to controller1:
[root@controller1 ~]# ip a | grep global
    inet 192.168.110.121/24 brd 192.168.0.255 scope global ens160
    inet 192.168.110.120/32 brd 192.168.0.255 scope global ens160
    inet 192.168.114.121/24 brd 192.168.114.255 scope global ens192
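As a final check, the VIP should answer on the network and haproxy should be listening on it. The exact listening ports depend on the frontends defined in your haproxy.cfg, so the grep below is only illustrative:
[root@controller1 ~]# ping -c 3 192.168.110.120
[root@controller1 ~]# ss -tnlp | grep haproxy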