Purpose of NFS High Availability
Deploy a dual-node hot-standby NFS environment to serve as remote storage for a K8S container cluster, providing persistent storage for K8S data.
NFS High Availability Approach
NFS + Keepalived for high availability, eliminating the single point of failure.
Rsync + Inotify to keep the shared data synchronized between the master and slave nodes.
Environment Preparation
Technical requirements
- The two NFS node machines should have identical configurations.
- Keepalived monitors the NFS process; if the NFS process on the master dies and cannot be restarted, the slave's NFS takes over and continues serving.
- K8S data is backed up to the slave, and the master and slave directories are kept in real-time sync with rsync + inotify to preserve data integrity.
- In production, it is best to put the NFS shared directory on its own disk or a dedicated partition.
Disable the firewall and SELinux on both nodes
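The last point, a dedicated disk or partition for the share, usually ends up as an /etc/fstab entry. A minimal sketch that only builds and inspects such a line; the device name /dev/sdb1 and the xfs filesystem are hypothetical assumptions, not from the original deployment:

```shell
# Build the fstab line a dedicated data partition would use.
# /dev/sdb1 and xfs are hypothetical assumptions; nothing is written
# to /etc/fstab here.
fstab_line='/dev/sdb1 /data/k8s_storage xfs defaults 0 0'
set -- $fstab_line             # split into its six fields
dev=$1; mnt=$2; fs=$3
echo "device=$dev mountpoint=$mnt fstype=$fs"
```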
```
# Disable the firewall
# systemctl stop firewalld.service
# systemctl disable firewalld.service
# firewall-cmd --state
not running

# Disable SELinux (set SELINUX=disabled in the config, then reboot)
# cat /etc/sysconfig/selinux
SELINUX=disabled
# setenforce 0
# getenforce
Disabled
# reboot
```
NFS High Availability Deployment Record
1. Install and configure the NFS service (same steps on both Master and Slave)
```
1) Install NFS
# yum -y install nfs-utils

2) Create the NFS shared directory
# mkdir -p /data/k8s_storage

3) Edit the exports file so the K8S node machines can mount the share.
   You can export to the node subnet, or list specific node IPs
   (one line per IP).
# vim /etc/exports
/data/k8s_storage 172.16.60.0/24(rw,sync,no_root_squash)

4) Apply the configuration
# exportfs -r

5) Verify that it took effect
# exportfs

6) Start the rpcbind and nfs services
# systemctl restart rpcbind && systemctl enable rpcbind
# systemctl restart nfs && systemctl enable nfs

7) Check the RPC service registration
# rpcinfo -p localhost

8) showmount tests
Master node:
# showmount -e 172.16.60.235
Export list for 172.16.60.235:
/data/k8s_storage 172.16.60.0/24

Slave node:
# showmount -e 172.16.60.236
Export list for 172.16.60.236:
/data/k8s_storage 172.16.60.0/24

Alternatively, try mounting the NFS share manually from any K8S node
to confirm it works:
[root@k8s-node01 ~]# mkdir /haha
[root@k8s-node01 ~]# mount -t nfs 172.16.60.235:/data/k8s_storage /haha
[root@k8s-node01 ~]# umount /haha
[root@k8s-node01 ~]# mount -t nfs 172.16.60.236:/data/k8s_storage /haha
[root@k8s-node01 ~]# umount /haha
[root@k8s-node01 ~]# rm -rf /haha
```
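With the export verified, the K8S side would typically consume the share through a PersistentVolume that points at the keepalived VIP (172.16.60.244, set up in the next section) rather than either node's real IP, so mounts survive a failover. A sketch of such a manifest follows; it is only generated to a file here (applying it would need `kubectl apply -f`), and the resource names, capacity, and access mode are illustrative assumptions, not from the original write-up:

```shell
# Sketch: generate a PV/PVC manifest that mounts the NFS share via the VIP.
# Names and sizes are hypothetical; only the file is written, nothing is applied.
cat > /tmp/nfs-pv.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv               # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 172.16.60.244    # the keepalived VIP, not a node IP
    path: /data/k8s_storage
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc              # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
EOF
echo "manifest written to /tmp/nfs-pv.yaml"
```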
2. Install and configure Keepalived (same steps on both Master and Slave)
```
1) Install keepalived
# yum -y install keepalived

2) keepalived.conf on the Master node
Very important: keepalived must be set to non-preemptive mode. In preemptive
mode the constant master/backup switching can easily cause NFS data loss.
# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf_bak
# >/etc/keepalived/keepalived.conf
# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id master                        # the id can be anything
}

vrrp_script chk_nfs {
    script "/etc/keepalived/nfs_check.sh"   # monitoring script
    interval 2
    weight -20     # two keepalived nodes here, so -20; with three, use -30
}

vrrp_instance VI_1 {
    state BACKUP       # both hosts set to BACKUP for non-preemptive mode
    interface eth0     # use your own NIC name, do not copy this blindly
    virtual_router_id 51
    priority 100       # 100 on the master; the backup must be lower, e.g. 80
    advert_int 1
    nopreempt          # required for non-preemptive mode
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_nfs
    }
    virtual_ipaddress {
        172.16.60.244  # virtual IP
    }
}

3) keepalived.conf on the Slave node
Only the priority changes to 80; everything else, including the check
script, is the same as on the master.
# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf_bak
# >/etc/keepalived/keepalived.conf
# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id master
}

vrrp_script chk_nfs {
    script "/etc/keepalived/nfs_check.sh"
    interval 2
    weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 80
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_nfs
    }
    virtual_ipaddress {
        172.16.60.244
    }
}

4) Create the nfs_check.sh monitoring script
# vim /etc/keepalived/nfs_check.sh
#!/bin/bash
A=`ps -C nfsd --no-header | wc -l`
if [ $A -eq 0 ];then
    systemctl restart nfs-server.service
    sleep 2
    if [ `ps -C nfsd --no-header | wc -l` -eq 0 ];then
        pkill keepalived
    fi
fi

Make the script executable:
# chmod 755 /etc/keepalived/nfs_check.sh

5) Start the keepalived service
# systemctl restart keepalived.service && systemctl enable keepalived.service
Check that the process is running:
# ps -ef|grep keepalived

6) Check the VIP
Run "ip addr" on both nodes; the VIP appears on exactly one of them.
# ip addr|grep 172.16.60.244
    inet 172.16.60.244/32 scope global eth0
The VIP must be pingable:
# ping 172.16.60.244
PING 172.16.60.244 (172.16.60.244) 56(84) bytes of data.
64 bytes from 172.16.60.244: icmp_seq=1 ttl=64 time=0.063 ms
64 bytes from 172.16.60.244: icmp_seq=2 ttl=64 time=0.042 ms
64 bytes from 172.16.60.244: icmp_seq=3 ttl=64 time=0.077 ms
```

7) Keepalived failover test
Stop the keepalived service on the Master node (which currently holds the VIP): the VIP should automatically fail over to the Backup node. When keepalived on the Master is started again, the VIP does not move back, because keepalived runs in non-preemptive mode. In preemptive mode the VIP would float back as soon as the Master's keepalived recovered, and that back-and-forth switching could leave the NFS data incomplete; this is why non-preemptive mode is mandatory here.
Thanks to the nfs_check.sh monitoring script, if the NFS service on a node stops, it is restarted automatically. If the restart fails, keepalived on that node is shut down, and if that node held the VIP, the VIP fails over to the other node.
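The check script's decision logic can be pulled out into a pure function and exercised without a live NFS service. This is a behavioral sketch of nfs_check.sh, not the deployed script: the nfsd process counts are passed in as arguments instead of being read from `ps`, and the function reports the action rather than performing it.

```shell
# Sketch of the nfs_check.sh decision logic with injectable inputs.
# Args: $1 = nfsd process count now, $2 = count after a restart attempt.
nfs_check_decision() {
  local before=$1 after=$2
  if [ "$before" -ne 0 ]; then
    echo "healthy"            # nfsd is running: nothing to do
  elif [ "$after" -ne 0 ]; then
    echo "restarted"          # the restart brought nfsd back
  else
    echo "kill-keepalived"    # restart failed: release the VIP
  fi
}

nfs_check_decision 8 8
nfs_check_decision 0 8
nfs_check_decision 0 0
```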
3. Install and configure Rsync + Inotify (on both Master and Slave)
```
1) Install rsync and inotify-tools
# yum -y install rsync inotify-tools

2) Configure rsyncd.conf on the Master node
# cp /etc/rsyncd.conf /etc/rsyncd.conf_bak
# >/etc/rsyncd.conf
# vim /etc/rsyncd.conf
uid = root
gid = root
use chroot = 0
port = 873
hosts allow = 172.16.60.0/24  # allowed clients: a specific IP or a subnet
max connections = 0
timeout = 300
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsyncd.lock
log file = /var/log/rsyncd.log
log format = %t %a %m %f %b
transfer logging = yes
syslog facility = local3

[master_web]
path = /data/k8s_storage
comment = master_web
ignore errors
read only = no                # allow clients to upload files
list = no
auth users = rsync            # space- or comma-separated list of users
                              # allowed to connect to this module
secrets file = /etc/rsyncd.passwd  # username/password file; create it yourself

Create the username/password file (format "username:password"):
# vim /etc/rsyncd.passwd
rsync:123456

Create the client-side password file (note: a different path from the file
above). It contains only the remote server's password; since both sides use
rsync:123456 here, it contains just 123456.
# vim /opt/rsyncd.passwd
123456

Restrict permissions on both files:
# chmod 600 /etc/rsyncd.passwd
# chmod 600 /opt/rsyncd.passwd

Start the service:
# systemctl enable rsyncd && systemctl restart rsyncd
Check that the rsync process is running:
# ps -ef|grep rsync

3) Configure rsyncd.conf on the Slave node
Identical to the Master's /etc/rsyncd.conf except that [master_web]
becomes [slave_web]; the password files are set up the same way.
# cp /etc/rsyncd.conf /etc/rsyncd.conf_bak
# >/etc/rsyncd.conf
# vim /etc/rsyncd.conf
uid = root
gid = root
use chroot = 0
port = 873
hosts allow = 172.16.60.0/24
max connections = 0
timeout = 300
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsyncd.lock
log file = /var/log/rsyncd.log
log format = %t %a %m %f %b
transfer logging = yes
syslog facility = local3

[slave_web]
path = /data/k8s_storage
comment = slave_web
ignore errors
read only = no
list = no
auth users = rsync
secrets file = /etc/rsyncd.passwd

Create the username/password file (format "username:password"):
# vim /etc/rsyncd.passwd
rsync:123456

Create the client-side password file:
# vim /opt/rsyncd.passwd
123456

Restrict permissions:
# chmod 600 /etc/rsyncd.passwd
# chmod 600 /opt/rsyncd.passwd

Start the service:
# systemctl enable rsyncd && systemctl restart rsyncd
Check that the rsync process is running:
# ps -ef|grep rsync
```
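The chmod 600 steps above are not optional: rsync refuses a password file that other users can read. A small sketch that creates both files with the required permissions in a temporary directory (the paths are stand-ins for /etc/rsyncd.passwd and /opt/rsyncd.passwd):

```shell
# Sketch: create the daemon secrets file and the client password file
# with the strict 600 permissions rsync requires. Temporary paths are
# used so this runs anywhere without touching /etc or /opt.
tmpdir=$(mktemp -d)
printf 'rsync:123456\n' > "$tmpdir/rsyncd.passwd"   # daemon side: user:password
printf '123456\n'       > "$tmpdir/client.passwd"   # client side: password only
chmod 600 "$tmpdir/rsyncd.passwd" "$tmpdir/client.passwd"
stat -c '%a %n' "$tmpdir"/*.passwd                  # show the resulting modes
```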
4) Manually verify that the Master node's NFS data syncs to the Slave node
```
On the Master node, create test data in the NFS shared directory:
# ls /data/k8s_storage/
# mkdir /data/k8s_storage/test
# touch /data/k8s_storage/{a,b}
# ls /data/k8s_storage/
a  b  test

Manually sync the Master's NFS shared directory to the Slave's:
# rsync -avzp --delete /data/k8s_storage/ rsync@172.16.60.236::slave_web --password-file=/opt/rsyncd.passwd

Check on the Slave node:
# ls /data/k8s_storage/
a  b  test
```
Notes on the rsync command above:
- /data/k8s_storage/ is the NFS shared directory being synchronized.
- In rsync@172.16.60.236::slave_web:
  - rsync is the username configured in /etc/rsyncd.passwd on the Slave node;
  - 172.16.60.236 is the Slave node's IP;
  - slave_web is the module name configured in the Slave's rsyncd.conf.
- --password-file=/opt/rsyncd.passwd is the password file the Master uses when syncing to the Slave; it contains the password configured in the Slave's /etc/rsyncd.passwd.
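The anatomy of the daemon-mode target can be illustrated with pure string handling, splitting user@host::module into its parts with bash parameter expansion and no network access:

```shell
# Sketch: split a daemon-mode rsync target "user@host::module" into
# its components.
target="rsync@172.16.60.236::slave_web"
user=${target%%@*}        # text before '@'  -> auth user
rest=${target#*@}
host=${rest%%::*}         # text before '::' -> daemon host
module=${rest##*::}       # text after '::'  -> module from rsyncd.conf
echo "user=$user host=$host module=$module"
```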
5) Set up automatic synchronization with Rsync + Inotify
Important: Master and Slave must not both run rsync auto-sync at the same time, i.e. do not configure simultaneous two-way sync. If the Master pushes data to the Slave while the Slave pushes data back to the Master, the two directions conflict, so only one side may auto-sync to the other at any moment. The trick is to check whether the current node holds the VIP: if it does, it auto-syncs data to the other node; if it does not, it performs no sync at all.
+++++++ Master node +++++++
Create the auto-sync script /opt/rsync_inotify.sh
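The VIP-presence check that drives this decision can be sketched as a small function run against canned `ip addr` output, so it works on any machine. Note that this sketch matches the full VIP string; the deployed script greps only for "244", which could also match unrelated addresses.

```shell
# Sketch: "do I hold the VIP?" as a function over captured "ip addr" text.
# $1 = output of "ip addr", $2 = the VIP to look for.
has_vip() {
  echo "$1" | grep -Fqw "$2" && echo yes || echo no
}

SAMPLE_WITH='inet 172.16.60.235/24 brd 172.16.60.255 scope global eth0
inet 172.16.60.244/32 scope global eth0'
SAMPLE_WITHOUT='inet 172.16.60.236/24 brd 172.16.60.255 scope global eth0'

has_vip "$SAMPLE_WITH" 172.16.60.244
has_vip "$SAMPLE_WITHOUT" 172.16.60.244
```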
```
#!/bin/bash
host=172.16.60.236
src=/data/k8s_storage/
des=slave_web
password=/opt/rsyncd.passwd
user=rsync
inotifywait=/usr/bin/inotifywait

$inotifywait -mrq --timefmt '%Y%m%d %H:%M' --format '%T %w%f%e' \
    -e modify,delete,create,attrib $src | while read files ;do
    rsync -avzP --delete --timeout=100 --password-file=${password} $src $user@$host::$des
    echo "${files} was rsynced" >>/tmp/rsync.log 2>&1
done
```
Create the VIP monitor script /opt/vip_monitor.sh
```
#!/bin/bash
VIP_NUM=`ip addr|grep 244|wc -l`
RSYNC_INOTIFY_NUM=`ps -ef|grep /usr/bin/inotifywait|grep -v grep|wc -l`
if [ ${VIP_NUM} -ne 0 ];then
    echo "VIP is on this NFS node" >/dev/null 2>&1
    if [ ${RSYNC_INOTIFY_NUM} -ne 0 ];then
        echo "rsync_inotify.sh is already running in the background" >/dev/null 2>&1
    else
        echo "starting rsync_inotify.sh in the background" >/dev/null 2>&1
        nohup sh /opt/rsync_inotify.sh &
    fi
else
    echo "VIP is not on this NFS node" >/dev/null 2>&1
    if [ ${RSYNC_INOTIFY_NUM} -ne 0 ];then
        echo "stopping the background rsync_inotify.sh" >/dev/null 2>&1
        ps -ef|grep rsync_inotify.sh|grep -v grep|awk '{print $2}'|xargs kill -9
        ps -ef|grep inotifywait|grep -v grep|awk '{print $2}'|xargs kill -9
    else
        echo "rsync_inotify.sh is not currently running" >/dev/null 2>&1
    fi
fi
```
Create the watchdog loop script /opt/rsync_monit.sh
```
#!/bin/bash
# Re-check VIP ownership forever; the sleep keeps the loop from
# spinning at 100% CPU.
while true
do
    /bin/bash -x /opt/vip_monitor.sh >/dev/null 2>&1
    sleep 5
done
```
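The watchdog pattern can be sketched in a bounded, testable form: run the check, pause briefly, repeat. The check command here is a stub standing in for /opt/vip_monitor.sh, and the loop stops after three rounds so the sketch terminates.

```shell
# Bounded sketch of the rsync_monit.sh watchdog loop.
count=0
while [ $count -lt 3 ]; do
    echo "round $count: would run /opt/vip_monitor.sh here"
    count=$((count + 1))
    sleep 0.1                 # the real loop should sleep a few seconds
done
```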
Run the scripts in the background
```
# chmod 755 /opt/rsync_inotify.sh
# chmod 755 /opt/vip_monitor.sh
# chmod 755 /opt/rsync_monit.sh
# nohup sh /opt/rsync_inotify.sh &
# nohup sh /opt/rsync_monit.sh &
```
Make the rsync_monit.sh script start on boot
```
# chmod +x /etc/rc.d/rc.local
# echo "nohup sh /opt/rsync_monit.sh & " >> /etc/rc.d/rc.local
```
+++++++ Slave node +++++++
The script is /opt/rsync_inotify.sh, with the following content:
```
#!/bin/bash
host=172.16.60.235
src=/data/k8s_storage/
des=master_web
password=/opt/rsyncd.passwd
user=rsync
inotifywait=/usr/bin/inotifywait

$inotifywait -mrq --timefmt '%Y%m%d %H:%M' --format '%T %w%f%e' \
    -e modify,delete,create,attrib $src | while read files ;do
    rsync -avzP --delete --timeout=100 --password-file=${password} $src $user@$host::$des
    echo "${files} was rsynced" >>/tmp/rsync.log 2>&1
done
```
Create the VIP monitor script /opt/vip_monitor.sh
```
#!/bin/bash
VIP_NUM=`ip addr|grep 244|wc -l`
RSYNC_INOTIFY_NUM=`ps -ef|grep /usr/bin/inotifywait|grep -v grep|wc -l`
if [ ${VIP_NUM} -ne 0 ];then
    echo "VIP is on this NFS node" >/dev/null 2>&1
    if [ ${RSYNC_INOTIFY_NUM} -ne 0 ];then
        echo "rsync_inotify.sh is already running in the background" >/dev/null 2>&1
    else
        echo "starting rsync_inotify.sh in the background" >/dev/null 2>&1
        nohup sh /opt/rsync_inotify.sh &
    fi
else
    echo "VIP is not on this NFS node" >/dev/null 2>&1
    if [ ${RSYNC_INOTIFY_NUM} -ne 0 ];then
        echo "stopping the background rsync_inotify.sh" >/dev/null 2>&1
        ps -ef|grep rsync_inotify.sh|grep -v grep|awk '{print $2}'|xargs kill -9
        ps -ef|grep inotifywait|grep -v grep|awk '{print $2}'|xargs kill -9
    else
        echo "rsync_inotify.sh is not currently running" >/dev/null 2>&1
    fi
fi
```
Create the watchdog loop script /opt/rsync_monit.sh
```
#!/bin/bash
# Re-check VIP ownership forever; the sleep keeps the loop from
# spinning at 100% CPU.
while true
do
    /bin/bash -x /opt/vip_monitor.sh >/dev/null 2>&1
    sleep 5
done
```
Run the script in the background (only rsync_monit.sh is started here)
```
# chmod 755 /opt/rsync_inotify.sh
# chmod 755 /opt/vip_monitor.sh
# chmod 755 /opt/rsync_monit.sh
# nohup sh /opt/rsync_monit.sh &
```
Make the rsync_monit.sh script start on boot
```
# chmod +x /etc/rc.d/rc.local
# echo "nohup sh /opt/rsync_monit.sh & " >> /etc/rc.d/rc.local
```
6) Final verification of automatic synchronization
```
1) With the VIP currently on the Master node, create test data on the
   Master and check that it auto-syncs to the Slave.
# ip addr|grep 172.16.60.244
    inet 172.16.60.244/32 scope global eth0
# rm -rf /data/k8s_storage/*
# echo "test" > /data/k8s_storage/haha
# ls /data/k8s_storage/
haha

On the Slave node, the data has been synced over:
# ls /data/k8s_storage/
haha
# cat /data/k8s_storage/haha
test

2) Now stop keepalived on the Master so the VIP fails over to the Slave.
# systemctl stop keepalived
# ip addr|grep 172.16.60.244

On the Slave node, the VIP has arrived:
# ip addr|grep 172.16.60.244
    inet 172.16.60.244/32 scope global eth0

Create test data on the Slave and check that it auto-syncs to the Master:
# rm -rf /data/k8s_storage/*
# mkdir /data/k8s_storage/cha
# echo "heihei" > /data/k8s_storage/you

On the Master node, the data has been synced over:
# ls /data/k8s_storage/
cha  you

3) Simulate rebooting the Master and Slave nodes. After boot,
   /opt/rsync_monit.sh starts automatically (via rc.local), and the
   auto-sync checks above still pass on both nodes.
```