A very detailed write-up of the theory:
https://www.cnblogs.com/kevingrace/p/5740940.html
A very detailed write-up on the load setup:
https://www.cnblogs.com/kevingrace/p/5740953.html
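The actual procedure is as follows:
1、Set the hostnames
11.11.11.2 primary server, hostname: Primary
11.11.11.3 standby server, hostname: Secondary
11.11.11.4 VIP
****** Most of the work is done on the two machines above; the two below are only used for auxiliary testing ******
11.11.11.8 web server, IIS (used for the mount test)
11.11.11.9 nginx proxy with caching, hostname: LB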
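2、The firewalls on the two machines must allow access to each other. It is simplest to turn off SELinux and the iptables firewall (same steps on both machines)
[root@Primary ~]# setenforce 0    //temporary; to disable it permanently, set SELINUX to disabled in /etc/sysconfig/selinux
[root@Primary ~]# /etc/init.d/iptables stop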
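The permanent change mentioned in the comment above can be scripted. This is only a minimal sketch, assuming GNU sed and the usual `SELINUX=` line in the config file; the function name `selinux_disable_persistent` is made up for illustration. Note that `sed -i` replaces a symlink with a regular file, so it is safer to point it at the real file (on CentOS 7 that is normally /etc/selinux/config) than at a link to it.

```shell
# Rewrite the SELINUX= line of an SELinux config file to "disabled".
# $1: path to the config file (assumed to be the real file, not a symlink)
selinux_disable_persistent() {
    # Only the SELINUX= line matches; SELINUXTYPE= is left untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Usage (as root; the new mode applies after a reboot):
#   selinux_disable_persistent /etc/selinux/config
```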
3、Set up the hosts file (same steps on both machines)
[root@primary ~]# cat /etc/hosts
**********
11.11.11.2 primary
11.11.11.3 secondary
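Since the same two entries have to exist on both machines, it can help to add them with a small helper that is safe to run more than once. A minimal sketch; `add_host_entry` is a hypothetical helper name, and the match is a simple whole-word check on the hostname:

```shell
# Append "IP NAME" to a hosts file unless NAME already appears in it.
# $1: hosts file, $2: IP address, $3: hostname
add_host_entry() {
    if ! grep -qwF -- "$3" "$1"; then
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Usage (as root, on both machines):
#   add_host_entry /etc/hosts 11.11.11.2 primary
#   add_host_entry /etc/hosts 11.11.11.3 secondary
```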
4、Synchronize the time on both machines
[root@Primary ~]# yum install -y ntpdate
[root@Primary ~]# ntpdate -u ntp1.aliyun.com
5、Install DRBD (same steps on both machines)
rpm -ivh https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
yum list drbd*
yum install -y drbd84-utils kmod-drbd84
Load the module:
[root@Primary ~]# modprobe drbd
Check that the module is loaded:
[root@Primary ~]# lsmod |grep drbd
drbd 332493 0
6、Configure DRBD (same steps on both machines)
[root@primary ~]# cat /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";    # one of the two main configuration files; it defines the software policy
include "drbd.d/*.res";                 # defines the disks (resources) to use
[root@primary ~]# cp /etc/drbd.d/global_common.conf /etc/drbd.d/global_common.conf.bak
[root@primary ~]# cat /etc/drbd.d/global_common.conf
global {
    usage-count no;
    udev-always-use-vnr;
}
common {
    protocol C;
    handlers {
    }
    startup {
        wfc-timeout 120;
        degr-wfc-timeout 120;
        outdated-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
    }
    net {
        cram-hmac-alg md5;
        sndbuf-size 512k;
        shared-secret "testdrbd";
    }
    syncer {
        rate 300M;
    }
}
[root@primary ~]#
[root@primary ~]# cat /etc/drbd.d/r0.res
resource r0 {
    on primary {
        device /dev/drbd0;
        disk /dev/sdb1;    # this disk must be partitioned but must not be formatted
        address 11.11.11.2:7789;
        meta-disk internal;
    }
    on secondary {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 11.11.11.3:7789;
        meta-disk internal;
    }
}
[root@primary ~]#
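The resource file must be identical on both nodes, and each `on` name has to match the output of `uname -n` on that node. One way to keep the two copies in step is to generate the file from a single command. The following is only a sketch; the helper names `drbd_node_stanza` and `render_drbd_res` are invented here, and the device, backing disk and port are simply the values used in this post:

```shell
# Print one "on <host> { ... }" stanza.
# $1: node hostname (as reported by uname -n), $2: node IP address
drbd_node_stanza() {
    printf '  on %s {\n' "$1"
    printf '    device    /dev/drbd0;\n'
    printf '    disk      /dev/sdb1;\n'
    printf '    address   %s:7789;\n' "$2"
    printf '    meta-disk internal;\n'
    printf '  }\n'
}

# Print a complete two-node resource definition.
# $1: resource name, $2/$3: first node name/IP, $4/$5: second node name/IP
render_drbd_res() {
    printf 'resource %s {\n' "$1"
    drbd_node_stanza "$2" "$3"
    drbd_node_stanza "$4" "$5"
    printf '}\n'
}

# Usage (run the same command on both nodes):
#   render_drbd_res r0 primary 11.11.11.2 secondary 11.11.11.3 > /etc/drbd.d/r0.res
```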
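7、Add the DRBD disk on both machines
On the Primary machine, add a 30G disk for DRBD, partition it as /dev/sdb1 without formatting it, and create the /data directory on the local system without mounting anything on it.
[root@Primary ~]# fdisk -l
......
[root@Primary ~]# fdisk /dev/sdb
Enter "n->p->1->1->Enter->w" in order    //once the partition is created, run "fdisk /dev/sdb" again and enter p to see it, e.g. /dev/sdb1
On the Secondary machine, add a 30G disk for DRBD, partition it as /dev/sdb1 without formatting it, and create the /data directory on the local system without mounting anything on it.
[root@Secondary ~]# fdisk -l
......
[root@Secondary ~]# fdisk /dev/sdb
Enter "n->p->1->1->Enter->w" in order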
8、Create the DRBD device and activate the r0 resource on each machine (run the following on both machines)
[root@Primary ~]# mknod /dev/drbd0 b 147 0
mknod: `/dev/drbd0': File exists
[root@Primary ~]# drbdadm create-md r0
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
If metadata already exists, the command asks for confirmation:
[root@Primary ~]# drbdadm create-md r0
You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/vdd1 at byte offset 10737340416
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes    //type "yes" here
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
Start the drbd service (note: it has to be started on both the primary and the secondary to take effect)
service drbd start
Check the status (run on both machines)
[root@primary ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
0: cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent C r-----    //both nodes show Secondary; data is still syncing, and once the sync finishes the first node is promoted to primary
ns:3472496 nr:0 dw:0 dr:3472496 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1594256
[============>.......] sync'ed: 68.6% (1556/4948)M
finish: 0:00:55 speed: 28,840 (27,776) K/sec
[root@primary ~]#
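In the DRBD status output above, ro:Secondary/Secondary means that both hosts are currently in the standby role. ds is the disk state; it reports "Inconsistent" because DRBD cannot yet tell which side is the primary and whose disk data should be treated as the reference.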
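The cs, ro and ds values discussed above can be pulled out of a status line with standard text tools, which is handy in scripts. A small sketch; `drbd_field` is a made-up helper name, and the sample line is the one shown in the status output of this post:

```shell
# Extract one field (cs, ro or ds) from a /proc/drbd style status line.
# $1: field name, $2: the status line
drbd_field() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Usage:
#   line='0: cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent C r-----'
#   drbd_field ro "$line"    # local role / peer role
#   drbd_field ds "$line"    # local disk state / peer disk state
```

The value before the slash describes the local node and the value after it describes the peer.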
9、Next, make the Primary host the DRBD primary node
[root@Primary ~]# drbdsetup /dev/drbd0 primary --force
Check the DRBD status on the primary and the secondary:
[root@primary ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:5066752 nr:0 dw:0 dr:5068840 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@secondary ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:5066752 dw:5066752 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@secondary ~]#
ro shows Primary/Secondary on the primary server and Secondary/Primary on the secondary server, and ds shows UpToDate/UpToDate, which means the primary/secondary setup succeeded
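Before moving on it is worth waiting until both disks report UpToDate. A minimal sketch of such a check, assuming the status is read from /proc/drbd as in this post; the function name `drbd_in_sync` is illustrative only:

```shell
# Succeed when the status file shows both disks as UpToDate.
# $1: status file to read (defaults to /proc/drbd)
drbd_in_sync() {
    grep -q 'ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# Usage: block until the initial sync has finished
#   until drbd_in_sync; do sleep 5; done
```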
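10、Mount DRBD (on the Primary node)
The DRBD status of the Primary node above shows the mounted and fstype fields empty, so this step mounts DRBD on a system directory
First format /dev/drbd0
[root@Primary ~]# mkfs.ext4 /dev/drbd0
Create the mount directory, then mount the DRBD device
[root@Primary ~]# mkdir /data
[root@Primary ~]# mount /dev/drbd0 /data
[root@primary ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   17G  1.4G   16G   8% /
devtmpfs                 899M     0  899M   0% /dev
/dev/sda1               1014M  145M  870M  15% /boot
/dev/drbd0                30G  1.7G   27G   6% /data
Important:
No operation of any kind, not even a read, is allowed on the DRBD device on the Secondary node; all reads and writes must go through the Primary node.
The Secondary node can be promoted to Primary only when the Primary node goes down.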
11、DRBD primary/secondary failover test
Simulate a failure of the Primary node; the Secondary takes over and is promoted to Primary
Operations on the Primary node:
[root@Primary ~]# cd /data
[root@Primary data]# touch wangshibo wangshibo1 wangshibo2 wangshibo3
[root@Primary data]# cd ../
[root@Primary /]# umount /data
[root@Primary /]# drbdsetup /dev/drbd0 secondary    //demote the Primary host to DRBD secondary. In a real production failure you would simply promote the Secondary host (make it the primary node).
[root@Primary /]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Secondary UpToDate/UpToDate C
Note: in production, if the Primary node crashes, ro in the Secondary's status will read Secondary/Unknown, and the only step needed is to promote the Secondary.
Operations on the Secondary node:
First promote it, that is, manually make the Secondary the DRBD primary node
[root@Secondary ~]# drbdsetup /dev/drbd0 primary
[root@Secondary ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
m:res cs ro ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C
Then mount DRBD
[root@Secondary ~]# mkdir /data
[root@Secondary ~]# mount /dev/drbd0 /data
[root@Secondary ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00 156G 13G 135G 9% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 89M 92M 50% /boot
/dev/vdd 9.8G 23M 9.2G 1% /data2
/dev/drbd0 9.8G 23M 9.2G 1% /data
The DRBD mount directory already contains the files written earlier on the remote Primary host
[root@Secondary ~]# cd /data
[root@Secondary data]# ls
wangshibo wangshibo1 wangshibo2 wangshibo3
Keep writing data on the Secondary node
[root@Secondary data]# touch huanqiu huanqiu1 huanqiu2 huanqiu3
Then simulate a failure of the Secondary node and promote the Primary node back to DRBD primary (same steps as above, omitted here.......)
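At this point the DRBD primary/secondary environment is deployed. What was recorded above, however, is a manual switchover. To make the DRBD pair switch automatically and achieve high availability, Keepalived or Heartbeat is still needed (when the DRBD primary goes down, the secondary is automatically promoted and the /data partition is mounted automatically)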
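The manual switchover above boils down to two ordered sequences: unmount and then demote on the old primary, promote and then mount on the new one. The sketch below wraps them in functions with a dry-run default so the order can be reviewed before anything is executed. The names `run`, `drbd_demote` and `drbd_promote` are hypothetical, and the resource name r0, the device /dev/drbd0 and the mount point /data are the values used in this post.

```shell
# Execute a command, or only print it when DRY_RUN=1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Old primary: the filesystem has to be unmounted before the role is given up.
drbd_demote() {
    run umount /data && run drbdadm secondary r0
}

# New primary: take the primary role first, then mount the device.
drbd_promote() {
    run drbdadm primary r0 && run mount /dev/drbd0 /data
}

# Usage:
#   drbd_demote                # prints the two commands
#   DRY_RUN=0 drbd_demote      # actually runs them (as root)
```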
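12、Install NFS
Approach:
1) Install keepalived on both machines, with 11.11.11.4 as the VIP
2) Use the DRBD mount directory /data as the NFS export directory. Remote clients mount NFS through the VIP
3) When the Primary host crashes or its NFS service fails, the Secondary host is promoted to DRBD primary and the VIP moves over to it. When the Primary host recovers, it becomes the DRBD primary again and takes the VIP back. This is how failover is achieved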
The Primary host (11.11.11.2) is the DRBD primary node by default; the DRBD mount directory is /data
The Secondary host (11.11.11.3) is the DRBD standby node
Install NFS on both the Primary and the Secondary host
[root@Primary ~]# yum install rpcbind nfs-utils
[root@Primary ~]# vim /etc/exports
/data 11.11.11.0/24(rw,sync,no_root_squash)
service rpcbind start
service nfs start
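Turn off the iptables firewall on both hosts
It is best to turn the firewall off, otherwise clients may fail to mount NFS!
If the firewall stays on, the NFS-related ports and the VRRP multicast address have to be opened in iptables
[root@Primary ~]# /etc/init.d/iptables stop
SELinux must be turned off on both machines! Otherwise scripts such as notify_master.sh configured in keepalived.conf below will fail to run. This is a pitfall I have hit before.
[root@Primary ~]# setenforce 0    //temporary. To disable it permanently, also set SELINUX to disabled in /etc/sysconfig/selinux
[root@Primary ~]# getenforce
Permissive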
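If the firewall has to stay on, the openings described above could look roughly like the following. This is a sketch under several assumptions that are not in the original post: that the hosts run firewalld (the CentOS 7 default) rather than plain iptables, that the firewalld service definitions `nfs`, `rpc-bind` and `mountd` cover the NFS setup in use, and that DRBD listens on port 7789 as configured in r0.res. The function `firewall_plan` is a made-up helper that only prints the commands so they can be reviewed first.

```shell
# Print the firewall-cmd calls that would open NFS, DRBD and VRRP traffic.
# Nothing is changed until the output is piped into a shell.
firewall_plan() {
    echo "firewall-cmd --permanent --add-service=nfs"
    echo "firewall-cmd --permanent --add-service=rpc-bind"
    echo "firewall-cmd --permanent --add-service=mountd"
    echo "firewall-cmd --permanent --add-port=7789/tcp"
    echo "firewall-cmd --permanent --add-rich-rule='rule protocol value=\"vrrp\" accept'"
    echo "firewall-cmd --reload"
}

# Usage (as root, on both hosts):
#   firewall_plan          # review the commands
#   firewall_plan | sh     # apply them
```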
13、Install Keepalived on both hosts and use it for automatic fail-over
13.1、Configuration on the Primary
Install Keepalived
[root@Primary ~]# yum install -y openssl-devel popt-devel
[root@Primary ~]# cd /usr/local/src/
[root@Primary src]# wget http://www.keepalived.org/software/keepalived-1.3.5.tar.gz
[root@Primary src]# tar -zvxf keepalived-1.3.5.tar.gz
[root@Primary src]# cd keepalived-1.3.5
[root@Primary keepalived-1.3.5]# ./configure --prefix=/usr/local/keepalived
[root@Primary keepalived-1.3.5]# make && make install
[root@Primary keepalived-1.3.5]# cp /usr/local/src/keepalived-1.3.5/keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
[root@Primary keepalived-1.3.5]# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@Primary keepalived-1.3.5]# mkdir /etc/keepalived/
[root@Primary keepalived-1.3.5]# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@Primary keepalived-1.3.5]# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
[root@Primary keepalived-1.3.5]# echo "/etc/init.d/keepalived start" >> /etc/rc.local
[root@Primary keepalived-1.3.5]# chmod +x /etc/rc.d/init.d/keepalived    #make it executable
[root@Primary keepalived-1.3.5]# chkconfig keepalived on    #start at boot
[root@Primary keepalived-1.3.5]# service keepalived start    #start
[root@Primary keepalived-1.3.5]# service keepalived stop    #stop
[root@Primary keepalived-1.3.5]# service keepalived restart    #restart
Or install it with yum:
yum install -y openssl-devel popt-devel
yum install keepalived
-----------keepalived.conf on the Primary host
[root@Primary ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-bak
[root@primary ~]# cat /etc/keepalived/keepalived.conf
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id DRBD_HA_MASTER
}
vrrp_script chk_nfs {
    script "/etc/keepalived/check_nfs.sh"
    interval 5
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33    # change this to the name of your own network interface
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_nfs
    }
    virtual_ipaddress {
        11.11.11.4
    }
    nopreempt
    notify_stop "/etc/keepalived/notify_stop.sh"
    notify_master "/etc/keepalived/notify_master.sh"
}
Start the keepalived service
[root@primary ~]# service keepalived start
Redirecting to /bin/systemctl start keepalived.service
[root@primary ~]# ps -ef|grep keepalived
root 7314 1 0 14:22 ? 00:00:00 /usr/sbin/keepalived -D
root 7315 7314 0 14:22 ? 00:00:00 /usr/sbin/keepalived -D
root 7316 7314 0 14:22 ? 00:00:00 /usr/sbin/keepalived -D
root 7431 7119 0 14:22 pts/1 00:00:00 grep --color=auto keepalived
[root@primary ~]#
Check the VIP
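One way to check the VIP is to look for it in the output of `ip addr`. The helper below is a minimal sketch: `has_vip` is a made-up name, it reads `ip -o -4 addr show` output on standard input, and the interface name ens33 and the VIP 11.11.11.4 are the values from the configuration above.

```shell
# Read `ip -o -4 addr show` output on stdin; succeed if the VIP is assigned.
# $1: the VIP address
has_vip() {
    # Match "inet <VIP>/" literally so that 11.11.11.4 does not match 11.11.11.40.
    grep -qF "inet $1/"
}

# Usage:
#   ip -o -4 addr show dev ens33 | has_vip 11.11.11.4 && echo "VIP is on this node"
```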
1、Scripts on the primary
1) This script is configured only on the Primary machine
[root@primary ~]# cat /etc/keepalived/check_nfs.sh
#!/bin/bash
/sbin/service nfs status &>/dev/null
if [ $? -ne 0 ];then
    ### the service is not healthy; try restarting it first
    /sbin/service nfs restart
    /sbin/service nfs status &>/dev/null
    if [ $? -ne 0 ];then
        ### still not healthy after restarting nfs
        ### unmount the drbd device
        umount /dev/drbd0
        ### demote drbd from primary to secondary
        drbdadm secondary r0
        ### stop keepalived
        /sbin/service keepalived stop
    fi
fi
[root@Primary ~]# chmod 755 /etc/keepalived/check_nfs.sh
2) This script is configured only on the Primary machine
[root@Primary ~]# mkdir /etc/keepalived/logs
[root@primary ~]# cat /etc/keepalived/notify_stop.sh
#!/bin/bash
time=`date "+%F %H:%M:%S"`
echo -e "$time ------notify_stop------ " >> /etc/keepalived/logs/notify_stop.log
/sbin/service nfs stop &>> /etc/keepalived/logs/notify_stop.log
/bin/umount /dev/drbd0 &>> /etc/keepalived/logs/notify_stop.log
/sbin/drbdadm secondary r0 &>> /etc/keepalived/logs/notify_stop.log
echo -e " " >> /etc/keepalived/logs/notify_stop.log
[root@primary ~]#
[root@Primary ~]# chmod 755 /etc/keepalived/notify_stop.sh
2、This script is configured on both machines
[root@primary ~]# cat /etc/keepalived/notify_master.sh
#!/bin/bash
time=`date "+%F %H:%M:%S"`
echo -e "$time ------notify_master------ " >> /etc/keepalived/logs/notify_master.log
# the resource name must match the one defined in /etc/drbd.d/r0.res
/sbin/drbdadm primary r0 &>> /etc/keepalived/logs/notify_master.log
/bin/mount /dev/drbd0 /data &>> /etc/keepalived/logs/notify_master.log
/sbin/service nfs restart &>> /etc/keepalived/logs/notify_master.log
echo -e " " >> /etc/keepalived/logs/notify_master.log
[root@primary ~]#
[root@Primary ~]# chmod 755 /etc/keepalived/notify_master.sh
13.2、Configuration on the Secondary
-----------keepalived.conf on the Secondary host
[root@Secondary ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf-bak
[root@secondary ~]# cat /etc/keepalived/keepalived.conf
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id DRBD_HA_BACKUP
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33    # change this to the name of your own network interface
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    nopreempt
    notify_master "/etc/keepalived/notify_master.sh"    # runs when this machine becomes the keepalived master
    notify_backup "/etc/keepalived/notify_backup.sh"    # runs when this machine becomes the keepalived backup
    virtual_ipaddress {
        11.11.11.4
    }
}
Start the keepalived service
[root@Secondary ~]# chmod 755 /etc/keepalived/notify_master.sh
[root@Secondary ~]# chmod 755 /etc/keepalived/notify_backup.sh
[root@secondary ~]# service keepalived start
Redirecting to /bin/systemctl start keepalived.service
1、This script is configured only on the Secondary machine
[root@Secondary ~]# mkdir /etc/keepalived/logs
[root@Secondary ~]# vim /etc/keepalived/notify_backup.sh
#!/bin/bash
time=`date "+%F %H:%M:%S"`
echo -e "$time ------notify_backup------ " >> /etc/keepalived/logs/notify_backup.log
/sbin/service nfs stop &>> /etc/keepalived/logs/notify_backup.log
/bin/umount /dev/drbd0 &>> /etc/keepalived/logs/notify_backup.log
# the resource name must match the one defined in /etc/drbd.d/r0.res
/sbin/drbdadm secondary r0 &>> /etc/keepalived/logs/notify_backup.log
echo -e " " >> /etc/keepalived/logs/notify_backup.log
[root@Secondary ~]# chmod 755 /etc/keepalived/notify_backup.sh
14、Mount NFS from a remote client
The client only needs rpcbind installed and running
[root@huanqiu ~]# yum install rpcbind nfs-utils
[root@huanqiu ~]# /etc/init.d/rpcbind start
Linux client:
umount /web/
mount -t nfs 11.11.11.4:/data /web
Windows 2012 client:
mount \\11.11.11.4\data -o nolock,rsize=1024,wsize=1024,timeo=15 z:
Works well; there is almost no interruption during a switchover
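To see how long a client is actually affected during a switchover, a simple probe can write to the mounted directory once per second while the failover is triggered. The following is a sketch only; `probe_mount` is a hypothetical helper, and /web is the Linux mount point used above. Keep in mind that with a default (hard) NFS mount a write may block until the server is back rather than fail, so a pause in the output is as meaningful as a FAIL line.

```shell
# Try one write and read-back in the given directory.
# $1: directory to probe (for example the NFS mount point /web)
probe_mount() {
    probe_file="$1/.ha_probe.$$"
    probe_stamp=$(date +%s)
    # Write the timestamp; a missing or unwritable directory counts as failure.
    { echo "$probe_stamp" > "$probe_file"; } 2>/dev/null || return 1
    # Read it back and compare.
    [ "$(cat "$probe_file" 2>/dev/null)" = "$probe_stamp" ] || return 1
    rm -f "$probe_file"
}

# Usage on the client while the failover test is running:
#   while sleep 1; do
#       if probe_mount /web; then echo "$(date +%T) ok"; else echo "$(date +%T) FAIL"; fi
#   done
```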
Recovering when the primary/secondary pair shows UpToDate/DUnknown:
https://blog.csdn.net/kjsayn/article/details/52958282
Editing the Windows registry (no write permission after mounting):
https://jingyan.baidu.com/article/c910274bfd6800cd361d2df3.html
15、Front-end nginx cache
Create the cache directory
mkdir /cache
nginx -t
nginx -s reload
[root@localhost conf.d]# pwd
/etc/nginx/conf.d
[root@localhost conf.d]# cat web.conf
upstream node {
    # backend server; step 1 lists the IIS web server as 11.11.11.8, so verify this address
    server 11.11.11.9:80;
}
proxy_cache_path /cache levels=1:2 keys_zone=cache:10m max_size=10g inactive=60m use_temp_path=off;
server {
    listen 80;
    server_name www.test.com;
    index index.html;
    location / {
        proxy_pass http://node;
        proxy_cache cache;
        proxy_cache_valid 200 304 12h;
        proxy_cache_valid any 10m;
        add_header Nginx-Cache "$upstream_cache_status";
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
    }
}
[root@localhost conf.d]#
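The `Nginx-Cache` header added in the configuration above makes it easy to check whether a response came from the cache (for example MISS on the first request and HIT on the next). A minimal sketch for reading it; `cache_status` is an invented helper name, and the curl target assumes the proxy is reachable at 11.11.11.9 with the server_name www.test.com:

```shell
# Read HTTP response headers on stdin and print the Nginx-Cache value.
cache_status() {
    # Strip carriage returns, then match the header name case-insensitively.
    tr -d '\r' | awk -F': ' 'tolower($1) == "nginx-cache" { print $2; exit }'
}

# Usage: request the same URL twice and compare the two results
#   curl -sI -H 'Host: www.test.com' http://11.11.11.9/ | cache_status
```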
For details on nginx caching, see the other blog post
15.2、Clear the cache
Delete the cached data with rm
rm -rf /cache/*
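Instead of wiping the whole directory, a single cached entry can be removed if its path is known. nginx names each cache file after the MD5 hash of the cache key, and with levels=1:2 it places the file under a directory named after the last hex digit, then one named after the two digits before it. The sketch below computes that path. It assumes the default `proxy_cache_key` (`$scheme$proxy_host$request_uri`), which for the configuration above would give keys such as `http://node/index.html`; treat that key format as an assumption to verify on your own setup. `cache_path` is a made-up helper name.

```shell
# Print the on-disk path nginx would use for a cache key with levels=1:2.
# $1: cache root directory, $2: cache key
cache_path() {
    cp_hash=$(printf '%s' "$2" | md5sum | cut -d' ' -f1)
    cp_l1=$(printf '%s' "$cp_hash" | cut -c32)       # last hex digit
    cp_l2=$(printf '%s' "$cp_hash" | cut -c30-31)    # the two digits before it
    printf '%s/%s/%s/%s\n' "$1" "$cp_l1" "$cp_l2" "$cp_hash"
}

# Usage:
#   rm -f "$(cache_path /cache 'http://node/index.html')"
```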
16、Testing
1) First stop the keepalived service on the Primary host. The VIP moves to the Secondary host. At the same time NFS on the Primary host is shut down and the Secondary is promoted to DRBD primary (the sample output below comes from an environment that uses eth0 and the VIP 192.168.1.200)
[root@Primary ~]# /etc/init.d/keepalived stop
Stopping keepalived: [ OK ]
[root@Primary ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:35:d1:d6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.151/24 brd 192.168.1.255 scope global eth0
    inet6 fe80::f816:3eff:fe35:d1d6/64 scope link
       valid_lft forever preferred_lft forever
The system log also shows the VIP being moved
[root@Primary ~]# tail -1000 /var/log/messages
........
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:50:03 localhost Keepalived_vrrp[30940]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:51 localhost Keepalived[30937]: Stopping
May 25 11:58:51 localhost Keepalived_vrrp[30940]: VRRP_Instance(VI_1) sent 0 priority
May 25 11:58:51 localhost Keepalived_vrrp[30940]: VRRP_Instance(VI_1) removing protocol VIPs.
[root@Primary ~]# ps -ef|grep nfs
root 588 10364 0 12:13 pts/1 00:00:00 grep --color nfs
[root@Primary ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00 156G 36G 112G 25% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 98M 83M 55% /boot
[root@Primary ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Secondary UpToDate/UpToDate C
Log in to the Secondary machine; the VIP has moved over to it
[root@Secondary ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:4c:7e:88 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.152/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.200/32 scope global eth0
    inet6 fe80::f816:3eff:fe4c:7e88/64 scope link
       valid_lft forever preferred_lft forever
[root@Secondary ~]# tail -1000 /var/log/messages
........
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:53 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:58 localhost Keepalived_vrrp[17131]: Sending gratuitous ARP on eth0 for 192.168.1.200
May 25 11:58:58 localhost Keepalived_vrrp[17131]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.1.200
[root@Secondary ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:4c:7e:88 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.152/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.200/32 scope global eth0
    inet6 fe80::f816:3eff:fe4c:7e88/64 scope link
       valid_lft forever preferred_lft forever
[root@Secondary ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00 156G 13G 135G 9% /
tmpfs 2.9G 0 2.9G 0% /dev/shm
/dev/vda1 190M 89M 92M 50% /boot
/dev/drbd0 9.8G 23M 9.2G 1% /data
When the keepalived service on the Primary machine is started again, it forcibly takes the VIP back (see the system log /var/log/messages), and the Primary becomes the DRBD primary node again
2) Stop the NFS service on the Primary host. The monitoring script tries to start NFS again; only if that restart fails does it demote the node from DRBD primary to secondary and stop keepalived, which triggers the same failover flow as above
Conclusion:
During the primary/secondary failover above, the NFS mount stays usable from the client's point of view; there is only a short delay.
This also confirms the data consistency that DRBD provides (including the open and modified state of files). To the client, the whole switchover looks like "one NFS restart" (NFS stops on the primary and starts on the standby).
Top:
1、A note on the "Split-Brain" situation:
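Suppose the eth0 interface on the Primary host is taken down, and the Secondary host is then promoted directly to DRBD primary and mounts the DRBD device. The data files written earlier on the Primary host turn out to have been replicated.
Next, bring eth0 on the Primary host back up and see whether the primary/secondary relationship recovers by itself. It does not: DRBD detects a Split-Brain, meaning both nodes are in the StandAlone state.
The error reads: Split-Brain detected,dropping connection! This is the well-known "split brain".
Manual recovery procedure recommended by DRBD:
1) On the Secondary host
# drbdadm secondary r0
# drbdadm disconnect all
# drbdadm --discard-my-data connect r0    //or "drbdadm -- --discard-my-data connect r0"
2) On the Primary host
# drbdadm disconnect all
# drbdadm connect r0
# drbdsetup /dev/drbd0 primary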
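Before running the recovery steps it helps to confirm that the node really is in this state. The two checks below are a minimal sketch: the helper names are invented, the status is read from /proc/drbd as elsewhere in this post, and the log check assumes the kernel message contains the text "Split-Brain detected" as quoted above (the log file path depends on the system, /var/log/messages being the usual place on CentOS).

```shell
# Succeed if the status file shows the connection state StandAlone.
# $1: status file to read (defaults to /proc/drbd)
drbd_standalone() {
    grep -q 'cs:StandAlone' "${1:-/proc/drbd}"
}

# Succeed if the given log file mentions a detected split brain.
# $1: log file to search
split_brain_logged() {
    grep -qi 'split-brain detected' "$1"
}

# Usage:
#   drbd_standalone && split_brain_logged /var/log/messages && echo "split brain: manual recovery needed"
```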