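I. Install and configure DRBD:
IP address of DRBD1: 192.168.2.203
IP address of DRBD2: 192.168.2.204
Install the kernel packages:
uname -a    ## show the kernel version of this system
Note: the kernel packages you install must match the kernel version shown above, so it is best to install the ones shipped on the system installation disc.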
mkdir /mnt/cdrom
mount /dev/cdrom /mnt/cdrom    # mount the system disc at /mnt/cdrom
cd /mnt/cdrom/Packages
rpm -ivh kernel-devel-2.6.32-504.el6.x86_64.rpm
rpm -ivh kernel-headers-2.6.32-504.el6.x86_64.rpm
yum install gcc gcc-c++ make glibc flex perl -y
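The version-match requirement above can be checked before building. The following is a minimal sketch, assuming `rpm -q kernel-devel` prints names of the form `kernel-devel-<release>`; the function name `kernel_devel_matches` is made up for this example.

```shell
# Return 0 when a kernel-devel package matching the running kernel is listed.
# $1: running kernel release, e.g. the output of `uname -r`
# $2: installed package list, e.g. the output of `rpm -q kernel-devel`
kernel_devel_matches() {
    # -x: the whole line must match, -F: treat the name as a fixed string
    printf '%s\n' "$2" | grep -qxF "kernel-devel-$1"
}

# Typical use on the real system (not executed here):
#   kernel_devel_matches "$(uname -r)" "$(rpm -q kernel-devel)" \
#       || echo "kernel-devel does not match the running kernel"
```

If the check fails, the `make KDIR=...` step later on would build against headers for a different kernel.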
Install DRBD on both servers:
wget http://oss.linbit.com/drbd/8.4/drbd-8.4.4.tar.gz
tar -xzvf drbd-8.4.4.tar.gz
cd drbd-8.4.4
./configure --prefix=/usr/local/drbd --with-km
make KDIR=/usr/src/kernels/2.6.32-504.el6.x86_64/
make install
mkdir -p /usr/local/drbd/var/run
cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/
chkconfig --add drbd
chkconfig drbd on
Load the drbd module
modprobe drbd
Check that the drbd module is loaded
lsmod|grep drbd
Note: if an error occurs at this point, simply run yum install perl -y
Configuration: (drbd1, drbd2)
vi /usr/local/drbd/etc/drbd.conf
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
resource r0 {
    on drbd1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.2.203:7788;
        meta-disk internal;
    }
    on drbd2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.2.204:7788;
        meta-disk internal;
    }
}
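DRBD matches each `on <name>` section against the node's own hostname (`uname -n`), so a mismatch here is a common reason for the resource not coming up. A small sketch of a pre-flight check follows; the helper name `conf_has_host` is hypothetical and the pattern assumes simple hostnames without regex metacharacters.

```shell
# Return 0 if the config file contains an "on <hostname> {" section.
# $1: path to drbd.conf (or a .res file), $2: hostname to look for
conf_has_host() {
    grep -Eq "(^|[[:space:]])on[[:space:]]+$2[[:space:]]*\{" "$1"
}

# Typical use on each node (not executed here):
#   conf_has_host /usr/local/drbd/etc/drbd.conf "$(uname -n)" \
#       || echo "no 'on' section for this host"
```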
vi /usr/local/drbd/etc/drbd.d/global_common.conf
global {
    usage-count yes;
}
common {
    startup {
        wfc-timeout 0;
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
    }
    syncer {
        rate 600K;
    }
}
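To get a feel for what a resync rate of 600K implies, here is a rough estimate of a full resync. It is only a sketch: it assumes the rate means 600 KiB/s and acts as a fixed cap (DRBD's dynamic resync controller may behave differently), and it uses the device size reported by mke2fs further down in this walkthrough (5241029 blocks of 4 KiB). The function name `sync_seconds` is made up for this example.

```shell
# Estimate the duration of a full resync in whole seconds.
# $1: device size in KiB, $2: resync rate in KiB/s
sync_seconds() {
    echo $(( $1 / $2 ))
}

# 5241029 blocks * 4 KiB per block = 20964116 KiB
sync_seconds $((5241029 * 4)) 600    # prints 34940, roughly 9.7 hours
```

Under these assumptions a complete resync of the 20 GiB device would take most of a working day, which is worth keeping in mind when choosing the rate.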
mknod /dev/drbd0 b 147 0
drbdadm create-md r0
Start the drbd service on both servers at the same time:
service drbd start
At this point the drbd status on both servers looks like this:
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                   ds                         p  mounted  fstype
0:r0   Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
Note: ro:Secondary/Secondary means that both hosts are in the secondary role. ds is the disk state; it shows Inconsistent because DRBD cannot yet tell which side is the primary and whose disk data should be treated as authoritative.
Make drbd1 the primary node:
drbdsetup /dev/drbd0 primary --force
The drbd status on the two servers now looks like this:
drbd1:
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C
drbd2:
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@dbrd2, 2015-12-29 16:14:14
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Secondary/Primary  UpToDate/UpToDate  C
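On the primary and secondary servers, ro shows Primary/Secondary and Secondary/Primary respectively
ds shows UpToDate/UpToDate
This means the primary/secondary configuration succeeded.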
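The success criteria just described can be turned into a small check on one status row. This is a sketch that assumes the row has the column order shown above (m:res, cs, ro, ds, p); the function name `drbd_row_healthy` is hypothetical.

```shell
# Return 0 when a status row describes a healthy primary/secondary pair.
# $1: one row such as "0:r0 Connected Primary/Secondary UpToDate/UpToDate C"
drbd_row_healthy() {
    # split the row into fields: $1=m:res $2=cs $3=ro $4=ds
    set -- $1
    [ "${2:-}" = "Connected" ] || return 1
    case "${3:-}" in
        Primary/Secondary|Secondary/Primary) ;;
        *) return 1 ;;
    esac
    [ "${4:-}" = "UpToDate/UpToDate" ]
}
```

A row that still shows Secondary/Secondary or Inconsistent/Inconsistent is rejected, which matches the state seen before a primary was chosen.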
Mount DRBD: (drbd1)
The status above shows that the mounted and fstype columns are empty, so in this step we mount DRBD at the system directory /drbd
mkfs.ext4 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5241029 blocks
262051 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
160 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 36 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
mkdir /drbd
mount /dev/drbd0 /drbd
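Note: no operation on the DRBD device, including mounting it, is allowed on the Secondary node. All reads and writes must go through the Primary node. Only when the Primary node fails can the Secondary node be promoted to Primary, after which the DRBD device is mounted automatically (by Heartbeat in this setup) and work continues.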
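A script that mounts the device can guard against running on the Secondary. The sketch below assumes the role string has the form Local/Peer, as printed by `drbdadm role r0`; the helper names `local_role` and `may_mount` are made up for this example.

```shell
# Print the local half of a "Local/Peer" role string.
local_role() {
    printf '%s\n' "${1%%/*}"
}

# Return 0 only when the local node is Primary and may mount the device.
may_mount() {
    [ "$(local_role "$1")" = "Primary" ]
}

# Typical use (not executed here):
#   may_mount "$(drbdadm role r0)" && mount /dev/drbd0 /drbd
```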
DRBD status after a successful mount: (drbd1)
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /drbd    ext4
II. Install MySQL:
yum install libaio cmake ncurses-devel -y
tar -xzvf mysql-5.6.13.tar.gz
cd mysql-5.6.13
/usr/sbin/groupadd mysql
/usr/sbin/useradd -g mysql mysql
ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 /usr/lib64/libmysqlclient.so.18
ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 /usr/lib/libmysqlclient.so.18
ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 /usr/local/lib64/libmysqlclient.so.18
ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 /usr/local/lib/libmysqlclient.so.18
/usr/bin/cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
  -DWITH_INNOBASE_STORAGE_ENGINE=1 \
  -DWITH_ARCHIVE_STORAGE_ENGINE=1 \
  -DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
  -DWITH_FEDERATED_STORAGE_ENGINE=1 \
  -DWITH_PARTITION_STORAGE_ENGINE=1 \
  -DENABLED_LOCAL_INFILE=0 \
  -DEXTRA_CHARSETS=all \
  -DDEFAULT_CHARSET=utf8 \
  -DDEFAULT_COLLATION=utf8_general_ci \
  -DMYSQL_USER=mysql \
  -DINSTALL_LAYOUT=STANDALONE \
  -DENABLED_PROFILING=ON \
  -DMYSQL_MAINTAINER_MODE=OFF \
  -DWITH_DEBUG=OFF
make
make install
chown -R mysql:mysql /usr/local/mysql
chown -R mysql:mysql /drbd/data/mysql
cp support-files/my-default.cnf /etc/my.cnf
cp support-files/mysql.server /etc/init.d/mysqld
chmod 755 /etc/init.d/mysqld
vi /etc/my.cnf    (add the following two lines on both servers)
datadir=/drbd/data/mysql
basedir=/usr/local/mysql
/usr/local/mysql/scripts/mysql_install_db --user=mysql --datadir=/drbd/data/mysql/ --basedir=/usr/local/mysql
ls /drbd/data/mysql/    (the secondary server does not need to initialize the database)
III. Manually switch the DRBD primary and secondary
Unmount the disk on the primary server
umount /drbd
service drbd status    (on the primary server)
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C
drbdadm secondary all    # manually demote the primary to secondary
service drbd status    (on the primary server)
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                   ds                 p  mounted  fstype
0:r0   Connected  Secondary/Secondary  UpToDate/UpToDate  C
drbdadm primary all    # manually promote the secondary to primary (run on the secondary server)
service drbd status    (on the secondary server)
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@dbrd2, 2015-12-29 16:14:14
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C
mount /dev/drbd0 /drbd    # remount the disk
ls /drbd/    # the initialized mysql data is already there
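The manual switchover can be wrapped in two small functions, one per node, so the order of the steps is hard to get wrong. This is a sketch only: the function names and the DRY_RUN switch are made up for this example, and by default the commands are printed rather than executed.

```shell
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the current primary: release the filesystem, then demote.
demote_primary() {
    run umount /drbd &&
    run drbdadm secondary all
}

# Run on the current secondary afterwards: promote, then mount.
promote_secondary() {
    run drbdadm primary all &&
    run mount /dev/drbd0 /drbd
}
```

The `&&` chaining stops at the first failing step, so the device is never demoted while still mounted and never mounted before the promotion succeeded.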
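Note: handling a DRBD split-brain
After a DRBD split-brain, the disk data on the two sides is no longer consistent. On the node that you decide should become the secondary, switch it to secondary and discard its data for the resource: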
drbdadm secondary r0
drbdadm -- --discard-my-data connect r0
On the node that is to be the primary, reconnect to the secondary (this can be skipped if the node's connection state is already WFConnection) with the following command:
drbdadm connect r0
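The recovery procedure differs per node, so it can help to print the plan before typing anything. The sketch below only echoes the commands described above; the function name `split_brain_plan` and its arguments are hypothetical.

```shell
# Print the recovery commands for one node after a split-brain.
# $1: "victim" (data is discarded) or "survivor" (data is kept)
# $2: resource name
# $3: current connection state of the survivor (cs column), optional
split_brain_plan() {
    case "$1" in
        victim)
            echo "drbdadm secondary $2"
            echo "drbdadm -- --discard-my-data connect $2"
            ;;
        survivor)
            # a node already waiting in WFConnection needs no reconnect
            if [ "${3:-}" != "WFConnection" ]; then
                echo "drbdadm connect $2"
            fi
            ;;
        *)
            echo "usage: split_brain_plan victim|survivor RESOURCE [CS]" >&2
            return 2
            ;;
    esac
}
```

For example, `split_brain_plan survivor r0 StandAlone` prints the reconnect command, while the same call with WFConnection prints nothing.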
IV. Configure Heartbeat:
yum install heartbeat -y
cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/ha.cf
cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/haresources
cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/authkeys
(DRBD1)
Configure the ha.cf file by adding the following settings:
echo ''>/etc/ha.d/ha.cf
vi /etc/ha.d/ha.cf
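logfile /var/log/ha-log    # location of the heartbeat log file
logfacility local0
keepalive 2                # interval between heartbeats, in seconds
deadtime 15                # if no heartbeat from the peer is seen for 15 s, consider it failed
warntime 5                 # log a late-heartbeat warning after 5 s
initdead 30                # wait 30 s after the daemon starts before starting the service resources
ucast eth1 192.168.2.204   # NIC to use and IP address of the peer
auto_failback off          # when the primary recovers after a switch to the secondary, do not switch back, because each switchover is expensive
node drbd1                 # hostnames of the two nodes, one per line
node drbd2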
(DRBD2)
Configure the ha.cf file by adding the following settings:
echo ''>/etc/ha.d/ha.cf
vi /etc/ha.d/ha.cf
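logfile /var/log/ha-log    # location of the heartbeat log file
logfacility local0
keepalive 2                # interval between heartbeats, in seconds
deadtime 15                # if no heartbeat from the peer is seen for 15 s, consider it failed
warntime 5                 # log a late-heartbeat warning after 5 s
initdead 30                # wait 30 s after the daemon starts before starting the service resources
ucast eth1 192.168.2.203   # NIC to use and IP address of the peer
auto_failback off          # when the primary recovers after a switch to the secondary, do not switch back, because each switchover is expensive
node drbd1                 # hostnames of the two nodes, one per line
node drbd2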
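The two ha.cf files differ only in the peer address on the ucast line, so both can be produced from one template. A minimal sketch, with the function name `gen_ha_cf` made up for this example:

```shell
# Print an ha.cf for a node whose peer has the given IP address.
# $1: IP address of the peer node
gen_ha_cf() {
    cat <<EOF
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 15
warntime 5
initdead 30
ucast eth1 $1
auto_failback off
node drbd1
node drbd2
EOF
}

# Typical use (not executed here):
#   on drbd1: gen_ha_cf 192.168.2.204 > /etc/ha.d/ha.cf
#   on drbd2: gen_ha_cf 192.168.2.203 > /etc/ha.d/ha.cf
```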
Edit the authentication file authkeys used between the two nodes and add the following: (drbd1, drbd2)
vi /etc/ha.d/authkeys
auth 1
1 crc
Set the permissions of the authentication file to 600:
chmod 600 /etc/ha.d/authkeys
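The crc method only detects corruption and does not authenticate the peer. Heartbeat's authkeys format also accepts sha1 with a shared secret, which may be preferable when the heartbeat link is not a dedicated cable. A sketch, with the function name `gen_authkeys` made up for this example and the secret supplied by you:

```shell
# Print an authkeys file that uses sha1 with the given shared secret.
# $1: shared secret, identical on both nodes
gen_authkeys() {
    printf 'auth 1\n1 sha1 %s\n' "$1"
}

# Typical use (not executed here); the secret below is a placeholder:
#   gen_authkeys "replace-with-a-long-random-string" > /etc/ha.d/authkeys
#   chmod 600 /etc/ha.d/authkeys
```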
Edit the cluster resource file: (drbd1, drbd2)
vi /etc/ha.d/haresources
drbd1 192.168.2.200/24/eth1 drbddisk::r0 Filesystem::/dev/drbd0::/drbd::ext4 mysqld
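The haresources line packs several things into one row: the preferred node, the virtual IP, and the resource scripts with their `::`-separated arguments, which Heartbeat starts from left to right. The sketch below simply breaks such a line apart for reading; the function name `explain_haresources` is hypothetical.

```shell
# Print one description line per field of a haresources entry.
# $1: a haresources line
explain_haresources() {
    set -- $1
    echo "preferred node: $1"
    shift
    for res in "$@"; do
        case "$res" in
            # check for "::" first, since script arguments may contain "/"
            *::*) echo "resource script: ${res%%::*} args: ${res#*::}" ;;
            */*)  echo "virtual IP: $res" ;;
            *)    echo "resource script: $res" ;;
        esac
    done
}

explain_haresources "drbd1 192.168.2.200/24/eth1 drbddisk::r0 Filesystem::/dev/drbd0::/drbd::ext4 mysqld"
```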
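Create the DRBD script file drbddisk: (both servers)
Note:
This is a major pitfall. A default yum install of Heartbeat does not create the drbddisk script under /etc/ha.d/resource.d/, perhaps because the packaged version is too new; as I recall this was not the case a couple of years ago. The file also cannot be found anywhere else on the local system after installation. I ran into this because the virtual IP could not be pinged after Heartbeat was started. The /var/log/ha-log log contained the line ERROR: Cannot locate resource script drbddisk, and /etc/ha.d/resource.d/ turned out to have no drbddisk script at all. I finally found the code via Google, created the script, and the test passed: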
vi /etc/ha.d/resource.d/drbddisk
#!/bin/bash
#
# This script is intended to be used as resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###

DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then
    . $DEFAULTFILE
fi

if [ "$#" -eq 2 ]; then
    RES="$1"
    CMD="$2"
else
    RES="all"
    CMD="$1"
fi

## EXIT CODES
# since this is a "legacy heartbeat R1 resource agent" script,
# exit codes actually do not matter that much as long as we conform to
#  http://wiki.linux-ha.org/HeartbeatResourceAgent
# but it does not hurt to conform to lsb init-script exit codes,
# where we can.
#  http://refspecs.linux-foundation.org/LSB_3.1.0/
#LSB-Core-generic/LSB-Core-generic/iniscrptact.html
####

drbd_set_role_from_proc_drbd()
{
    local out
    if ! test -e /proc/drbd; then
        ROLE="Unconfigured"
        return
    fi

    dev=$( $DRBDADM sh-dev $RES )
    minor=${dev#/dev/drbd}
    if [[ $minor = *[!0-9]* ]] ; then
        # sh-minor is only supported since drbd 8.3.1
        minor=$( $DRBDADM sh-minor $RES )
    fi
    if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
        ROLE=Unknown
        return
    fi

    if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
        set -- $out
        ROLE=${5%/**}
        : ${ROLE:=Unconfigured} # if it does not show up
    else
        ROLE=Unknown
    fi
}

case "$CMD" in
    start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        try=6
        while true; do
            $DRBDADM primary $RES && break
            let "--try" || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    stop)
        # heartbeat (haresources mode) will retry failed stop
        # for a number of times in addition to this internal retry.
        try=3
        while true; do
            $DRBDADM secondary $RES && break
            # We used to lie here, and pretend success for anything != 11,
            # to avoid the reboot on failed stop recovery for "simple
            # config errors" and such. But that is incorrect.
            # Don't lie to your cluster manager.
            # And don't do config errors...
            let --try || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    status)
        if [ "$RES" = "all" ]; then
            echo "A resource name is required for status inquiries."
            exit 10
        fi
        ST=$( $DRBDADM role $RES )
        ROLE=${ST%/**}
        case $ROLE in
            Primary|Secondary|Unconfigured)
                # expected
                ;;
            *)
                # unexpected. whatever...
                # If we are unsure about the state of a resource, we need to
                # report it as possibly running, so heartbeat can, after failed
                # stop, do a recovery by reboot.
                # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is
                # suddenly readonly. So we retry by parsing /proc/drbd.
                drbd_set_role_from_proc_drbd
        esac
        case $ROLE in
            Primary)
                echo "running (Primary)"
                exit 0 # LSB status "service is OK"
                ;;
            Secondary|Unconfigured)
                echo "stopped ($ROLE)"
                exit 3 # LSB status "service is not running"
                ;;
            *)
                # NOTE the "running" in below message.
                # this is a "heartbeat" resource script,
                # the exit code is _ignored_.
                echo "cannot determine status, may be running ($ROLE)"
                exit 4 # LSB status "service status is unknown"
                ;;
        esac
        ;;
    *)
        echo "Usage: drbddisk [resource] {start|stop|status}"
        exit 1
        ;;
esac

exit 0
Make the drbddisk file executable:
chmod 755 /etc/ha.d/resource.d/drbddisk
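Once heartbeat is configured, mysql must be removed from the services started at boot. When heartbeat starts on the primary it mounts the drbd filesystem and starts mysql; during a switchover it stops mysql and unmounts the filesystem on the primary, and the secondary then mounts the filesystem and starts mysql. So do the following (on both servers):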
chkconfig mysqld off
chkconfig heartbeat off
chkconfig drbd off
vi /etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local
modprobe drbd    # the module must be loaded first, which is why the start commands are placed here
/etc/init.d/drbd start
/etc/init.d/heartbeat start
This completes the heartbeat+drbd+mysql high-availability setup. Next comes testing.
High-availability test:
Start the mysql service on the first server (drbd1)
/etc/init.d/mysqld start
Starting MySQL.... SUCCESS!
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /drbd    ext4
Start heartbeat on both servers
[root@drbd1 ~]#/etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@drbd2 ~]#/etc/init.d/heartbeat start
[root@drbd1 ~]#ip addr | grep eth1
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.2.203/24 brd 192.168.2.255 scope global eth1
    inet 192.168.2.200/24 brd 192.168.2.255 scope global secondary eth1
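Test method:
1. Stop mysqld on the master and see whether a switchover happens (heartbeat does not check service availability, so this needs an additional script).
2. Stop heartbeat on the master and see whether the switchover works.
3. Take down the master's network, or shut the master system down outright, and see whether the switchover works.
4. Start heartbeat on the master again and see whether it switches back correctly.
5. Reboot the master and see whether the switchover goes smoothly.
Note: "switchover" here means checking whether mysql has been stopped, whether the filesystem has been unmounted, and so on.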
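Test item 1 needs an extra script because heartbeat itself does not watch mysqld. One possible shape for such a script is sketched below. Everything in it is an assumption rather than part of the setup above: the check uses `mysqladmin ping` from the install prefix used earlier, the reaction is to stop heartbeat on the local node so that the peer takes over, and the names `mysql_alive` and `monitor_once` as well as the retry settings are made up for this example.

```shell
# Commands are kept in variables so they can be swapped out for testing.
CHECK_CMD=${CHECK_CMD:-"/usr/local/mysql/bin/mysqladmin ping"}
FAILOVER_CMD=${FAILOVER_CMD:-"/etc/init.d/heartbeat stop"}
RETRIES=${RETRIES:-3}
RETRY_DELAY=${RETRY_DELAY:-2}

# Return 0 as soon as the check succeeds, 1 after RETRIES failed attempts.
mysql_alive() {
    left=$RETRIES
    while [ "$left" -gt 0 ]; do
        if $CHECK_CMD >/dev/null 2>&1; then
            return 0
        fi
        left=$((left - 1))
        if [ "$left" -gt 0 ]; then
            sleep "$RETRY_DELAY"
        fi
    done
    return 1
}

# Run one monitoring pass; hand over to the peer if mysql stays down.
monitor_once() {
    if mysql_alive; then
        echo "mysql ok"
    else
        echo "mysql down, stopping heartbeat to trigger a switchover"
        $FAILOVER_CMD
    fi
}
```

A pass like `monitor_once` could be run periodically, for example from cron, on whichever node currently holds the resources. Retrying a few times before handing over avoids a switchover on a single slow response.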
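Stop heartbeat on the master (drbd1) to test whether the switchover happens automatically. Apart from the first item, which cannot work as is, all of the tests above switch over: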
[root@drbd1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@drbd1 ~]# df -HT
Filesystem                   Type   Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root ext4   7.0G  4.4G  2.3G  66% /
tmpfs                        tmpfs  249M     0  249M   0% /dev/shm
/dev/sda1                    ext4   500M   29M  445M   6% /boot
[root@drbd1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@drbd1, 2015-12-29 16:21:36
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Secondary/Primary  UpToDate/UpToDate  C
Check whether the secondary server drbd2 has taken over:
[root@drbd2 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@dbrd2, 2015-12-29 16:14:14
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /drbd    ext4
[root@drbd2 ~]# df -HT
Filesystem                   Type   Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root ext4   7.0G  4.4G  2.3G  66% /
tmpfs                        tmpfs  249M     0  249M   0% /dev/shm
/dev/sda1                    ext4   500M   29M  445M   6% /boot
/dev/drbd0                   ext4    21G  162M   20G   1% /drbd
[root@drbd2 ~]# ip addr
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:71:c8:7e brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.204/24 brd 192.168.2.255 scope global eth1
    inet 192.168.2.200/24 brd 192.168.2.255 scope global secondary eth1
    inet6 fe80::20c:29ff:fe71:c87e/64 scope link
       valid_lft forever preferred_lft forever
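Whether the virtual IP has moved can also be checked from a script instead of by reading the `ip addr` output. A small sketch, with the function name `has_vip` made up for this example:

```shell
# Read `ip addr` output on stdin and return 0 if the given address is bound.
# $1: IP address to look for, without the prefix length
has_vip() {
    # -F: fixed string; the trailing "/" keeps 192.168.2.20 from matching 192.168.2.200
    grep -qF "inet $1/"
}

# Typical use on the node that should hold the address (not executed here):
#   ip addr show eth1 | has_vip 192.168.2.200 && echo "VIP is here"
```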