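What keepalived does
When building large websites, we usually put a load balancer (LVS, Nginx, HAProxy) in front to distribute requests across a backend service cluster.
The load-balancing node then becomes a single point of failure. To keep the system highly available, we can introduce keepalived, which joins several load-balancing nodes into one unit that serves clients as a whole and so removes the single point of failure.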
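How keepalived works
keepalived is built on VRRP (Virtual Router Redundancy Protocol), a router fault-tolerance protocol that is also called a backup routing protocol. A group of VRRP routers cooperates to form one virtual router, which appears to the outside as a single logical router with its own MAC address and a fixed IP address. At any time, one router in the group acts as the master node and the rest act as backup nodes. The master sends VRRP advertisements at a fixed interval. If the master fails and the backups stop receiving advertisements, the backup with the highest priority automatically becomes the new master and keeps serving traffic. The whole process is automatic and needs no manual intervention.
keepalived uses this mechanism to combine server nodes (load balancers, web servers) into one group that exposes a single virtual IP (VIP), which keeps the system highly available.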
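The election rule described above can be sketched in a few lines of shell. This is a simplified model, not keepalived's actual implementation: it ignores tie-breaking and preemption settings, and the `name priority alive` input format and the `elect_master` function are made up for illustration.

```shell
#!/usr/bin/env bash
# Simplified sketch of VRRP master election: among the nodes that are
# still alive, the one with the highest priority becomes master.
# Input (hypothetical format): one node per line, "name priority alive",
# where alive is 1 for a healthy node and 0 for a failed one.
elect_master() {
    awk '$3 == 1 && ($2 + 0) > best { best = $2 + 0; name = $1 }
         END { if (name != "") print name }'
}

# Both nodes healthy: web1 (priority 200) is elected.
printf 'web1 200 1\nweb2 100 1\n' | elect_master

# web1 has failed: web2 (priority 100) takes over.
printf 'web1 200 0\nweb2 100 1\n' | elect_master
```

The first call prints `web1` and the second prints `web2`, which mirrors the master/backup handover that keepalived performs for us.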
Installing keepalived
$ mkdir -p /opt/k8s/keepalive-haproxy
$ cd /opt/k8s/keepalive-haproxy
$ wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
$ tar -xzvf keepalived-2.0.20.tar.gz
$ cd /opt/k8s/keepalive-haproxy/keepalived-2.0.20
# Run configure; --prefix sets the installation directory
$ mkdir -p /opt/k8s/keepalived/
$ ./configure --prefix=/opt/k8s/keepalived/
$ make && make install
# Inspect the installation directory
$ ls /opt/k8s/keepalived/
bin etc sbin share
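A simple example
Let's combine keepalived with Tomcat to set up hot standby for a web server on two machines. The network topology is as follows:
- The system has two nodes: web1 at 192.168.0.114 (master) and web2 at 192.168.0.107 (backup). The virtual IP, which clients use to reach the service, is 192.168.0.100.
- Both nodes run the same web service on port 8086 and both have keepalived installed. Clients access the system through the virtual IP.
- At any moment the virtual IP belongs to exactly one node; the other node stands by as backup.
- While the master is healthy, keepalived on web1 announces that 192.168.0.100 maps to the MAC address of web1's network interface.
- Machines on the same network segment update their ARP tables so that 192.168.0.100 maps to web1's MAC address.
- When web1 fails, keepalived on web2 detects it and announces that 192.168.0.100 maps to the MAC address of web2's network interface.
- Other machines on the same segment, such as clients, update their ARP tables so that 192.168.0.100 maps to web2's MAC address.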
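To see the ARP update from a client's point of view, we can look at which MAC address the client has recorded for the VIP. The sketch below parses `ip neigh show` style output; the `mac_of` helper and the client interface name `eth0` are assumptions for illustration, while the two MAC addresses are those of web1 and web2 in this example.

```shell
#!/usr/bin/env bash
# mac_of IP: read `ip neigh show` output on stdin and print the MAC
# address (the field after "lladdr") recorded for IP.
mac_of() {
    awk -v ip="$1" '$1 == ip { for (i = 2; i < NF; i++) if ($i == "lladdr") print $(i + 1) }'
}

# Illustrative neighbour entries; "eth0" is a made-up client interface.
before='192.168.0.100 dev eth0 lladdr c4:8e:8f:d0:33:89 REACHABLE'
after='192.168.0.100 dev eth0 lladdr d0:c5:d3:57:73:01 REACHABLE'

echo "before failover: $(printf '%s\n' "$before" | mac_of 192.168.0.100)"
echo "after failover:  $(printf '%s\n' "$after" | mac_of 192.168.0.100)"
```

On a real client in the same segment, the equivalent check would be `ip neigh show | mac_of 192.168.0.100`.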
- Configure keepalived on the master node
$ mkdir /opt/k8s/keepalived/conf
$ cd /opt/k8s/keepalived/conf
$ cat > keepalived.conf<<EOF
global_defs {
    router_id web1
}
vrrp_instance VI_1 {
    state MASTER                # start as the master
    interface wlo1              # network interface used for VRRP
    virtual_router_id 55        # must be identical on master and backup
    priority 200                # master and backup use different priorities; the higher value wins
    advert_int 1                # VRRP multicast advertisement interval, in seconds
    authentication {
        auth_type PASS          # VRRP authentication type, must match on master and backup
        auth_pass 1111          # password
    }
    virtual_ipaddress {
        192.168.0.100/24        # the VRRP HA virtual address
    }
}
EOF
- Configure keepalived on the backup node
$ mkdir /opt/k8s/keepalived/conf
$ cd /opt/k8s/keepalived/conf
$ cat > keepalived.conf<<EOF
global_defs {
    router_id web2
}
vrrp_instance VI_1 {
    state BACKUP                # start as the backup
    interface wlp3s0            # network interface used for VRRP
    virtual_router_id 55        # must be identical on master and backup
    priority 100                # master and backup use different priorities; the higher value wins
    advert_int 1                # VRRP multicast advertisement interval, in seconds
    authentication {
        auth_type PASS          # VRRP authentication type, must match on master and backup
        auth_pass 1111          # password
    }
    virtual_ipaddress {
        192.168.0.100/24        # the VRRP HA virtual address
    }
}
EOF
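The two configurations differ only in router_id, state, interface and priority. One way to keep them in sync is to generate both from a single template. The `render_conf` function below is a hypothetical helper, not part of keepalived; it simply prints the same settings used in this example.

```shell
#!/usr/bin/env bash
# render_conf ROUTER_ID STATE INTERFACE PRIORITY
# Hypothetical helper: prints a keepalived.conf equivalent to the ones in
# this example, varying only the four per-node settings.
render_conf() {
    local router_id=$1 state=$2 iface=$3 priority=$4
    cat <<EOF
global_defs {
    router_id ${router_id}
}
vrrp_instance VI_1 {
    state ${state}
    interface ${iface}
    virtual_router_id 55
    priority ${priority}
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.0.100/24
    }
}
EOF
}

render_conf web1 MASTER wlo1 200      # master node
render_conf web2 BACKUP wlp3s0 100    # backup node
```

On each node the output would be redirected into /opt/k8s/keepalived/conf/keepalived.conf, for example `render_conf web1 MASTER wlo1 200 > keepalived.conf`.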
- Start the keepalived service on both the master and backup nodes
$ /opt/k8s/keepalived/sbin/keepalived -f /opt/k8s/keepalived/conf/keepalived.conf
- Check the logs
Master node (Ubuntu, so keepalived logs to syslog)
$ tail -f /var/log/syslog | grep Keepalived
Mar 13 17:21:42 slave Keepalived[11843]: Starting Keepalived v2.0.20 (01/22,2020)
Mar 13 17:21:42 slave Keepalived[11843]: Running on Linux 5.3.0-28-generic #30~18.04.1-Ubuntu SMP Fri Jan 17 06:14:09 UTC 2020 (built for Linux 4.15.18)
Mar 13 17:21:42 slave Keepalived[11843]: Command line: '/opt/k8s/keepalived/sbin/keepalived' '-f' '/opt/k8s/keepalived/conf/keepalived.conf'
Mar 13 17:21:42 slave Keepalived[11843]: Opening file '/opt/k8s/keepalived/conf/keepalived.conf'.
Mar 13 17:21:42 slave Keepalived[11843]: Remove a zombie pid file /run/keepalived.pid
Mar 13 17:21:42 slave Keepalived[11844]: Starting VRRP child process, pid=11845
Mar 13 17:21:42 slave Keepalived_vrrp[11845]: Registering Kernel netlink reflector
Mar 13 17:21:42 slave Keepalived_vrrp[11845]: Registering Kernel netlink command channel
Mar 13 17:21:42 slave Keepalived_vrrp[11845]: Opening file '/opt/k8s/keepalived/conf/keepalived.conf'.
Mar 13 17:21:42 slave Keepalived_vrrp[11845]: Registering gratuitous ARP shared channel
Mar 13 17:21:42 slave Keepalived_vrrp[11845]: (VI_1) Entering BACKUP STATE (init)
Mar 13 17:21:45 slave Keepalived_vrrp[11845]: (VI_1) Entering MASTER STATE
Backup node
$ tail -f /var/log/syslog | grep Keepalived
Mar 13 17:24:30 master Keepalived[3118]: Starting Keepalived v2.0.20 (01/22,2020)
Mar 13 17:24:30 master Keepalived[3118]: Running on Linux 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 (built for Linux 4.15.18)
Mar 13 17:24:30 master Keepalived[3118]: Command line: '/opt/k8s/keepalived/sbin/keepalived' '-f' '/opt/k8s/keepalived/conf/keepalived.conf'
Mar 13 17:24:30 master Keepalived[3118]: Opening file '/opt/k8s/keepalived/conf/keepalived.conf'.
Mar 13 17:24:30 master Keepalived[3118]: Remove a zombie pid file /run/keepalived.pid
Mar 13 17:24:30 master Keepalived[3120]: Starting VRRP child process, pid=3122
Mar 13 17:24:30 master Keepalived_vrrp[3122]: Registering Kernel netlink reflector
Mar 13 17:24:30 master Keepalived_vrrp[3122]: Registering Kernel netlink command channel
Mar 13 17:24:30 master Keepalived_vrrp[3122]: Opening file '/opt/k8s/keepalived/conf/keepalived.conf'.
Mar 13 17:24:30 master Keepalived_vrrp[3122]: Registering gratuitous ARP shared channel
Mar 13 17:24:30 master Keepalived_vrrp[3122]: (VI_1) Entering BACKUP STATE (init)
- Inspect the network interface on the master node
$ ip addr show wlo1
3: wlo1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether c4:8e:8f:d0:33:89 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.114/24 brd 192.168.0.255 scope global dynamic noprefixroute wlo1
       valid_lft 4948sec preferred_lft 4948sec
    inet 192.168.0.100/24 scope global secondary wlo1
       valid_lft forever preferred_lft forever
    inet6 fe80::db1e:449b:5431:7392/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
- The interface wlo1 now carries an additional address, 192.168.0.100/24
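Reading the full `ip addr` output by eye gets tedious once we start failing over back and forth. A small filter can answer the only question that matters here: does this interface currently hold the VIP? The `has_vip` function is a hypothetical helper written for this walkthrough.

```shell
#!/usr/bin/env bash
# has_vip VIP: read `ip addr show` output on stdin and succeed (exit 0)
# if VIP is listed as an IPv4 address, fail otherwise.
has_vip() {
    grep -qF "inet $1/"
}

# Sample lines taken from the output above; on a real node, pipe in
# `ip addr show wlo1` instead.
sample='    inet 192.168.0.114/24 brd 192.168.0.255 scope global dynamic noprefixroute wlo1
    inet 192.168.0.100/24 scope global secondary wlo1'

if printf '%s\n' "$sample" | has_vip 192.168.0.100; then
    echo "VIP present"
else
    echo "VIP absent"
fi
```

On the nodes themselves this becomes `ip addr show wlo1 | has_vip 192.168.0.100` on web1 and the same command with `wlp3s0` on web2.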
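- Deploy the web service
The application is built with Spring Boot and runs on embedded Tomcat. It exposes a RESTful endpoint, and the listening port is set to 8086.
@RequestMapping("/health")
public String health(HttpServletRequest request) {
    return "OK";
}
Start the Spring Boot application on both the master and backup nodes.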
- Access the service through the virtual IP
$ curl http://192.168.0.100:8086/health
OK
Simulating failure and recovery
- Stop the keepalived process on the master node
$ ps -elf | grep -e keepa -e PID
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S root 11844 1 0 80 0 - 4188 ep_pol 17:21 ? 00:00:00 /opt/k8s/keepalived/sbin/keepalived -f /opt/k8s/keepalived/conf/keepalived.conf
5 S root 11845 11844 0 80 0 - 4188 ep_pol 17:21 ? 00:00:00 /opt/k8s/keepalived/sbin/keepalived -f /opt/k8s/keepalived/conf/keepalived.conf
0 S root 19389 16663 0 80 0 - 5383 pipe_w 18:06 pts/0 00:00:00 grep --color=auto -e keepa -e PID
$ kill -9 11844
- Inspect the network interfaces on both nodes
Master node
$ ip addr show wlo1
3: wlo1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether c4:8e:8f:d0:33:89 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.114/24 brd 192.168.0.255 scope global dynamic noprefixroute wlo1
       valid_lft 5707sec preferred_lft 5707sec
    inet6 fe80::db1e:449b:5431:7392/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
Backup node
$ ip addr show wlp3s0
3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d0:c5:d3:57:73:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.107/24 brd 192.168.0.255 scope global dynamic noprefixroute wlp3s0
       valid_lft 4597sec preferred_lft 4597sec
    inet 192.168.0.100/24 scope global secondary wlp3s0
       valid_lft forever preferred_lft forever
    inet6 fe80::1fda:e90a:207a:67e4/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
- The virtual IP has moved to the backup node's interface
- Access the service through the virtual IP
$ curl http://192.168.0.100:8086/health
OK
- The service responds normally
- Restart the keepalived process on web1 and look at the wlo1 interface on the master node again: the virtual IP has automatically moved back.
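Script-based health checks
In the previous section we simulated the keepalived process going down and confirmed that the VIP migrates automatically. But if keepalived on the master keeps working and only the web service fails, the simple configuration above will not move the VIP, and the service becomes unreachable. The fix is to add a vrrp_script block to the keepalived configuration file.
Here is an example: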
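- Write the check script
$ mkdir /opt/k8s/keepalived/script
$ cd /opt/k8s/keepalived/script
$ cat > checkproxy.sh<<'EOF'
#!/bin/bash
# Count running haproxy processes; exit 0 if at least one exists, 1 otherwise.
count=$(ps aux | grep -v grep | grep haproxy | wc -l)
if [ "$count" -gt 0 ]; then
    exit 0
else
    exit 1
fi
EOF
$ chmod +x checkproxy.sh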
- Add the check to the configuration file
...
vrrp_script checkhaproxy {
    script "/opt/k8s/keepalived/script/checkproxy.sh"
    interval 3
    weight -150
}
vrrp_instance test {
    ...
    track_script {
        checkhaproxy
    }
    ...
}
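The script above looks for a haproxy process. For the Spring Boot service in this walkthrough, a check that actually calls the /health endpoint on port 8086 is arguably a closer fit. The following is only a sketch: the `check_web` function is a name made up here, and the local URL assumes the service listens on 127.0.0.1.

```shell
#!/usr/bin/env bash
# check_web URL: succeed (exit 0) if URL answers with an HTTP success
# status within 2 seconds; fail on connection errors, timeouts or HTTP
# error statuses.
check_web() {
    curl --fail --silent --max-time 2 --output /dev/null "$1"
}

# Assumed local address of the web service from this example.
if check_web "http://127.0.0.1:8086/health"; then
    echo "web service healthy"
else
    echo "web service unhealthy"
fi
```

When used as a vrrp_script, the script would end with the exit status of `check_web` instead of printing a message, since keepalived only looks at the return value.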
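- About weight in the check
- The check counts as successful when the script in vrrp_script returns 0; any other return value counts as a failure.
- When weight is positive, it is added to priority while the check succeeds and is not added while the check fails.
  - Master check fails: a switchover happens when master priority < backup priority + weight.
  - Master check succeeds: the master stays master as long as master priority + weight > backup priority + weight.
- When weight is negative, priority is unaffected while the check succeeds and becomes priority - abs(weight) while the check fails.
  - Master check fails: a switchover happens when master priority - abs(weight) < backup priority.
  - Master check succeeds: the master stays master as long as master priority > backup priority.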
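To make the rules above concrete, here is the arithmetic for this example's values (master priority 200, backup priority 100, weight -150). The `effective_priority` function is a hypothetical helper that only encodes the rules listed above; it does not model any range limits keepalived applies to priorities.

```shell
#!/usr/bin/env bash
# effective_priority BASE WEIGHT SCRIPT_OK
# SCRIPT_OK is 1 when the check script returned 0, and 0 otherwise.
#   positive weight: added to the priority only while the check succeeds
#   negative weight: subtracted from the priority only while the check fails
effective_priority() {
    local base=$1 weight=$2 ok=$3
    if [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $((base + weight))
    elif [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 200 -150 1    # master, check succeeds
effective_priority 200 -150 0    # master, check fails
effective_priority 100 -150 1    # backup, check succeeds
```

This prints 200, 50 and 100. With a failing check the master drops to 50, which is below the backup's 100, so the VIP moves to the backup.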