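On Solaris deployments, IPMP is commonly used to bind two network interfaces together for higher network availability. IPMP supports two failure detection modes: link-based and probe-based. Link-based IPMP is simple to configure and widely used. Whether link-based IPMP fails over automatically depends mainly on the link state recorded in the Solaris kernel. In some situations the network is already broken while the Solaris kernel still considers the link healthy. For example, the switch port connected to the host NIC may hang because of congestion: traffic no longer passes, yet Solaris still reports the link as up.
Normally, the kstat command can be used to obtain a NIC's link state. For example: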
root@jumpstart:/ #>kstat -p e1000g:0 | grep link_state
e1000g:0:mac:link_state 1
root@jumpstart:/ #>
root@jumpstart:/ #>kstat -m e1000g -i 0 -s link_state
module: e1000g instance: 0
name: mac class: net
link_state 1
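The link_state value is easier to use in scripts once it is mapped to a name. The following is a minimal sketch, assuming bash or another POSIX shell; the function names link_state_name and parse_link_state are hypothetical helpers, and the meanings of 0 and 4294967295 are taken from the scenarios observed later in this article.

```shell
# Map the numeric link_state kstat value to a readable name.
# 1 = up, 0 = down (cable unplugged), 4294967295 = unknown (after unplumb).
link_state_name() {
    case "$1" in
        1) echo up ;;
        0) echo down ;;
        4294967295) echo unknown ;;
        *) echo "unexpected($1)" ;;
    esac
}

# Read `kstat -p` style lines ("name<whitespace>value") on stdin and
# print the value of the first statistic whose name ends in :link_state.
parse_link_state() {
    while read -r name value; do
        case "$name" in
            *:link_state) echo "$value"; return 0 ;;
        esac
    done
    return 1
}

# Demo on a captured line; on a live system the input would come from
# something like: kstat -p e1000g:0 | parse_link_state
state=$(printf 'e1000g:0:mac:link_state 1\n' | parse_link_state)
echo "e1000g0 link is $(link_state_name "$state")"
```

Parsing a captured line keeps the sketch runnable on a machine without kstat; only the last two lines need to change for live use.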
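For most NICs, you can also judge the link state by checking whether the interface shows the RUNNING flag in the ifconfig output. If RUNNING is present, the link is up; otherwise the link is down.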
root@jumpstart:/ #>ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.2.45 netmask ffffff00 broadcast 192.168.2.255
ether 0:c:29:81:de:26
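Checking for the RUNNING flag can be scripted by looking at the names between the angle brackets of an ifconfig header line. This is a rough sketch, assuming bash or another POSIX shell; has_flag is a hypothetical helper name, not an existing command.

```shell
# Succeed if flag name $2 appears in the <...> list of ifconfig header line $1.
has_flag() {
    flags=${1#*<}        # drop everything up to the first "<"
    flags=${flags%%>*}   # drop everything from the first ">"
    case ",$flags," in
        *",$2,"*) return 0 ;;   # exact match between commas, so UP does not match LOOPBACK
        *)        return 1 ;;
    esac
}

# Demo on a header line captured from the output above.
line='e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2'
if has_flag "$line" RUNNING; then
    echo "RUNNING is set"
else
    echo "RUNNING is not set"
fi
```

Wrapping the flag list in commas before matching avoids false positives from flag names that contain another flag name as a substring.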
The following sections show the link state information in several scenarios. The IPMP configuration of the test environment is:
bash-3.00# more /etc/hostname.e1000g0
Oracle10g broadcast + group grp1 up
bash-3.00#
bash-3.00# more /etc/hostname.e1000g1
group grp1
1. Normal operation
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.9.30 netmask ffffff00 broadcast 192.168.9.255
groupname grp1
ether 0:c:29:80:b6:bd
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:c:29:80:b6:c7
In the normal case, e1000g0 is UP and RUNNING, and the IP address is assigned to e1000g0.
bash-3.00# kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 1
e1000g:0:mac:link_up 1
e1000g:0:statistics:link_speed 1000
Both link_state and link_up are 1.
2. Marking the interface down with ifconfig
bash-3.00# ifconfig e1000g0 down
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.9.30 netmask ffffff00 broadcast 192.168.9.255
groupname grp1
ether 0:c:29:80:b6:bd
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:c:29:80:b6:c7
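The UP flag is gone from e1000g0, but RUNNING is still present, which shows that the link state of the NIC has not changed. The host IP address is still on e1000g0 and has not failed over to e1000g1. The IP address can no longer be pinged.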
bash-3.00# kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 1
e1000g:0:mac:link_up 1
e1000g:0:statistics:link_speed 1000
bash-3.00#
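The NIC's link_state and link_up are unchanged (both 1), although the UP flag has disappeared from the ifconfig output. Note that link_up in kstat and the UP flag in ifconfig mean different things.
The ifconfig man page explains the UP and RUNNING flags as follows.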
RUNNING
Indicates that the required resources for an interface are allocated. For some interfaces this also indicates that the link is up.
UP
Indicates that the interface is up, that is, all the routing entries and the like for this interface have been set up.
The UP flag in ifconfig describes the interface, not the link. For some NICs, it is the RUNNING flag that indicates the link is up.
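The difference between the two flags is also visible in the hexadecimal flags= value: 1000843 and 1000842 differ only in the lowest bit. The sketch below decodes the two bits, assuming the values IFF_UP=0x1 and IFF_RUNNING=0x40 from the Solaris <net/if.h> header (an assumption, not stated in the original text); describe_flags is a hypothetical helper name.

```shell
# Bit values assumed from Solaris <net/if.h>.
IFF_UP=1         # 0x1
IFF_RUNNING=64   # 0x40

# Print whether UP and RUNNING are set in a hex flags value (without the 0x prefix).
describe_flags() {
    f=$(( 0x$1 ))
    up=no
    running=no
    [ $(( $f & $IFF_UP )) -ne 0 ] && up=yes
    [ $(( $f & $IFF_RUNNING )) -ne 0 ] && running=yes
    echo "UP=$up RUNNING=$running"
}

describe_flags 1000843    # normal operation
describe_flags 1000842    # after "ifconfig e1000g0 down"
```

Under these assumed bit values, the first call reports both flags set and the second reports only RUNNING, matching the flag names ifconfig prints in the two scenarios.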
3. Unplumbing the interface
First, restore e1000g0 with the up subcommand.
bash-3.00# ifconfig e1000g0 up
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 192.168.9.30 netmask ffffff00 broadcast 192.168.9.255
groupname grp1
ether 0:c:29:80:b6:bd
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:c:29:80:b6:c7
bash-3.00# kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 1
e1000g:0:mac:link_up 1
e1000g:0:statistics:link_speed 1000
The NIC is back to its normal state.
bash-3.00# ifconfig e1000g0 unplumb
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:c:29:80:b6:c7
After unplumb, ifconfig no longer shows e1000g0. However, the IP address has not failed over to e1000g1, and it can no longer be pinged.
bash-3.00# kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 4294967295
e1000g:0:mac:link_up 0
e1000g:0:statistics:link_speed 1000
In kstat, link_state has changed to 4294967295 and link_up has changed to 0. At this point dladm show-dev reports link: unknown. In fact, dladm takes its link information from the kstat output.
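The value 4294967295 is what -1 looks like when printed as an unsigned 32-bit integer, which is consistent with an "unknown" link state being stored as -1 in the kernel (that the constant is -1 is an assumption here, inferred from the number rather than stated in the original text). A small sketch of the conversion, assuming a shell with 64-bit arithmetic such as bash; to_signed32 is a hypothetical helper name:

```shell
# Reinterpret an unsigned 32-bit value as a signed 32-bit value.
to_signed32() {
    if [ "$1" -ge 2147483648 ]; then
        echo $(( $1 - 4294967296 ))   # subtract 2^32 when the top bit is set
    else
        echo "$1"
    fi
}

to_signed32 4294967295   # prints -1
to_signed32 1            # prints 1
```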
4. Plumbing the interface
bash-3.00# ifconfig e1000g0 plumb
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
inet 0.0.0.0 netmask 0
ether 0:c:29:80:b6:bd
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:c:29:80:b6:c7
After plumb, e1000g0 shows RUNNING but not UP, and the IP address has not been added back to e1000g0.
bash-3.00# kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 1
e1000g:0:mac:link_up 1
e1000g:0:statistics:link_speed 1000
Both link_state and link_up are 1 again, which shows that the link state is back to normal.
bash-3.00# svcadm restart physical
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
inet 192.168.9.30 netmask ffffff00 broadcast 192.168.9.255
groupname grp1
ether 0:c:29:80:b6:bd
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:c:29:80:b6:c7
bash-3.00# kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 1
e1000g:0:mac:link_up 1
e1000g:0:statistics:link_speed 1000
After the network service is restarted, e1000g0 is back to normal.
5. Unplugging the network cable
Before unplugging the cable:
root@node1 # ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.0.2.71 netmask ff000000 broadcast 10.255.255.255
groupname grp1
ether 0:21:28:59:71:de
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:21:28:59:71:df
root@node1 # kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 2
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 1
e1000g:0:mac:link_up 1
e1000g:0:statistics:link_speed 100
For e1000g0, link_state and link_up are both 1.
For e1000g1, link_state and link_up are also both 1.
root@node1 # kstat -p e1000g:1 | grep link
e1000g:1:mac:link_asmpause 1
e1000g:1:mac:link_autoneg 1
e1000g:1:mac:link_duplex 2
e1000g:1:mac:link_pause 1
e1000g:1:mac:link_state 1
e1000g:1:mac:link_up 1
e1000g:1:statistics:link_speed 100
After unplugging the cable from e1000g0:
root@node1 # ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
inet 0.0.0.0 netmask 0
groupname grp1
ether 0:21:28:59:71:de
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname grp1
ether 0:21:28:59:71:df
e1000g1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 10.0.2.71 netmask ff000000 broadcast 10.255.255.255
After the cable is unplugged from e1000g0, its flags change to NOFAILOVER,FAILED, and the address 10.0.2.71 now appears on e1000g1:1.
root@node1 # kstat -p e1000g:0 | grep link
e1000g:0:mac:link_asmpause 1
e1000g:0:mac:link_autoneg 1
e1000g:0:mac:link_duplex 1
e1000g:0:mac:link_pause 1
e1000g:0:mac:link_state 0
e1000g:0:mac:link_up 0
e1000g:0:statistics:link_speed 0
For e1000g0, link_state and link_up have both changed to 0.
root@node1 # kstat -p e1000g:1 | grep link
e1000g:1:mac:link_asmpause 1
e1000g:1:mac:link_autoneg 1
e1000g:1:mac:link_duplex 2
e1000g:1:mac:link_pause 1
e1000g:1:mac:link_state 1
e1000g:1:mac:link_up 1
e1000g:1:statistics:link_speed 100
After the cable is plugged back in, everything returns to its state before the cable was unplugged.
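Of the five scenarios, only the unplugged cable changes link_state to 0 and sets the FAILED flag, so that flag is a convenient thing to watch for. The following sketch lists the interfaces marked FAILED in ifconfig -a style output read from stdin. It assumes an awk that supports sub(); on Solaris 10 that would likely mean nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk. failed_interfaces is a hypothetical helper name.

```shell
# Print the name of every interface whose flag list contains FAILED.
failed_interfaces() {
    awk '/flags=[0-9a-f]+</ {
        name = $1
        sub(/:$/, "", name)           # "e1000g0:" -> "e1000g0"
        flags = $0
        sub(/^[^<]*</, "", flags)     # drop everything up to "<"
        sub(/>.*$/, "", flags)        # drop everything from ">"
        if (("," flags ",") ~ /,FAILED,/) print name
    }'
}

# Demo on header lines captured after the cable was unplugged;
# on a live system: ifconfig -a | failed_interfaces
failed_interfaces <<'EOF'
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
e1000g0: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
EOF
```

For the captured output above, the function prints e1000g0.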