ceph health detail
HEALTH_WARN 14 requests are blocked > 32 sec; 11 osds have slow requests
7 ops are blocked > 536871 sec
2 ops are blocked > 268435 sec
2 ops are blocked > 67108.9 sec
3 ops are blocked > 33554.4 sec
1 ops are blocked > 536871 sec on osd.0
1 ops are blocked > 536871 sec on osd.10
2 ops are blocked > 536871 sec on osd.12
2 ops are blocked > 268435 sec on osd.18
1 ops are blocked > 536871 sec on osd.31
1 ops are blocked > 536871 sec on osd.38
1 ops are blocked > 67108.9 sec on osd.38
1 ops are blocked > 33554.4 sec on osd.48
1 ops are blocked > 67108.9 sec on osd.52
1 ops are blocked > 536871 sec on osd.63
1 ops are blocked > 33554.4 sec on osd.64
1 ops are blocked > 33554.4 sec on osd.69
11 osds have slow requests
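To get just the list of affected OSDs out of output like the above, a small filter can help. This is a minimal sketch, not part of the original procedure: the function name `blocked_osds` is hypothetical, and it assumes the Jewel-era plain-text line format shown above ("... ops are blocked > N sec on osd.X").

```shell
# Hypothetical helper: print the distinct OSD ids that have blocked ops.
# Reads `ceph health detail` text on stdin and assumes per-OSD lines
# end with "on osd.<id>", as in the output above.
blocked_osds() {
    awk '/ops are blocked/ && $NF ~ /^osd\.[0-9]+$/ { print $NF }' \
        | sort -u -t. -k2,2n   # de-duplicate and order by numeric OSD id
}

# Usage (on a node with an admin keyring):
#   ceph health detail | blocked_osds
```

For the output above this lists 11 distinct OSDs, which agrees with the "11 osds have slow requests" summary line.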
At this point, check the OSD logs:
2019-04-26 13:28:44.132802 7f0d86e78700 -1 osd.24 4387 heartbeat_check: no reply from 0x7f0dcc3a5d10 osd.47 since back 2019-04-26 13:28:23.809860 front 2019-04-26 13:28:23.809860 (cutoff 2019-04-26 13:28:24.132797)
2019-04-26 13:28:44.395692 7f0da1b57700 -1 osd.24 4387 heartbeat_check: no reply from 0x7f0dcc3a5d10 osd.47 since back 2019-04-26 13:28:23.809860 front 2019-04-26 13:28:23.809860 (cutoff 2019-04-26 13:28:24.395682)
2019-04-26 13:28:44.635843 7f0d86e78700 -1 osd.24 4387 heartbeat_check: no reply from 0x7f0dcc3a5d10 osd.47 since back 2019-04-26 13:28:23.809860 front 2019-04-26 13:28:23.809860 (cutoff 2019-04-26 13:28:24.635838)
2019-04-26 13:28:45.139035 7f0d86e78700 -1 osd.24 4387 heartbeat_check: no reply from 0x7f0dcc3a5d10 osd.47 since back 2019-04-26 13:28:23.809860 front 2019-04-26 13:28:23.809860 (cutoff 2019-04-26 13:28:25.139031)
2019-04-26 13:28:45.395860 7f0da1b57700 -1 osd.24 4387 heartbeat_check: no reply from 0x7f0dcc3a5d10 osd.47 since back 2019-04-26 13:28:23.809860 front 2019-04-26 13:28:23.809860 (cutoff 2019-04-26 13:28:25.395849)
2019-04-26 13:28:45.641855 7f0d86e78700 -1 osd.24 4387 heartbeat_check: no reply from 0x7f0dcc3a5d10 osd.47 since back 2019-04-26 13:28:23.809860 front 2019-04-26 13:28:23.809860 (cutoff 2019-04-26 13:28:25.641850)
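When the log is long, it can be easier to count which peer is being reported rather than read every line. The following is a sketch under assumptions: the function name `heartbeat_no_reply_peers` is hypothetical, and it relies on the "heartbeat_check: no reply from <addr> osd.N since ..." line format shown above.

```shell
# Hypothetical helper: count heartbeat_check "no reply" lines per peer OSD.
# Reads OSD log text on stdin; prints "<count> <peer>", most-reported first.
heartbeat_no_reply_peers() {
    grep 'heartbeat_check: no reply from' \
        | sed -n 's/.*no reply from [^ ]* \(osd\.[0-9][0-9]*\) since.*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'   # normalise uniq -c padding
}

# Usage, assuming the default log location for osd.24:
#   heartbeat_no_reply_peers < /var/log/ceph/ceph-osd.24.log
```

If one peer dominates the counts across the logs of several different OSDs, that peer (or its host's network) is the likely culprit.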
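This indicates a network problem with osd.47: osd.24 has received no heartbeat reply from it on either the back or the front interface.
First try restarting the OSD service on the affected host. Note that the glob in the command below restarts every OSD unit on that host; ceph-osd@47.service would restart only osd.47: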
systemctl restart ceph-osd@*.service
If that does not help, reboot the machine. Then confirm that the cluster has recovered:
[root@controller01 ~]# ceph -s
cluster 30329309-3bff-470b-981f-5be63facde20
health HEALTH_OK
monmap e1: 3 mons at {node4=10.64.43.4:6789/0,node5=10.64.43.5:6789/0,node6=10.64.43.6:6789/0}
election epoch 5612, quorum 0,1,2 node4,node5,node6
fsmap e79: 1/1/1 up {0=node6=up:active}, 2 up:standby
osdmap e4529: 72 osds: 72 up, 72 in
flags sortbitwise,require_jewel_osds
pgmap v66545373: 3090 pgs, 5 pools, 3893 GB data, 948 kobjects
11578 GB used, 250 TB / 261 TB avail
3090 active+clean
client io 0 B/s rd, 9168 B/s wr, 1 op/s rd, 2 op/s wr
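The key lines to confirm in this output are HEALTH_OK and the osdmap line showing all OSDs up and in. As a rough sketch, the osdmap check could be scripted as below. The function name `all_osds_up_in` is hypothetical, and the parsing assumes the plain-text "N osds: N up, N in" format shown above; other Ceph releases format this line differently, so treat it as illustrative only.

```shell
# Hypothetical helper: succeed only if the osdmap line reports every OSD
# as both up and in. Reads `ceph -s` text on stdin.
all_osds_up_in() {
    awk '
        / osds: / {
            for (i = 2; i <= NF; i++) {
                if ($i == "osds:")  total = $(i - 1)
                if ($i ~ /^up,?$/)  n_up  = $(i - 1)
                if ($i ~ /^in,?$/)  n_in  = $(i - 1)
            }
            found = 1
        }
        # Exit 0 when an osdmap line was seen and the counts all match.
        END { exit !(found && total == n_up && total == n_in) }
    '
}

# Usage:
#   ceph -s | all_osds_up_in && echo "all OSDs up and in"
```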