  • Ceph: repairing an OSD that is down

    Attempt 1: reactivate all OSDs directly

    1. Check the OSD tree

    root@ceph01:~# ceph osd tree
    ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 0.29279 root default                                      
    -2 0.14639     host ceph01                                   
     0 0.14639         osd.0        up  1.00000          1.00000 
    -3 0.14639     host ceph02                                   
     1 0.14639         osd.1      down        0          1.00000 

    osd.1 is down.
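
    Before reactivating anything, it is worth checking why the daemon went down. A minimal check on the OSD host, assuming a systemd-managed Jewel deployment (the require_jewel_osds flag in the ceph -s output further down suggests this):

    # run on ceph02, the host carrying osd.1
    systemctl status ceph-osd@1
    journalctl -u ceph-osd@1 -n 50    # last 50 log lines from the OSD daemon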

    2. Activate all the OSDs again (note: all of them, not just the one that is down)

    In the command below, /dev/sdb1 is the actual storage disk or partition used by each OSD node.

    ceph-deploy osd activate  ceph01:/dev/sdb1 ceph02:/dev/sdb1
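
    ceph-deploy osd activate mounts each prepared data partition and starts the corresponding OSD daemon. If the partition is already mounted, starting the daemon directly on the OSD host can achieve the same thing; a minimal sketch, again assuming a systemd-managed install:

    # on ceph02; the systemd unit ceph-osd@1 corresponds to osd.1
    systemctl start ceph-osd@1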

    3. Check the OSD tree and cluster health

    root@ceph01:~/my-cluster# ceph osd tree
    ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 0.29279 root default                                      
    -2 0.14639     host ceph01                                   
     0 0.14639         osd.0        up  1.00000          1.00000 
    -3 0.14639     host ceph02                                   
     1 0.14639         osd.1        up  1.00000          1.00000 
    root@ceph01:~/my-cluster# 
    root@ceph01:~/my-cluster# ceph -s
        cluster ecacda71-af9f-46f9-a2a3-a35c9e51db9e
         health HEALTH_OK
         monmap e1: 1 mons at {ceph01=10.111.131.125:6789/0}
                election epoch 14, quorum 0 ceph01
         osdmap e150: 2 osds: 2 up, 2 in
                flags sortbitwise,require_jewel_osds
          pgmap v9284: 64 pgs, 1 pools, 17 bytes data, 3 objects
                10310 MB used, 289 GB / 299 GB avail
                      64 active+clean

    Only HEALTH_OK counts as healthy.
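
    When the status is not HEALTH_OK, two standard read-only commands show what is wrong:

    ceph health detail    # lists the specific warnings/errors behind the status
    ceph osd stat         # one-line summary: how many OSDs exist, are up, are in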

    Attempt 2: repair the down OSD

    This method applies mainly when an OSD cannot be activated, e.g. because its disk is physically damaged: the OSD is removed from the cluster and recreated (on a replacement disk, if necessary).

    1. Check the OSD tree

    root@ceph01:~# ceph osd tree
    ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 0.29279 root default                                      
    -2 0.14639     host ceph01                                   
     0 0.14639         osd.0        up  1.00000          1.00000 
    -3 0.14639     host ceph02                                   
     1 0.14639         osd.1      down        0          1.00000 

    Again, osd.1 is down.

    2. Mark osd.1 out

    root@ceph02:~# ceph osd out osd.1
    osd.1 is already out. 
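
    Note that "out" (excluded from data placement) and "down" (daemon not running) are independent states; a down OSD is marked out automatically after a timeout (about five minutes by default), which is why the command reports it is already out. If the daemon were somehow still running, stop it before removal; this is harmless if it is already stopped (assuming systemd):

    # on ceph02
    systemctl stop ceph-osd@1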

    3. Remove it from the cluster

    root@ceph02:~# ceph osd rm osd.1  
    removed osd.1

    4. Remove it from the CRUSH map

    root@ceph02:~# ceph osd crush rm osd.1 
    removed item id 1 name 'osd.1' from crush map

    5. Delete osd.1's authentication key

    root@ceph02:~# ceph auth del osd.1
    updated
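
    At this point osd.1 should be gone from both the OSD map and the auth database. A quick sanity check (the grep patterns assume the single-digit id used in this cluster):

    ceph osd dump | grep '^osd\.1 '               # expect no output
    ceph auth list 2>/dev/null | grep 'osd\.1'    # expect no output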

    6. Unmount the data partition

    umount /dev/sdb1
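
    If unmounting by device name fails, the partition can be unmounted by mountpoint instead; the path below assumes the default cluster name and data-directory layout:

    mount | grep sdb1                   # confirm where the partition is mounted
    umount /var/lib/ceph/osd/ceph-1     # same partition, addressed by mountpoint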

    7. Check the cluster's OSD tree again

    root@ceph02:~# ceph osd tree
    ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 0.14639 root default                                      
    -2 0.14639     host ceph01                                   
     0 0.14639         osd.0        up  1.00000          1.00000 
     -3       0     host ceph02    

    osd.1 is gone, and host ceph02 now has weight 0 because its only OSD has been removed.

    8. Log in to the ceph-deploy node

    root@ceph01:~# cd /root/my-cluster/
    root@ceph01:~/my-cluster# 

    9. Prepare the disk

    ceph-deploy --overwrite-conf osd  prepare ceph02:/dev/sdb1
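
    If prepare fails because the partition still carries metadata from the failed OSD, wiping it first may help. On Jewel-era ceph-deploy this is done with disk zap; it destroys everything on the target, so double-check the device name:

    ceph-deploy disk zap ceph02:/dev/sdb1    # destructive: wipes the partition
    ceph-deploy --overwrite-conf osd prepare ceph02:/dev/sdb1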

    10. Activate all the OSDs again (again: all of them, not just the rebuilt one)

    ceph-deploy osd activate  ceph01:/dev/sdb1 ceph02:/dev/sdb1

    11. Check the OSD tree and cluster health

    root@ceph01:~/my-cluster# ceph osd tree
    ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 0.29279 root default                                      
    -2 0.14639     host ceph01                                   
     0 0.14639         osd.0        up  1.00000          1.00000 
    -3 0.14639     host ceph02                                   
     1 0.14639         osd.1        up  1.00000          1.00000 
    root@ceph01:~/my-cluster# 
    root@ceph01:~/my-cluster# ceph -s
        cluster ecacda71-af9f-46f9-a2a3-a35c9e51db9e
         health HEALTH_OK
         monmap e1: 1 mons at {ceph01=10.111.131.125:6789/0}
                election epoch 14, quorum 0 ceph01
         osdmap e150: 2 osds: 2 up, 2 in
                flags sortbitwise,require_jewel_osds
          pgmap v9284: 64 pgs, 1 pools, 17 bytes data, 3 objects
                10310 MB used, 289 GB / 299 GB avail
                      64 active+clean

    Only HEALTH_OK counts as healthy.
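
    After a rebuild, the cluster may spend some time backfilling data onto the new OSD before it returns to HEALTH_OK; progress can be watched with standard commands:

    ceph -w         # stream cluster status and recovery events
    ceph pg stat    # one-line summary of placement-group states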

  • Original article: https://www.cnblogs.com/boshen-hzb/p/6796604.html