  • HEALTH_WARN too few PGs per OSD (21 < min 30): how to fix it

    Tags (space-separated): ceph, ceph operations, pg


    Cluster environment:

    [root@node3 ~]# cat /etc/redhat-release 
    CentOS Linux release 7.3.1611 (Core) 
    [root@node3 ~]# ceph -v
    ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
    

    Current cluster layout:

    [root@node3 ceph-6]# ceph osd tree
    ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
    -1       0.08844 root default                           
    -3       0.02948     host node1                         
     0   hdd 0.00980         osd.0      up  1.00000 1.00000 
     3   hdd 0.00980         osd.3      up  1.00000 1.00000 
    -5       0.02948     host node2                         
     1   hdd 0.00980         osd.1      up  1.00000 1.00000 
     4   hdd 0.00980         osd.4      up  1.00000 1.00000 
    -7       0.02948     host node3                         
     2   hdd 0.00980         osd.2      up  1.00000 1.00000 
     5   hdd 0.00980         osd.5      up  1.00000 1.00000 
    

    Add one more OSD to each host:

    To reproduce the "too few PGs" error, and at the same time to show how to create an OSD with an explicitly specified data location, we create a BlueStore OSD below with its data on /dev/sdd2 and its block.db on /dev/sdd1. Run the following steps on each host:

    Step 1: create a BlueStore OSD:

    [root@node2 ~]# ceph-disk prepare --bluestore /dev/sdd2 --block.db /dev/sdd1
    set_data_partition: incorrect partition UUID: cafecafe-9b03-4f30-b4c6-b4b80ceff106, expected ['4fbd7e29-9d25-41b8-afd0-5ec00ceff05d', '4fbd7e29-9d25-41b8-afd0-062c0ceff05d', '4fbd7e29-8ae0-4982-bf9d-5a8d867af560', '4fbd7e29-9d25-41b8-afd0-35865ceff05d']
    prepare_device: OSD will not be hot-swappable if block.db is not the same device as the osd data
    prepare_device: Block.db /dev/sdd1 was not prepared with ceph-disk. Symlinking directly.
    meta-data=/dev/sdd2              isize=2048   agcount=4, agsize=648895 blks
             =                       sectsz=512   attr=2, projid32bit=1
             =                       crc=1        finobt=0, sparse=0
    data     =                       bsize=4096   blocks=2595579, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
    log      =internal log           bsize=4096   blocks=2560, version=2
             =                       sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0
    

    Step 2: activate the OSD:

    [root@node2 ~]# ceph-disk activate /dev/sdd2
    creating /var/lib/ceph/tmp/mnt.mR3qCJ/keyring
    added entity osd.8 auth auth(auid = 18446744073709551615 key=AQBNqOVZt/iUBBAArkrWrZi9N0zxhHhYfhanyw== with 0 caps)
    got monmap epoch 1
    Removed symlink /etc/systemd/system/ceph-osd.target.wants/ceph-osd@8.service.
    Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@8.service to /usr/lib/systemd/system/ceph-osd@.service.
    

    Finally, check the cluster layout again; there are now 9 OSDs in total:

    [root@node3 ~]# ceph osd tree
    ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
    -1       0.08844 root default                           
    -3       0.02948     host node1                         
     0   hdd 0.00980         osd.0      up  1.00000 1.00000 
     3   hdd 0.00980         osd.3      up  1.00000 1.00000 
     7   hdd 0.00989         osd.7      up  1.00000 1.00000 
    -5       0.02948     host node2                         
     1   hdd 0.00980         osd.1      up  1.00000 1.00000 
     4   hdd 0.00980         osd.4      up  1.00000 1.00000 
     8   hdd 0.00989         osd.8      up  1.00000 1.00000 
    -7       0.02948     host node3                         
     2   hdd 0.00980         osd.2      up  1.00000 1.00000 
     5   hdd 0.00980         osd.5      up  1.00000 1.00000 
     6   hdd 0.00989         osd.6      up  1.00000 1.00000 
    

    Reproducing the "too few PGs" error:

    Create a pool with a small number of PGs:

    [root@node3 ~]# ceph osd pool create rbd 64 64 
    pool 'rbd' created
    [root@node3 ~]# rados lspools
    rbd
    [root@node3 ~]# ceph -s
      cluster:
        id:     b8b4aa68-d825-43e9-a60a-781c92fec20e
        health: HEALTH_WARN
                too few PGs per OSD (21 < min 30)
     
      services:
        mon: 1 daemons, quorum node1
        mgr: node1(active)
        osd: 9 osds: 9 up, 9 in
     
      data:
        pools:   1 pools, 64 pgs
        objects: 0 objects, 0 bytes
        usage:   9742 MB used, 82717 MB / 92459 MB avail
        pgs:     64 active+clean
    

    As the output shows, the warning says the number of PGs on each OSD is below the configured minimum of 30. The pool has 64 PGs and uses 3 replicas, so with 9 OSDs each OSD holds on average 64 × 3 / 9 ≈ 21 PG copies, which triggers the warning (21 < min 30).
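    The arithmetic behind the warning can be checked directly (a small sketch using this cluster's values: 64 PGs, replica size 3, 9 OSDs):

```python
# PG copies per OSD = pg_num * replica_size / osd_count
pg_num = 64        # PGs in the rbd pool
replica_size = 3   # replicated pool with size 3
osd_count = 9      # OSDs in this cluster

pg_copies = pg_num * replica_size      # 192 PG copies cluster-wide
pgs_per_osd = pg_copies // osd_count   # average PG copies per OSD

print(pgs_per_osd)  # 21, below the warning threshold of 30
```

    With pg_num raised to 128, the same arithmetic gives 128 × 3 / 9 ≈ 42 per OSD, which is back above the threshold.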

    If you store data and run I/O against the cluster in this state, the cluster can appear stuck and unresponsive to I/O, and large numbers of OSDs may go down.

    Solution: increase the PG count of the default rbd pool.

    [root@node3 ~]# ceph osd pool set rbd pg_num 128
    set pool 1 pg_num to 128
    

    Then check the cluster status:

    [root@node3 ~]# ceph -s
      cluster:
        id:     b8b4aa68-d825-43e9-a60a-781c92fec20e
        health: HEALTH_WARN
                Reduced data availability: 5 pgs inactive, 44 pgs peering
                Degraded data redundancy: 49 pgs unclean
                1 pools have pg_num > pgp_num
     
      services:
        mon: 1 daemons, quorum node1
        mgr: node1(active)
        osd: 9 osds: 9 up, 9 in
     
      data:
        pools:   1 pools, 128 pgs
        objects: 0 objects, 0 bytes
        usage:   9743 MB used, 82716 MB / 92459 MB avail
        pgs:     7.031% pgs unknown
                 38.281% pgs not active
                 70 active+clean
                 44 peering
                 9  unknown
                 5  activating
    

    The cluster is still not healthy: it warns that pg_num is greater than pgp_num, so pgp_num needs to be raised as well:

    [root@node3 ~]# ceph osd pool set rbd pgp_num 128
    set pool 1 pgp_num to 128
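    The two parameters play different roles: pg_num is the number of PGs objects hash into, while pgp_num is the number of PGs CRUSH treats as independent placement targets; the warning clears once they match. Object-to-PG hashing in Ceph goes through a "stable mod" so that non-power-of-two PG counts still work, which is also why pg_num values such as 128 are conventionally powers of two. A Python sketch of the C function ceph_stable_mod from the Ceph source (the hash inputs below are illustrative values, not real object hashes):

```python
def ceph_stable_mod(x, b, bmask):
    """Map hash value x into b buckets; bmask is next_power_of_two(b) - 1."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)

# b = 128 (a power of two): every residue 0..127 is its own bucket, evenly.
print(ceph_stable_mod(200, 128, 127))  # 200 & 127 = 72

# b = 96 (not a power of two): residues 96..127 fold back into buckets
# 32..63, so those buckets receive twice the share of objects.
print(ceph_stable_mod(100, 96, 127))   # 100 & 63 = 36
```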
    

    Check the cluster status again:

    [root@node3 ~]# ceph -s
      cluster:
        id:     b8b4aa68-d825-43e9-a60a-781c92fec20e
        health: HEALTH_OK
     
      services:
        mon: 1 daemons, quorum node1
        mgr: node1(active)
        osd: 9 osds: 9 up, 9 in
     
      data:
        pools:   1 pools, 128 pgs
        objects: 0 objects, 0 bytes
        usage:   9750 MB used, 82709 MB / 92459 MB avail
        pgs:     128 active+clean
    

    This was a simple experiment and the pool held no data, so changing the PG count had little impact. In a production cluster, however, changing pg_num later is far more disruptive: when the PG count changes, data across the whole cluster is rebalanced and migrated, and the more data there is, the longer I/O response suffers. It is therefore best to choose an appropriate pg_num when the pool is first created.
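    A commonly cited sizing rule from the Ceph placement-group documentation is: total PGs ≈ (OSDs × target PGs per OSD) / replica size, rounded up to the nearest power of two, with 100 as the usual per-OSD target. A small helper sketching this (the 100-per-OSD target is a rule of thumb, not a hard requirement):

```python
def suggest_pg_num(osd_count, replica_size, target_per_osd=100):
    """Suggest a pool pg_num: (osds * target) / replicas, rounded up to a power of 2."""
    raw = osd_count * target_per_osd / replica_size
    pg_num = 1
    while pg_num < raw:
        pg_num *= 2
    return pg_num

print(suggest_pg_num(9, 3))  # 9 * 100 / 3 = 300 -> rounds up to 512
```

    For the 9-OSD, 3-replica cluster above this suggests 512 PGs; even aiming only at the warning threshold of 30 per OSD, 128 PGs (the value used in this article) is the smallest power of two that clears it.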

    References:

    HEALTH_WARN too few PGs per OSD (16 < min 30)

  • Original article: https://www.cnblogs.com/sisimi/p/7837403.html