  • Ceph Jewel 10.2.3 Environment Deployment

    Ceph Test Environment Deployment

    Overview of this document

    • Test environment Ceph cluster deployment planning
    • Test environment Ceph cluster deployment procedure and block device usage workflow
    • How to scale out mon nodes and osd nodes
    • Common problems and their solutions

    Since object storage is not being used for now, the object storage gateway has not been configured yet.

    Answering the question: why is Ceph used with Docker?

    Each machine in the environment has a 1 TB data disk mounted. To make full use of the space on all the data disks in the cluster, Ceph is used to build a distributed storage environment that joins the data disks together so they can be treated as a single disk. This mainly relies on Ceph's block storage capability.

    Cluster Deployment Planning

    Host role planning

    Hostname                  OS               Kernel      IP Address     Roles / Deployed Services
    docker-rancher-server     CentOS 7.1.1503  3.10.0-229  10.142.246.2   mon, osd
    docker-rancher-client1    CentOS 7.1.1503  3.10.0-229  10.142.246.3   mon, osd
    docker-rancher-client2    CentOS 7.1.1503  3.10.0-229  10.142.246.4   osd
    hub.chinatelecom.cn       CentOS 7.1.1503  3.10.0-229  10.142.246.5   osd

    Deployment architecture diagram

    Cluster Base Environment Preparation

    The base environment setup is required on every node. The following uses docker-rancher-server as the example; the other three machines are configured the same way.

    0. Check the system version

    All four machines are identical virtual machines; the version information of one of them is as follows:

    [op@docker-rancher-server ~]$ cat /etc/redhat-release 
    CentOS Linux release 7.1.1503 (Core) 
    [op@docker-rancher-server ~]$ uname -r
    3.10.0-229.el7.x86_64
    

    1. Set up name resolution

    [op@docker-rancher-server ~]$ cat /etc/hosts
    10.142.246.2  docker-rancher-server
    10.142.246.3  docker-rancher-client1
    10.142.246.4  docker-rancher-client2
    10.142.246.5  hub.chinatelecom.cn    hub
    

    2. Firewall policy

    Ports used by Ceph by default

    Ceph Monitors communicate on port 6789 by default, and OSDs communicate on ports in the 6800:7300 range by default. CentOS 7 uses firewalld as its firewall by default, but we have already switched to iptables, so the corresponding ports are opened directly in iptables.

    # Command format
    sudo iptables -A INPUT -i {iface} -p tcp -s {ip-address}/{netmask} --dport 6789 -j ACCEPT
    # In practice
    [op@docker-rancher-server ~]$ sudo iptables -A INPUT -i eth0 -p tcp -s 10.142.0.0/16 --dport 6789 -j ACCEPT
    [op@docker-rancher-server ~]$ sudo iptables -A INPUT -i eth0 -p tcp -s 10.142.0.0/16 --dport 6800:7300 -j ACCEPT
    # Verify
    [op@docker-rancher-server ~]$ sudo iptables -L
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination         
    ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:zabbix-agent
    ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:8514
    ACCEPT     udp  --  anywhere             anywhere             udp dpt:8514
    ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:shell
    ACCEPT     udp  --  anywhere             anywhere             udp dpt:ipsec-nat-t
    ACCEPT     udp  --  anywhere             anywhere             udp dpt:isakmp
    ACCEPT     tcp  --  10.142.0.0/16        anywhere             tcp dpt:smc-https
    ACCEPT     tcp  --  10.142.0.0/16        anywhere             tcp dpts:6800:7300
    # Save the current rules
    [op@docker-rancher-server ~]$ sudo service iptables save
    iptables: Saving firewall rules to /etc/sysconfig/iptables:[  确定  ]
    

    You can see that there are two rules opening the corresponding ports.

    3. NTP time synchronization

    docker-rancher-server is used as the NTP time server; the other three machines synchronize their time against docker-rancher-server.

    # Install the NTP service; required on all nodes
    [op@docker-rancher-server ~]$ sudo yum install ntp -y
    [op@docker-rancher-server ~]$ sudo vim /etc/ntp.conf 
    # Allow other machines on the internal network to synchronize time from this server
    restrict 10.142.0.0 mask 255.255.0.0 nomodify notrap
    
    server 10.142.246.2
    
    # When the external time servers are unavailable, serve time from the local clock
    server  127.127.1.0     # local clock
    fudge   127.127.1.0 stratum 10
    
    # Also comment out all the other server entries
    
    # Start the service
    [op@docker-rancher-server ~]$ sudo systemctl restart  ntpd.service
    
    # Wait a few minutes, then you should see
    [op@docker-rancher-server ~]$ ntpstat
    synchronised to local net at stratum 11 
       time correct to within 7948 ms
       polling server every 64 s
    [op@docker-rancher-server ~]$ ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     docker-rancher- .INIT.          16 u    -   64    0    0.000    0.000   0.000
    *LOCAL(0)        .LOCL.          10 l   19   64    3    0.000    0.000   0.000
    
    # Distribute the configuration file to the other nodes
    # Start the service on them as well
    [op@docker-rancher-server ~]$ sudo systemctl restart  ntpd.service
    # Syncing can take quite a while; you can move on to the later steps and check back afterwards
    [op@docker-rancher-client1 ~]$ ntpstat
    synchronised to NTP server (10.142.246.2) at stratum 12 
       time correct to within 29 ms
       polling server every 1024 s
    [op@docker-rancher-client1 ~]$ ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *docker-rancher- LOCAL(0)        11 u   51  512  377    1.700    1.735   1.302
     LOCAL(0)        .LOCL.          10 l 220m   64    0    0.000    0.000   0.000
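
    If you do not want to wait for ntpd to slowly converge on the clients, one option (a sketch; assumes the ntpdate utility is installed and that briefly stopping ntpd on the client is acceptable) is to force an initial sync and then start ntpd again:

    # On each client node
    sudo systemctl stop ntpd
    sudo ntpdate -u 10.142.246.2
    sudo systemctl start ntpd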
    

    4. Add the epel and ceph repositories

    For how the epel and ceph repositories are built, see the separate article on syncing the various repositories.

    The epel repository was already added earlier; check its configuration:

    [op@docker-rancher-server yum.repos.d]$ sudo vim /etc/yum.repos.d/epel.repo 
    [epel]
    name=Extra Packages for Enterprise Linux 7 - x86_64
    baseurl=http://10.142.78.40/epel/7/x86_64
    failovermethod=priority
    enabled=1
    gpgcheck=1
    gpgkey=http://10.142.78.40/epel/RPM-GPG-KEY-EPEL-7
    
    [epel-debuginfo]
    name=Extra Packages for Enterprise Linux 7 - x86_64 - Debug
    baseurl=http://10.142.78.40/epel/7/x86_64/debug
    failovermethod=priority
    enabled=0
    gpgkey=http://10.142.78.40/epel/RPM-GPG-KEY-EPEL-7
    gpgcheck=1
    priority=2
    
    # Verify
    [op@docker-rancher-server ~]$ yum repolist
    已加载插件:fastestmirror
    Loading mirror speeds from cached hostfile
    源标识    源名称                                            状态
    base      RHEL-7 - Base - http                               8,652
    epel      Extra Packages for Enterprise Linux 7 - x86_64    10,846
    updates   CentOS-7 - Updates                                 3,723
    repolist: 23,221
    

    The Ceph repository has also been mirrored internally; add it:

    [op@docker-rancher-server ~]$ sudo vim /etc/yum.repos.d/ceph.repo
    [ceph]
    name=Ceph packages for x86_64
    baseurl=http://10.142.78.40/ceph/rpm-jewel/el7/x86_64
    enabled=1
    gpgcheck=1
    type=rpm-md
    gpgkey=http://10.142.78.40/ceph/keys/release.asc
    priority=1
    
    [ceph-noarch]
    name=Ceph noarch packages
    baseurl=http://10.142.78.40/ceph/rpm-jewel/el7/noarch
    enabled=1
    gpgcheck=1
    type=rpm-md
    gpgkey=http://10.142.78.40/ceph/keys/release.asc
    priority=1
    
    [ceph-source]
    name=Ceph source packages
    baseurl=http://10.142.78.40/ceph/rpm-jewel/el7/SRPMS
    enabled=1
    gpgcheck=1
    type=rpm-md
    gpgkey=http://10.142.78.40/ceph/keys/release.asc
    priority=1
    
    # Verify
    [op@docker-rancher-server ~]$ yum repolist
    已加载插件:fastestmirror
    Loading mirror speeds from cached hostfile
    源标识      源名称                                          状态
    base        RHEL-7 - Base - http                             8,652
    ceph        Ceph packages for x86_64                           231
    ceph-noarch Ceph noarch packages                                12
    ceph-source Ceph source packages                                 0
    epel        Extra Packages for Enterprise Linux 7 - x86_64  10,846
    updates     CentOS-7 - Updates                               3,723
    repolist: 23,464
    

    5. Create a user other than ceph

    The company's servers already have an op user by default, so there is no need to create another one.

    In addition, be sure to grant this user sudo privileges; a minimal example is sketched below.
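
    One common way to do this (a sketch, assuming the op user already exists on every node) is a drop-in sudoers file:

    # Run on every node; grants the op user passwordless sudo
    echo "op ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/op
    sudo chmod 0440 /etc/sudoers.d/op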

    6. Passwordless SSH access between nodes

    This part is straightforward and is not covered in detail; a rough sketch is shown below.
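
    A minimal sketch (assuming the op user on every node, run from the admin node):

    # Generate a key pair on the admin node (accept the defaults, empty passphrase)
    ssh-keygen
    # Copy the public key to each node
    ssh-copy-id op@docker-rancher-server
    ssh-copy-id op@docker-rancher-client1
    ssh-copy-id op@docker-rancher-client2
    ssh-copy-id op@hub.chinatelecom.cn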

    In addition, the official documentation recommends configuring the ~/.ssh/config file. That way ceph-deploy can log in to the Ceph nodes as the user you created without specifying --username {username} every time you run ceph-deploy. It also simplifies the use of ssh and scp.

    [op@docker-rancher-server ~]$ vim ~/.ssh/config
    Host ceph-node1   # effectively an alias
       Hostname docker-rancher-server  # actual hostname
       User op                         # user used for the actual connection
    Host ceph-node2
       Hostname docker-rancher-client1
       User op
    Host ceph-node3
       Hostname docker-rancher-client2
       User op
    Host ceph-node4
       Hostname hub.chinatelecom.cn
       User op
       
    # Change the permissions; this is required, otherwise it will not work
    [op@docker-rancher-server ~]$ chmod 600 .ssh/*
    
    # The point of this config: for example, if I am logged in as root, with this config file in place I can still connect to the nodes as op
    

    7. The requiretty setting

    Running ceph-deploy on CentOS and RHEL may fail if your Ceph nodes have requiretty set by default. Run sudo visudo to disable it: find the Defaults requiretty option and change it to Defaults:ceph !requiretty, or simply comment it out, so that ceph-deploy can connect with the user created earlier (the user created for deploying Ceph).

    # Run on all nodes; just comment the line out
    [op@docker-rancher-server ~]$ sudo vim /etc/sudoers
    # Defaults    requiretty
    

    8. Disable SELinux

    vim /etc/selinux/config
    SELINUX=disabled
    # Take effect immediately
    [op@ceph-node1 ~]$ sudo setenforce 0
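
    # To double-check (quick sketch): getenforce should report Permissive right after setenforce 0,
    # and Disabled after the next reboot
    getenforce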
    

    9. Install ceph-deploy

    yum install ceph-deploy -y
    

    Cluster Environment Deployment

    Reference: the official Ceph website

    The following operations are performed on the admin node.

    Important: if you are logged in as a different, unprivileged user, do not run ceph-deploy with sudo or as root, because it will not invoke the sudo commands needed on the remote hosts.

    1. Create the cluster

    # Create a cluster directory; several configuration files will be generated in it
    [op@docker-rancher-server ~]$ mkdir ceph && cd ceph
    # Create the monitors (at least one)
    [op@ceph-node1 ceph]$ ceph-deploy new docker-rancher-server docker-rancher-client1
    
    # Verify that the configuration files were generated
    [op@docker-rancher-server ceph]$ ls
    ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring
    
    # Modify the default configuration file
    vim ceph.conf
    osd pool default min size = 2
    osd pool default size = 3
    
    # With multiple NICs, data traffic can be configured to run over the 10GbE NIC; the test environment does not have that for now
    # mon_clock_drift_allowed=5  # in seconds
    # osd_pool_default_crush_rule=0
    # osd_crush_chooseleaf_type=1
    # public network=10.10.0.0/24  # client-facing network
    # cluster network=192.168.0.0/24 # internal cluster network
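
    After these edits, the ceph.conf generated by ceph-deploy new should look roughly like the sketch below (the fsid and monitor addresses are the ones that appear later in the deployment logs; the exact contents of your file may differ slightly):

    [global]
    fsid = ef81681c-ee15-412e-a752-2c3e87b9e369
    mon_initial_members = docker-rancher-server, docker-rancher-client1
    mon_host = 10.142.246.2,10.142.246.3
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    osd pool default min size = 2
    osd pool default size = 3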
    

    2. Install Ceph

    [op@docker-rancher-server ceph]$ ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
    [ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
    [ceph_deploy.cli][INFO  ] ceph-deploy options:
    [ceph_deploy.cli][INFO  ]  verbose                       : False
    [ceph_deploy.cli][INFO  ]  testing                       : None
    [ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f87fd8a7d40>
    [ceph_deploy.cli][INFO  ]  cluster                       : ceph
    [ceph_deploy.cli][INFO  ]  dev_commit                    : None
    [ceph_deploy.cli][INFO  ]  install_mds                   : False
    [ceph_deploy.cli][INFO  ]  stable                        : None
    [ceph_deploy.cli][INFO  ]  default_release               : False
    [ceph_deploy.cli][INFO  ]  username                      : None
    [ceph_deploy.cli][INFO  ]  adjust_repos                  : True
    [ceph_deploy.cli][INFO  ]  func                          : <function install at 0x7f87fe79a578>
    [ceph_deploy.cli][INFO  ]  install_all                   : False
    [ceph_deploy.cli][INFO  ]  repo                          : False
    [ceph_deploy.cli][INFO  ]  host                          : ['docker-rancher-server', 'docker-rancher-client1', 'docker-rancher-client2', 'hub.chinatelecom.cn']
    [ceph_deploy.cli][INFO  ]  install_rgw                   : False
    [ceph_deploy.cli][INFO  ]  install_tests                 : False
    [ceph_deploy.cli][INFO  ]  repo_url                      : None
    [ceph_deploy.cli][INFO  ]  ceph_conf                     : None
    [ceph_deploy.cli][INFO  ]  install_osd                   : False
    [ceph_deploy.cli][INFO  ]  version_kind                  : stable
    [ceph_deploy.cli][INFO  ]  install_common                : False
    [ceph_deploy.cli][INFO  ]  overwrite_conf                : False
    [ceph_deploy.cli][INFO  ]  quiet                         : False
    [ceph_deploy.cli][INFO  ]  dev                           : master
    [ceph_deploy.cli][INFO  ]  nogpgcheck                    : False
    [ceph_deploy.cli][INFO  ]  local_mirror                  : None
    [ceph_deploy.cli][INFO  ]  release                       : None
    [ceph_deploy.cli][INFO  ]  install_mon                   : False
    [ceph_deploy.cli][INFO  ]  gpg_url                       : None
    [ceph_deploy.install][DEBUG ] Installing stable version jewel on cluster ceph hosts docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
    [ceph_deploy.install][DEBUG ] Detecting platform for host docker-rancher-server ...
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [ceph_deploy.install][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [docker-rancher-server][INFO  ] installing Ceph on docker-rancher-server
    ******
    [hub.chinatelecom.cn][INFO  ] Running command: sudo ceph --version
    [hub.chinatelecom.cn][DEBUG ] ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
    

    Initialize the cluster

    [op@docker-rancher-server ceph]$ ceph-deploy mon create-initial
    [ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy mon create-initial
    [ceph_deploy.cli][INFO  ] ceph-deploy options:
    [ceph_deploy.cli][INFO  ]  username                      : None
    [ceph_deploy.cli][INFO  ]  verbose                       : False
    [ceph_deploy.cli][INFO  ]  overwrite_conf                : False
    [ceph_deploy.cli][INFO  ]  subcommand                    : create-initial
    [ceph_deploy.cli][INFO  ]  quiet                         : False
    [ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0xd5c710>
    [ceph_deploy.cli][INFO  ]  cluster                       : ceph
    [ceph_deploy.cli][INFO  ]  func                          : <function mon at 0xd541b8>
    [ceph_deploy.cli][INFO  ]  ceph_conf                     : None
    [ceph_deploy.cli][INFO  ]  default_release               : False
    [ceph_deploy.cli][INFO  ]  keyrings                      : None
    [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts docker-rancher-server docker-rancher-client1
    [ceph_deploy.mon][DEBUG ] detecting platform for host docker-rancher-server ...
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [docker-rancher-server][DEBUG ] find the location of an executable
    [ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.1.1503 Core
    [docker-rancher-server][DEBUG ] determining if provided host has same hostname in remote
    [docker-rancher-server][DEBUG ] get remote short hostname
    [docker-rancher-server][DEBUG ] deploying mon to docker-rancher-server
    [docker-rancher-server][DEBUG ] get remote short hostname
    [docker-rancher-server][DEBUG ] remote hostname: docker-rancher-server
    [docker-rancher-server][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [docker-rancher-server][DEBUG ] create the mon path if it does not exist
    [docker-rancher-server][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-docker-rancher-server/done
    [docker-rancher-server][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-docker-rancher-server/done
    [docker-rancher-server][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-docker-rancher-server.mon.keyring
    [docker-rancher-server][DEBUG ] create the monitor keyring file
    [docker-rancher-server][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i docker-rancher-server --keyring /var/lib/ceph/tmp/ceph-docker-rancher-server.mon.keyring --setuser 167 --setgroup 167
    [docker-rancher-server][DEBUG ] ceph-mon: mon.noname-a 10.142.246.2:6789/0 is local, renaming to mon.docker-rancher-server
    [docker-rancher-server][DEBUG ] ceph-mon: set fsid to ef81681c-ee15-412e-a752-2c3e87b9e369
    [docker-rancher-server][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-docker-rancher-server for mon.docker-rancher-server
    [docker-rancher-server][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-docker-rancher-server.mon.keyring
    [docker-rancher-server][DEBUG ] create a done file to avoid re-doing the mon deployment
    [docker-rancher-server][DEBUG ] create the init path if it does not exist
    [docker-rancher-server][INFO  ] Running command: sudo systemctl enable ceph.target
    [docker-rancher-server][INFO  ] Running command: sudo systemctl enable ceph-mon@docker-rancher-server
    [docker-rancher-server][WARNIN] Created symlink from /etc/systemd/system/ceph-mon.target.wants/ceph-mon@docker-rancher-server.service to /usr/lib/systemd/system/ceph-mon@.service.
    [docker-rancher-server][INFO  ] Running command: sudo systemctl start ceph-mon@docker-rancher-server
    [docker-rancher-server][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
    [docker-rancher-server][DEBUG ] ********************************************************************************
    [docker-rancher-server][DEBUG ] status for monitor: mon.docker-rancher-server
    [docker-rancher-server][DEBUG ] {
    [docker-rancher-server][DEBUG ]   "election_epoch": 0, 
    [docker-rancher-server][DEBUG ]   "extra_probe_peers": [
    [docker-rancher-server][DEBUG ]     "10.142.246.3:6789/0"
    [docker-rancher-server][DEBUG ]   ], 
    [docker-rancher-server][DEBUG ]   "monmap": {
    [docker-rancher-server][DEBUG ]     "created": "2016-11-28 12:38:30.861132", 
    [docker-rancher-server][DEBUG ]     "epoch": 0, 
    [docker-rancher-server][DEBUG ]     "fsid": "ef81681c-ee15-412e-a752-2c3e87b9e369", 
    [docker-rancher-server][DEBUG ]     "modified": "2016-11-28 12:38:30.861132", 
    [docker-rancher-server][DEBUG ]     "mons": [
    [docker-rancher-server][DEBUG ]       {
    [docker-rancher-server][DEBUG ]         "addr": "10.142.246.2:6789/0", 
    [docker-rancher-server][DEBUG ]         "name": "docker-rancher-server", 
    [docker-rancher-server][DEBUG ]         "rank": 0
    [docker-rancher-server][DEBUG ]       }, 
    [docker-rancher-server][DEBUG ]       {
    [docker-rancher-server][DEBUG ]         "addr": "0.0.0.0:0/1", 
    [docker-rancher-server][DEBUG ]         "name": "docker-rancher-client1", 
    [docker-rancher-server][DEBUG ]         "rank": 1
    [docker-rancher-server][DEBUG ]       }
    [docker-rancher-server][DEBUG ]     ]
    [docker-rancher-server][DEBUG ]   }, 
    [docker-rancher-server][DEBUG ]   "name": "docker-rancher-server", 
    [docker-rancher-server][DEBUG ]   "outside_quorum": [
    [docker-rancher-server][DEBUG ]     "docker-rancher-server"
    [docker-rancher-server][DEBUG ]   ], 
    [docker-rancher-server][DEBUG ]   "quorum": [], 
    [docker-rancher-server][DEBUG ]   "rank": 0, 
    [docker-rancher-server][DEBUG ]   "state": "probing", 
    [docker-rancher-server][DEBUG ]   "sync_provider": []
    [docker-rancher-server][DEBUG ] }
    [docker-rancher-server][DEBUG ] ********************************************************************************
    [docker-rancher-server][INFO  ] monitor: mon.docker-rancher-server is running
    [docker-rancher-server][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
    [ceph_deploy.mon][DEBUG ] detecting platform for host docker-rancher-client1 ...
    [docker-rancher-client1][DEBUG ] connection detected need for sudo
    [docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1 
    [docker-rancher-client1][DEBUG ] detect platform information from remote host
    [docker-rancher-client1][DEBUG ] detect machine type
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.1.1503 Core
    [docker-rancher-client1][DEBUG ] determining if provided host has same hostname in remote
    [docker-rancher-client1][DEBUG ] get remote short hostname
    [docker-rancher-client1][DEBUG ] deploying mon to docker-rancher-client1
    [docker-rancher-client1][DEBUG ] get remote short hostname
    [docker-rancher-client1][DEBUG ] remote hostname: docker-rancher-client1
    [docker-rancher-client1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [docker-rancher-client1][DEBUG ] create the mon path if it does not exist
    [docker-rancher-client1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-docker-rancher-client1/done
    [docker-rancher-client1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-docker-rancher-client1/done
    [docker-rancher-client1][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-docker-rancher-client1.mon.keyring
    [docker-rancher-client1][DEBUG ] create the monitor keyring file
    [docker-rancher-client1][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i docker-rancher-client1 --keyring /var/lib/ceph/tmp/ceph-docker-rancher-client1.mon.keyring --setuser 167 --setgroup 167
    [docker-rancher-client1][DEBUG ] ceph-mon: mon.noname-b 10.142.246.3:6789/0 is local, renaming to mon.docker-rancher-client1
    [docker-rancher-client1][DEBUG ] ceph-mon: set fsid to ef81681c-ee15-412e-a752-2c3e87b9e369
    [docker-rancher-client1][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-docker-rancher-client1 for mon.docker-rancher-client1
    [docker-rancher-client1][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-docker-rancher-client1.mon.keyring
    [docker-rancher-client1][DEBUG ] create a done file to avoid re-doing the mon deployment
    [docker-rancher-client1][DEBUG ] create the init path if it does not exist
    [docker-rancher-client1][INFO  ] Running command: sudo systemctl enable ceph.target
    [docker-rancher-client1][INFO  ] Running command: sudo systemctl enable ceph-mon@docker-rancher-client1
    [docker-rancher-client1][WARNIN] Created symlink from /etc/systemd/system/ceph-mon.target.wants/ceph-mon@docker-rancher-client1.service to /usr/lib/systemd/system/ceph-mon@.service.
    [docker-rancher-client1][INFO  ] Running command: sudo systemctl start ceph-mon@docker-rancher-client1
    [docker-rancher-client1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-client1.asok mon_status
    [docker-rancher-client1][DEBUG ] ********************************************************************************
    [docker-rancher-client1][DEBUG ] status for monitor: mon.docker-rancher-client1
    [docker-rancher-client1][DEBUG ] {
    [docker-rancher-client1][DEBUG ]   "election_epoch": 2, 
    [docker-rancher-client1][DEBUG ]   "extra_probe_peers": [
    [docker-rancher-client1][DEBUG ]     "10.142.246.2:6789/0"
    [docker-rancher-client1][DEBUG ]   ], 
    [docker-rancher-client1][DEBUG ]   "monmap": {
    [docker-rancher-client1][DEBUG ]     "created": "2016-11-28 12:38:30.861132", 
    [docker-rancher-client1][DEBUG ]     "epoch": 1, 
    [docker-rancher-client1][DEBUG ]     "fsid": "ef81681c-ee15-412e-a752-2c3e87b9e369", 
    [docker-rancher-client1][DEBUG ]     "modified": "2016-11-28 12:38:30.861132", 
    [docker-rancher-client1][DEBUG ]     "mons": [
    [docker-rancher-client1][DEBUG ]       {
    [docker-rancher-client1][DEBUG ]         "addr": "10.142.246.2:6789/0", 
    [docker-rancher-client1][DEBUG ]         "name": "docker-rancher-server", 
    [docker-rancher-client1][DEBUG ]         "rank": 0
    [docker-rancher-client1][DEBUG ]       }, 
    [docker-rancher-client1][DEBUG ]       {
    [docker-rancher-client1][DEBUG ]         "addr": "10.142.246.3:6789/0", 
    [docker-rancher-client1][DEBUG ]         "name": "docker-rancher-client1", 
    [docker-rancher-client1][DEBUG ]         "rank": 1
    [docker-rancher-client1][DEBUG ]       }
    [docker-rancher-client1][DEBUG ]     ]
    [docker-rancher-client1][DEBUG ]   }, 
    [docker-rancher-client1][DEBUG ]   "name": "docker-rancher-client1", 
    [docker-rancher-client1][DEBUG ]   "outside_quorum": [
    [docker-rancher-client1][DEBUG ]     "docker-rancher-client1"
    [docker-rancher-client1][DEBUG ]   ], 
    [docker-rancher-client1][DEBUG ]   "quorum": [], 
    [docker-rancher-client1][DEBUG ]   "rank": 1, 
    [docker-rancher-client1][DEBUG ]   "state": "probing", 
    [docker-rancher-client1][DEBUG ]   "sync_provider": []
    [docker-rancher-client1][DEBUG ] }
    [docker-rancher-client1][DEBUG ] ********************************************************************************
    [docker-rancher-client1][INFO  ] monitor: mon.docker-rancher-client1 is running
    [docker-rancher-client1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-client1.asok mon_status
    [ceph_deploy.mon][INFO  ] processing monitor mon.docker-rancher-server
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [docker-rancher-server][DEBUG ] find the location of an executable
    [docker-rancher-server][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
    [ceph_deploy.mon][INFO  ] mon.docker-rancher-server monitor has reached quorum!
    [ceph_deploy.mon][INFO  ] processing monitor mon.docker-rancher-client1
    [docker-rancher-client1][DEBUG ] connection detected need for sudo
    [docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1 
    [docker-rancher-client1][DEBUG ] detect platform information from remote host
    [docker-rancher-client1][DEBUG ] detect machine type
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [docker-rancher-client1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-client1.asok mon_status
    [ceph_deploy.mon][INFO  ] mon.docker-rancher-client1 monitor has reached quorum!
    [ceph_deploy.mon][INFO  ] all initial monitors are running and have formed quorum
    [ceph_deploy.mon][INFO  ] Running gatherkeys...
    [ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmpCsnUv3
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [docker-rancher-server][DEBUG ] get remote short hostname
    [docker-rancher-server][DEBUG ] fetch remote file
    [docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
    [docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.admin
    [docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.bootstrap-mds
    [docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.bootstrap-osd
    [docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.bootstrap-rgw
    [ceph_deploy.gatherkeys][INFO  ] Storing ceph.client.admin.keyring
    [ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mds.keyring
    [ceph_deploy.gatherkeys][INFO  ] keyring 'ceph.mon.keyring' already exists
    [ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-osd.keyring
    [ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-rgw.keyring
    [ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpCsnUv3
    

    Verify: the following files should have been generated.

    [op@docker-rancher-server ceph]$ ll
    总用量 472
    -rw------- 1 op op    113 11月 28 12:38 ceph.bootstrap-mds.keyring
    -rw------- 1 op op    113 11月 28 12:38 ceph.bootstrap-osd.keyring
    -rw------- 1 op op    113 11月 28 12:38 ceph.bootstrap-rgw.keyring
    -rw------- 1 op op    129 11月 28 12:38 ceph.client.admin.keyring
    -rw-rw-r-- 1 op op    302 11月 25 15:32 ceph.conf
    -rw-rw-r-- 1 op op 422039 11月 28 12:38 ceph-deploy-ceph.log
    -rw------- 1 op op     73 11月 25 15:31 ceph.mon.keyring
    

    3. Add OSDs

    Because of the particular constraints of the test environment, this installation uses a directory (on the mounted data disk) as the OSD location for now; in production, whole disks should be used instead (a sketch of the disk-based commands is shown below).
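
    For reference, the disk-based flow with ceph-deploy would look roughly like this (a sketch only; /dev/sdb is a hypothetical device name that must be adjusted for each node, and disk zap destroys all data on that disk):

    ceph-deploy disk zap docker-rancher-server:/dev/sdb
    ceph-deploy osd prepare docker-rancher-server:/dev/sdb
    ceph-deploy osd activate docker-rancher-server:/dev/sdb1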

    Preparation

    # Create the directory on all nodes
    [op@docker-rancher-server data]$  sudo mkdir -p /data/ceph
    # Change the ownership, otherwise errors will occur later
    [op@docker-rancher-server data]$  sudo chown -R ceph:ceph /data/ceph
    

    Add the OSDs

    [op@docker-rancher-server ceph]$ ceph-deploy osd prepare  docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
    [ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy osd prepare docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
    [ceph_deploy.cli][INFO  ] ceph-deploy options:
    [ceph_deploy.cli][INFO  ]  username                      : None
    [ceph_deploy.cli][INFO  ]  disk                          : [('docker-rancher-server', '/data/ceph', None), ('docker-rancher-client1', '/data/ceph', None), ('docker-rancher-client2', '/data/ceph', None), ('hub.chinatelecom.cn', '/data/ceph', None)]
    [ceph_deploy.cli][INFO  ]  dmcrypt                       : False
    [ceph_deploy.cli][INFO  ]  verbose                       : False
    [ceph_deploy.cli][INFO  ]  bluestore                     : None
    [ceph_deploy.cli][INFO  ]  overwrite_conf                : False
    [ceph_deploy.cli][INFO  ]  subcommand                    : prepare
    [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
    [ceph_deploy.cli][INFO  ]  quiet                         : False
    [ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x15e97a0>
    [ceph_deploy.cli][INFO  ]  cluster                       : ceph
    [ceph_deploy.cli][INFO  ]  fs_type                       : xfs
    [ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x15dba28>
    [ceph_deploy.cli][INFO  ]  ceph_conf                     : None
    [ceph_deploy.cli][INFO  ]  default_release               : False
    [ceph_deploy.cli][INFO  ]  zap_disk                      : False
    [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks docker-rancher-server:/data/ceph: docker-rancher-client1:/data/ceph: docker-rancher-client2:/data/ceph: hub.chinatelecom.cn:/data/ceph:
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [docker-rancher-server][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] Deploying osd to docker-rancher-server
    [docker-rancher-server][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [ceph_deploy.osd][DEBUG ] Preparing host docker-rancher-server disk /data/ceph journal None activate False
    [docker-rancher-server][DEBUG ] find the location of an executable
    [docker-rancher-server][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
    [docker-rancher-server][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
    [docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.68575.tmp
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.68575.tmp
    [docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.68575.tmp
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.68575.tmp
    [docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.68575.tmp
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.68575.tmp
    [docker-rancher-server][INFO  ] checking OSD status...
    [docker-rancher-server][DEBUG ] find the location of an executable
    [docker-rancher-server][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [ceph_deploy.osd][DEBUG ] Host docker-rancher-server is now ready for osd use.
    [docker-rancher-client1][DEBUG ] connection detected need for sudo
    [docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1 
    [docker-rancher-client1][DEBUG ] detect platform information from remote host
    [docker-rancher-client1][DEBUG ] detect machine type
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] Deploying osd to docker-rancher-client1
    [docker-rancher-client1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [ceph_deploy.osd][DEBUG ] Preparing host docker-rancher-client1 disk /data/ceph journal None activate False
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [docker-rancher-client1][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
    [docker-rancher-client1][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
    [docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.31263.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.31263.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.31263.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.31263.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.31263.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.31263.tmp
    [docker-rancher-client1][INFO  ] checking OSD status...
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [docker-rancher-client1][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [ceph_deploy.osd][DEBUG ] Host docker-rancher-client1 is now ready for osd use.
    [docker-rancher-client2][DEBUG ] connection detected need for sudo
    [docker-rancher-client2][DEBUG ] connected to host: docker-rancher-client2 
    [docker-rancher-client2][DEBUG ] detect platform information from remote host
    [docker-rancher-client2][DEBUG ] detect machine type
    [docker-rancher-client2][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] Deploying osd to docker-rancher-client2
    [docker-rancher-client2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [docker-rancher-client2][WARNIN] osd keyring does not exist yet, creating one
    [docker-rancher-client2][DEBUG ] create a keyring file
    [ceph_deploy.osd][DEBUG ] Preparing host docker-rancher-client2 disk /data/ceph journal None activate False
    [docker-rancher-client2][DEBUG ] find the location of an executable
    [docker-rancher-client2][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
    [docker-rancher-client2][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
    [docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.101240.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.101240.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.101240.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.101240.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.101240.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.101240.tmp
    [docker-rancher-client2][INFO  ] checking OSD status...
    [docker-rancher-client2][DEBUG ] find the location of an executable
    [docker-rancher-client2][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [ceph_deploy.osd][DEBUG ] Host docker-rancher-client2 is now ready for osd use.
    [hub.chinatelecom.cn][DEBUG ] connection detected need for sudo
    [hub.chinatelecom.cn][DEBUG ] connected to host: hub.chinatelecom.cn 
    [hub.chinatelecom.cn][DEBUG ] detect platform information from remote host
    [hub.chinatelecom.cn][DEBUG ] detect machine type
    [hub.chinatelecom.cn][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] Deploying osd to hub.chinatelecom.cn
    [hub.chinatelecom.cn][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [hub.chinatelecom.cn][WARNIN] osd keyring does not exist yet, creating one
    [hub.chinatelecom.cn][DEBUG ] create a keyring file
    [ceph_deploy.osd][DEBUG ] Preparing host hub.chinatelecom.cn disk /data/ceph journal None activate False
    [hub.chinatelecom.cn][DEBUG ] find the location of an executable
    [hub.chinatelecom.cn][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
    [hub.chinatelecom.cn][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
    [hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.31875.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.31875.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.31875.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.31875.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.31875.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.31875.tmp
    [hub.chinatelecom.cn][INFO  ] checking OSD status...
    [hub.chinatelecom.cn][DEBUG ] find the location of an executable
    [hub.chinatelecom.cn][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [ceph_deploy.osd][DEBUG ] Host hub.chinatelecom.cn is now ready for osd use.
    

    Activate the OSDs

    [op@docker-rancher-server ceph]$  ceph-deploy osd activate  docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
    [ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy osd activate docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
    [ceph_deploy.cli][INFO  ] ceph-deploy options:
    [ceph_deploy.cli][INFO  ]  username                      : None
    [ceph_deploy.cli][INFO  ]  verbose                       : False
    [ceph_deploy.cli][INFO  ]  overwrite_conf                : False
    [ceph_deploy.cli][INFO  ]  subcommand                    : activate
    [ceph_deploy.cli][INFO  ]  quiet                         : False
    [ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x28e87a0>
    [ceph_deploy.cli][INFO  ]  cluster                       : ceph
    [ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x28daa28>
    [ceph_deploy.cli][INFO  ]  ceph_conf                     : None
    [ceph_deploy.cli][INFO  ]  default_release               : False
    [ceph_deploy.cli][INFO  ]  disk                          : [('docker-rancher-server', '/data/ceph', None), ('docker-rancher-client1', '/data/ceph', None), ('docker-rancher-client2', '/data/ceph', None), ('hub.chinatelecom.cn', '/data/ceph', None)]
    [ceph_deploy.osd][DEBUG ] Activating cluster ceph disks docker-rancher-server:/data/ceph: docker-rancher-client1:/data/ceph: docker-rancher-client2:/data/ceph: hub.chinatelecom.cn:/data/ceph:
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [docker-rancher-server][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] activating host docker-rancher-server disk /data/ceph
    [ceph_deploy.osd][DEBUG ] will use init type: systemd
    [docker-rancher-server][DEBUG ] find the location of an executable
    [docker-rancher-server][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
    [docker-rancher-server][WARNIN] main_activate: path = /data/ceph
    [docker-rancher-server][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [docker-rancher-server][WARNIN] activate: Cluster name is ceph
    [docker-rancher-server][WARNIN] activate: OSD uuid is ad4397de-63cf-4d7d-84ce-947450b4780d
    [docker-rancher-server][WARNIN] allocate_osd_id: Allocating OSD id...
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise ad4397de-63cf-4d7d-84ce-947450b4780d
    [docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.69785.tmp
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.69785.tmp
    [docker-rancher-server][WARNIN] activate: OSD id is 0
    [docker-rancher-server][WARNIN] activate: Initializing OSD...
    [docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
    [docker-rancher-server][WARNIN] got monmap epoch 1
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 0 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid ad4397de-63cf-4d7d-84ce-947450b4780d --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
    [docker-rancher-server][WARNIN] activate: Marking with init system systemd
    [docker-rancher-server][WARNIN] activate: Authorizing OSD key...
    [docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.0 -i /data/ceph/keyring osd allow * mon allow profile osd
    [docker-rancher-server][WARNIN] added key for osd.0
    [docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.69785.tmp
    [docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.69785.tmp
    [docker-rancher-server][WARNIN] activate: ceph osd.0 data dir is ready at /data/ceph
    [docker-rancher-server][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-0 -> /data/ceph
    [docker-rancher-server][WARNIN] start_daemon: Starting ceph osd.0...
    [docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@0
    [docker-rancher-server][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.
    [docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@0
    [docker-rancher-server][INFO  ] checking OSD status...
    [docker-rancher-server][DEBUG ] find the location of an executable
    [docker-rancher-server][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [docker-rancher-server][WARNIN] there is 1 OSD down
    [docker-rancher-server][WARNIN] there is 1 OSD out
    [docker-rancher-server][INFO  ] Running command: sudo systemctl enable ceph.target
    [docker-rancher-client1][DEBUG ] connection detected need for sudo
    [docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1 
    [docker-rancher-client1][DEBUG ] detect platform information from remote host
    [docker-rancher-client1][DEBUG ] detect machine type
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] activating host docker-rancher-client1 disk /data/ceph
    [ceph_deploy.osd][DEBUG ] will use init type: systemd
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [docker-rancher-client1][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
    [docker-rancher-client1][WARNIN] main_activate: path = /data/ceph
    [docker-rancher-client1][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [docker-rancher-client1][WARNIN] activate: Cluster name is ceph
    [docker-rancher-client1][WARNIN] activate: OSD uuid is d7160d58-ff8d-4779-a1b6-cb3a1f645c96
    [docker-rancher-client1][WARNIN] allocate_osd_id: Allocating OSD id...
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise d7160d58-ff8d-4779-a1b6-cb3a1f645c96
    [docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.32435.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.32435.tmp
    [docker-rancher-client1][WARNIN] activate: OSD id is 1
    [docker-rancher-client1][WARNIN] activate: Initializing OSD...
    [docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
    [docker-rancher-client1][WARNIN] got monmap epoch 1
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 1 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid d7160d58-ff8d-4779-a1b6-cb3a1f645c96 --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
    [docker-rancher-client1][WARNIN] activate: Marking with init system systemd
    [docker-rancher-client1][WARNIN] activate: Authorizing OSD key...
    [docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.1 -i /data/ceph/keyring osd allow * mon allow profile osd
    [docker-rancher-client1][WARNIN] added key for osd.1
    [docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.32435.tmp
    [docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.32435.tmp
    [docker-rancher-client1][WARNIN] activate: ceph osd.1 data dir is ready at /data/ceph
    [docker-rancher-client1][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-1 -> /data/ceph
    [docker-rancher-client1][WARNIN] start_daemon: Starting ceph osd.1...
    [docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@1
    [docker-rancher-client1][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@1.service to /usr/lib/systemd/system/ceph-osd@.service.
    [docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@1
    [docker-rancher-client1][INFO  ] checking OSD status...
    [docker-rancher-client1][DEBUG ] find the location of an executable
    [docker-rancher-client1][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [docker-rancher-client1][WARNIN] there are 2 OSDs down
    [docker-rancher-client1][WARNIN] there are 2 OSDs out
    [docker-rancher-client1][INFO  ] Running command: sudo systemctl enable ceph.target
    [docker-rancher-client2][DEBUG ] connection detected need for sudo
    [docker-rancher-client2][DEBUG ] connected to host: docker-rancher-client2 
    [docker-rancher-client2][DEBUG ] detect platform information from remote host
    [docker-rancher-client2][DEBUG ] detect machine type
    [docker-rancher-client2][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] activating host docker-rancher-client2 disk /data/ceph
    [ceph_deploy.osd][DEBUG ] will use init type: systemd
    [docker-rancher-client2][DEBUG ] find the location of an executable
    [docker-rancher-client2][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
    [docker-rancher-client2][WARNIN] main_activate: path = /data/ceph
    [docker-rancher-client2][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [docker-rancher-client2][WARNIN] activate: Cluster name is ceph
    [docker-rancher-client2][WARNIN] activate: OSD uuid is 68cf33ef-d805-41df-8683-cd6ff94c8f18
    [docker-rancher-client2][WARNIN] allocate_osd_id: Allocating OSD id...
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 68cf33ef-d805-41df-8683-cd6ff94c8f18
    [docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.102520.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.102520.tmp
    [docker-rancher-client2][WARNIN] activate: OSD id is 2
    [docker-rancher-client2][WARNIN] activate: Initializing OSD...
    [docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
    [docker-rancher-client2][WARNIN] got monmap epoch 1
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 2 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid 68cf33ef-d805-41df-8683-cd6ff94c8f18 --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
    [docker-rancher-client2][WARNIN] activate: Marking with init system systemd
    [docker-rancher-client2][WARNIN] activate: Authorizing OSD key...
    [docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.2 -i /data/ceph/keyring osd allow * mon allow profile osd
    [docker-rancher-client2][WARNIN] added key for osd.2
    [docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.102520.tmp
    [docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.102520.tmp
    [docker-rancher-client2][WARNIN] activate: ceph osd.2 data dir is ready at /data/ceph
    [docker-rancher-client2][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-2 -> /data/ceph
    [docker-rancher-client2][WARNIN] start_daemon: Starting ceph osd.2...
    [docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@2
    [docker-rancher-client2][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@2.service to /usr/lib/systemd/system/ceph-osd@.service.
    [docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@2
    [docker-rancher-client2][INFO  ] checking OSD status...
    [docker-rancher-client2][DEBUG ] find the location of an executable
    [docker-rancher-client2][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [docker-rancher-client2][INFO  ] Running command: sudo systemctl enable ceph.target
    [hub.chinatelecom.cn][DEBUG ] connection detected need for sudo
    [hub.chinatelecom.cn][DEBUG ] connected to host: hub.chinatelecom.cn 
    [hub.chinatelecom.cn][DEBUG ] detect platform information from remote host
    [hub.chinatelecom.cn][DEBUG ] detect machine type
    [hub.chinatelecom.cn][DEBUG ] find the location of an executable
    [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
    [ceph_deploy.osd][DEBUG ] activating host hub.chinatelecom.cn disk /data/ceph
    [ceph_deploy.osd][DEBUG ] will use init type: systemd
    [hub.chinatelecom.cn][DEBUG ] find the location of an executable
    [hub.chinatelecom.cn][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
    [hub.chinatelecom.cn][WARNIN] main_activate: path = /data/ceph
    [hub.chinatelecom.cn][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
    [hub.chinatelecom.cn][WARNIN] activate: Cluster name is ceph
    [hub.chinatelecom.cn][WARNIN] activate: OSD uuid is 5cba26ef-de7d-4ddc-8fcd-f800a84b8255
    [hub.chinatelecom.cn][WARNIN] allocate_osd_id: Allocating OSD id...
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 5cba26ef-de7d-4ddc-8fcd-f800a84b8255
    [hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.33404.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.33404.tmp
    [hub.chinatelecom.cn][WARNIN] activate: OSD id is 3
    [hub.chinatelecom.cn][WARNIN] activate: Initializing OSD...
    [hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
    [hub.chinatelecom.cn][WARNIN] got monmap epoch 1
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 3 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid 5cba26ef-de7d-4ddc-8fcd-f800a84b8255 --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
    [hub.chinatelecom.cn][WARNIN] activate: Marking with init system systemd
    [hub.chinatelecom.cn][WARNIN] activate: Authorizing OSD key...
    [hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.3 -i /data/ceph/keyring osd allow * mon allow profile osd
    [hub.chinatelecom.cn][WARNIN] added key for osd.3
    [hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.33404.tmp
    [hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.33404.tmp
    [hub.chinatelecom.cn][WARNIN] activate: ceph osd.3 data dir is ready at /data/ceph
    [hub.chinatelecom.cn][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-3 -> /data/ceph
    [hub.chinatelecom.cn][WARNIN] start_daemon: Starting ceph osd.3...
    [hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@3
    [hub.chinatelecom.cn][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@3.service to /usr/lib/systemd/system/ceph-osd@.service.
    [hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@3
    [hub.chinatelecom.cn][INFO  ] checking OSD status...
    [hub.chinatelecom.cn][DEBUG ] find the location of an executable
    [hub.chinatelecom.cn][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
    [hub.chinatelecom.cn][INFO  ] Running command: sudo systemctl enable ceph.target
    

    Copy the configuration and admin key

    # Use ceph-deploy to copy the configuration file and admin key to the admin node and the Ceph nodes, so that you no longer have to specify the monitor address and ceph.client.admin.keyring every time you run a Ceph command
    [op@docker-rancher-server ceph]$ ceph-deploy admin docker-rancher-server docker-rancher-client1
    [ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy admin docker-rancher-server docker-rancher-client1
    [ceph_deploy.cli][INFO  ] ceph-deploy options:
    [ceph_deploy.cli][INFO  ]  username                      : None
    [ceph_deploy.cli][INFO  ]  verbose                       : False
    [ceph_deploy.cli][INFO  ]  overwrite_conf                : False
    [ceph_deploy.cli][INFO  ]  quiet                         : False
    [ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x26724d0>
    [ceph_deploy.cli][INFO  ]  cluster                       : ceph
    [ceph_deploy.cli][INFO  ]  client                        : ['docker-rancher-server', 'docker-rancher-client1']
    [ceph_deploy.cli][INFO  ]  func                          : <function admin at 0x7faecb068050>
    [ceph_deploy.cli][INFO  ]  ceph_conf                     : None
    [ceph_deploy.cli][INFO  ]  default_release               : False
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to docker-rancher-server
    [docker-rancher-server][DEBUG ] connection detected need for sudo
    [docker-rancher-server][DEBUG ] connected to host: docker-rancher-server 
    [docker-rancher-server][DEBUG ] detect platform information from remote host
    [docker-rancher-server][DEBUG ] detect machine type
    [docker-rancher-server][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to docker-rancher-client1
    [docker-rancher-client1][DEBUG ] connection detected need for sudo
    [docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1 
    [docker-rancher-client1][DEBUG ] detect platform information from remote host
    [docker-rancher-client1][DEBUG ] detect machine type
    [docker-rancher-client1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    

    Make sure you have read permission on ceph.client.admin.keyring

    # Only needs to be run on the admin host
    sudo chmod +r /etc/ceph/ceph.client.admin.keyring
    

    Check the cluster status

    [op@docker-rancher-server ceph]$ ceph health
    HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive
    # If you see this error, refer to common error 3 at the end of this document
    

    A healthy cluster looks like this:

    [op@docker-rancher-server ceph]$ ceph health
    HEALTH_OK
    [op@docker-rancher-server ceph]$ ceph -s
        cluster ef81681c-ee15-412e-a752-2c3e87b9e369
         health HEALTH_OK
         monmap e1: 2 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-server=10.142.246.2:6789/0}
                election epoch 8, quorum 0,1 docker-rancher-server,docker-rancher-client1
         osdmap e18: 4 osds: 4 up, 4 in
                flags sortbitwise
          pgmap v182: 64 pgs, 1 pools, 0 bytes data, 0 objects
                281 GB used, 3455 GB / 3936 GB avail
                      64 active+clean
                      
    [op@docker-rancher-server ceph]$ ceph osd tree
    ID WEIGHT  TYPE NAME                       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 3.84436 root default                                                      
    -2 0.96109     host docker-rancher-server                                    
     0 0.96109         osd.0                        up  1.00000          1.00000 
    -3 0.96109     host docker-rancher-client1                                   
     1 0.96109         osd.1                        up  1.00000          1.00000 
    -4 0.96109     host docker-rancher-client2                                   
     2 0.96109         osd.2                        up  1.00000          1.00000 
    -5 0.96109     host hub                                                      
     3 0.96109         osd.3                        up  1.00000          1.00000 
    

    4. Add a block device

    After the installation completes there is a default pool named rbd. Since this Ceph cluster is currently only used as backend storage for Docker volumes, the default rbd pool is used directly; if several systems share the cluster in production later, create a dedicated pool for the volumes instead, as sketched below.
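    
    For reference, a minimal sketch of creating such a dedicated pool (the pool name docker-volumes and the placement-group count of 64 are assumptions for illustration, not something configured in this deployment):
    
    # Create a dedicated pool with 64 placement groups (pg_num and pgp_num)
    ceph osd pool create docker-volumes 64 64
    # Images would then be created with --pool docker-volumes instead of the default rbd pool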

    1. Inspect the pools

    Run the following operations on the admin node

    # List the existing pools
    [op@docker-rancher-server ~]$ rados lspools
    rbd
    [op@docker-rancher-server ~]$ ceph df
    GLOBAL:
        SIZE      AVAIL     RAW USED     %RAW USED 
        3936G     3455G         281G          7.14 
    POOLS:
        NAME     ID     USED     %USED     MAX AVAIL     OBJECTS 
        rbd      0         0         0         1072G           0 
    The rbd pool has 1072G usable by default
    If disk space runs short later, you can adjust the pool's size value (its replica count), as done below
    [op@docker-rancher-server ~]$ ceph osd pool set rbd size 2
    set pool 0 size to 2
    [op@docker-rancher-server ~]$  ceph df 
    GLOBAL:
        SIZE      AVAIL     RAW USED     %RAW USED 
        3936G     3455G         281G          7.14 
    POOLS:
        NAME     ID     USED     %USED     MAX AVAIL     OBJECTS 
        rbd      0         0         0         1608G           0 
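    
    The size attribute adjusted above is the pool's replica count (3 on a default Jewel install); lowering it trades redundancy for usable space. A small sketch of checking the current value first:
    
    # Show the current replica count of the rbd pool
    ceph osd pool get rbd size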
    

    Create the block device. Format 2 is recommended so that cloning and snapshots are possible later, but there is a catch: the kernel here is 3.10 and does not support some of the newer format 2 features (see common error 5 below), so format 1 is used here instead.

    2. Create and map the block device

    # Create a 1T image
    [op@docker-rancher-server ceph]$ rbd create docker-volume --size 1T --pool rbd  --image-format 1
    rbd: image format 1 is deprecated
    # Note the warning: format 1 is deprecated
    # For reference, the options of the create command
    [op@docker-rancher-server ceph]$ rbd help create
    usage: rbd create [--pool <pool>] [--image <image>] 
                      [--image-format <image-format>] [--new-format] 
                      [--order <order>] [--object-size <object-size>] 
                      [--image-feature <image-feature>] [--image-shared] 
                      [--stripe-unit <stripe-unit>] 
                      [--stripe-count <stripe-count>] 
                      [--journal-splay-width <journal-splay-width>] 
                      [--journal-object-size <journal-object-size>] 
                      [--journal-pool <journal-pool>] --size <size> 
                      <image-spec> 
    
    Create an empty image.
    
    Positional arguments
      <image-spec>              image specification
                                (example: [<pool-name>/]<image-name>)
    
    Optional arguments
      -p [ --pool ] arg         pool name
      --image arg               image name
      --image-format arg        image format [1 (deprecated) or 2]
      --new-format              use image format 2
                                (deprecated)
      --order arg               object order [12 <= order <= 25]
      --object-size arg         object size in B/K/M [4K <= object size <= 32M]
      --image-feature arg       image features
                                [layering(+), striping, exclusive-lock(+*),
                                object-map(+*), fast-diff(+*), deep-flatten(+-),
                                journaling(*)]
      --image-shared            shared image
      --stripe-unit arg         stripe unit
      --stripe-count arg        stripe count
      --journal-splay-width arg number of active journal objects
      --journal-object-size arg size of journal objects
      --journal-pool arg        pool for journal objects
      -s [ --size ] arg         image size (in M/G/T)
    
    Image Features:
      (*) supports enabling/disabling on existing images
      (-) supports disabling-only on existing images
      (+) enabled by default for new images if features not specified
    
    # Verify
    [op@docker-rancher-server ceph]$ rbd ls
    docker-volume
    # Show details
    [op@docker-rancher-server ceph]$ rbd info docker-volume
    rbd image 'docker-volume':
    	size 1024 GB in 262144 objects
    	order 22 (4096 kB objects)
    	block_name_prefix: rb.0.acb6.2ae8944a
    	format: 1
    
    # Optional step; if the image runs out of space later it can be resized. resize can grow or shrink the image, depending on your needs.
    rbd resize docker-volume --size <new-size>
    
    # Map the block device. Note that this must be run with sudo, otherwise it fails because it cannot write to sysfs.
    [op@docker-rancher-server ceph]$ sudo rbd map docker-volume --pool rbd --id admin
    /dev/rbd0
    
    # Verify the mapping
    [op@docker-rancher-server ceph]$ rbd showmapped
    id pool image         snap device    
    0  rbd  docker-volume -    /dev/rbd0 
    
    # Note: to unmap the device:
    rbd unmap /dev/rbd/{pool-name}/{image-name}
    rbd unmap /dev/rbd/rbd/docker-volume
    # or simply
    rbd unmap /dev/rbd0
    

    3. Use the block device

    # First put a filesystem on the device
    [op@docker-rancher-server ceph]$ sudo mkfs.ext4 -q /dev/rbd0
    
    # Create the mount point
    [root@docker-rancher-server ~]# sudo mkdir /ceph-rbd
    
    # Mount it
    [op@docker-rancher-server ceph]$ sudo mount -t ext4 /dev/rbd0 /ceph-rbd
    
    # Configure mounting at boot
    1. Edit the ceph rbdmap file so the image is mapped automatically
    [op@docker-rancher-server ~]$ sudo vim /etc/ceph/rbdmap 
    # RbdDevice             Parameters
    #poolname/imagename     id=client,keyring=/etc/ceph/ceph.client.keyring
    rbd/docker-volume       id=admin,keyring=/etc/ceph/ceph.client.admin.keyring
    
    2. Add a mount entry to fstab so it is mounted at boot
    vim /etc/fstab
    /dev/rbd0                   /ceph-rbd   ext4  defaults        1 2
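    
    # (Assumption, worth verifying in your environment) the fstab entry above only works if /dev/rbd0
    # already exists at boot; the rbdmap systemd unit shipped with ceph-common reads /etc/ceph/rbdmap
    # and maps the listed images at startup, so enable it as well:
    sudo systemctl enable rbdmap.service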
    
    
    # Verify with df -h
    Filesystem                  Size  Used Avail Use% Mounted on
    /dev/mapper/centos-root      45G  2.9G   42G   7% /
    devtmpfs                     16G     0   16G   0% /dev
    tmpfs                        16G     0   16G   0% /dev/shm
    tmpfs                        16G  1.5G   15G  10% /run
    tmpfs                        16G     0   16G   0% /sys/fs/cgroup
    /dev/mapper/datavg-lv_data  985G   62G  873G   7% /data
    /dev/xvda1                  497M  135M  362M  28% /boot
    10.142.246.2:/data/nfs      985G   62G  873G   7% /var/lib/rancher/convoy/convoy-nfs-85120bf6-2d8d-44e1-b868-bde8284a3b4c/mnt
    tmpfs                       3.1G     0  3.1G   0% /run/user/0
    /dev/rbd0                  1008G   77M  957G   1% /ceph-rbd
    
    # You can now create files or anything else under /ceph-rbd
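    
    As noted earlier, the image can also be grown later. A minimal sketch of resizing it together with the ext4 filesystem on it (the 2T target is only an example; shrinking would require shrinking the filesystem first and is risky):
    
    # Grow the rbd image to 2T
    rbd resize docker-volume --size 2T
    # Grow the mounted ext4 filesystem to fill the new size (ext4 supports online growth)
    sudo resize2fs /dev/rbd0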
    
    

    4. Removing the block device later

    # 1. Unmount it
    umount /ceph-rbd
    # 2. First remove the auto-mount entries added to fstab and rbdmap, otherwise the machine will fail to come up on the next boot.
    # Note: also make sure the ceph services are enabled at boot, otherwise booting will hang as well
    
    # 3. Unmap the device
    rbd unmap /dev/rbd0
    # Verify
    rbd showmapped
    # 4. Delete the corresponding image
    rbd rm docker-volume
    

    Appendix

    1. Add a mon node

    Only two mons were configured above; a production environment should run three or more mon nodes.

    # Edit the configuration file and add the new mon
    [op@docker-rancher-server ceph]$ vim ceph.conf 
    mon_initial_members = docker-rancher-server, docker-rancher-client1, docker-rancher-client2
    mon_host = 10.142.246.2,10.142.246.3,10.142.246.4
    
    # Create the mon
    [op@docker-rancher-server ceph]$ ceph-deploy --overwrite-conf   mon  create docker-rancher-client2
    
    # Push the updated configuration to the other nodes
    [op@docker-rancher-server ceph]$ ceph-deploy  --overwrite-conf config push docker-rancher-server
    [op@docker-rancher-server ceph]$ ceph-deploy  --overwrite-conf config push docker-rancher-client1
    # Check the result
    [op@docker-rancher-server ceph]$ ceph -s
        cluster ef81681c-ee15-412e-a752-2c3e87b9e369
         health HEALTH_OK
         monmap e2: 3 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-client2=10.142.246.4:6789/0,docker-rancher-server=10.142.246.2:6789/0}
                election epoch 10, quorum 0,1,2 docker-rancher-server,docker-rancher-client1,docker-rancher-client2
         osdmap e28: 4 osds: 4 up, 4 in
                flags sortbitwise
          pgmap v10836: 64 pgs, 1 pools, 0 bytes data, 0 objects
                281 GB used, 3454 GB / 3936 GB avail
                      64 active+clean
    
    You can see there are now 3 mons
    

    2. Remove a mon node

    # First edit ceph.conf and remove the corresponding mon
    
    # Then push the updated config to the remaining mon nodes
    
    # Run the removal
    ceph-deploy mon destroy  $HOSTNAME
    
    # Check
    ceph -s
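    
    For example, to take docker-rancher-client2 back out of the quorum (a sketch following the steps above; always keep at least one healthy mon):
    
    # 1. Remove the host from mon_initial_members / mon_host in ceph.conf, then push the config
    ceph-deploy --overwrite-conf config push docker-rancher-server docker-rancher-client1
    # 2. Destroy the monitor on that host
    ceph-deploy mon destroy docker-rancher-client2
    # 3. Verify
    ceph -s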
    

    3. Add and remove OSDs

    1. Add an OSD

    # The procedure is the same as in the initial deployment
    # A directory-backed OSD is used as the example here; a concrete sketch follows after this list
    # 1. First create the directory and change its ownership to the ceph user
    # 2. ceph-deploy osd prepare $hostname:directory
    # 3.  ceph-deploy osd activate $hostname:directory
    # 4. Check with ceph -s and ceph osd tree
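    
    A concrete sketch of those steps, assuming a new host named docker-rancher-client3 with a data directory /data/ceph (both names are hypothetical):
    
    # On the new host: create the directory and hand it over to the ceph user
    sudo mkdir -p /data/ceph && sudo chown ceph:ceph /data/ceph
    # On the admin node: prepare and activate the directory as an OSD
    ceph-deploy osd prepare docker-rancher-client3:/data/ceph
    ceph-deploy osd activate docker-rancher-client3:/data/ceph
    # Verify
    ceph -s
    ceph osd tree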
    

    2. Remove an OSD

    1. Stop the daemon
    # Check the current OSDs first
    ceph osd tree
    
    2. See the official documentation for the remaining removal steps; a hedged sketch follows below
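    
    A hedged sketch of the usual removal flow, using osd.3 as an example (check the official documentation for your release before running it):
    
    # 1. Take the OSD out of the cluster and wait for rebalancing to finish
    ceph osd out 3
    # 2. Stop the daemon on the host that carries it
    sudo systemctl stop ceph-osd@3
    # 3. Remove it from the CRUSH map, delete its key and remove the OSD entry
    ceph osd crush remove osd.3
    ceph auth del osd.3
    ceph osd rm 3
    # 4. Verify
    ceph osd tree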

    Testing block device read/write speed

    1. Test the Linux system disk read/write speed

    # Write test
    [root@docker-rancher-server ~]# time dd if=/dev/zero of=/test.dbf bs=8k count=300000
    300000+0 records in
    300000+0 records out
    2457600000 bytes (2.5 GB) copied, 2.60019 s, 945 MB/s
    
    real	0m2.602s
    user	0m0.054s
    sys	0m2.542s
    # Read test
    [root@docker-rancher-server ~]# time dd if=/test.dbf of=/dev/null bs=8k count=300000 
    300000+0 records in
    300000+0 records out
    2457600000 bytes (2.5 GB) copied, 0.804974 s, 3.1 GB/s
    
    real	0m0.806s
    user	0m0.028s
    sys	0m0.778s
    

    2. Test the mounted /data data disk read/write speed

    # Write test
    [root@docker-rancher-server ceph-rbd]# time dd if=/dev/zero of=/data/test.dbf bs=8k count=300000
    300000+0 records in
    300000+0 records out
    2457600000 bytes (2.5 GB) copied, 3.38757 s, 725 MB/s
    
    real	0m3.407s
    user	0m0.053s
    sys	0m3.248s
    # Read test
    [root@docker-rancher-server ~]# time dd if=/data/test.dbf of=/dev/null bs=8k count=300000 
    300000+0 records in
    300000+0 records out
    2457600000 bytes (2.5 GB) copied, 0.899513 s, 2.7 GB/s
    
    real	0m0.901s
    user	0m0.029s
    sys	0m0.872s
    

    3. Test the Ceph block device read/write speed

    # Write test
    [root@docker-rancher-server ceph-rbd]# time dd if=/dev/zero of=/ceph-rbd/test.dbf bs=8k count=300000
    300000+0 records in
    300000+0 records out
    2457600000 bytes (2.5 GB) copied, 3.31538 s, 741 MB/s
    
    real	0m3.335s
    user	0m0.060s
    sys	0m3.253s
    # Read test
    [root@docker-rancher-server ceph-rbd]# time dd if=/ceph-rbd/test.dbf of=/dev/null bs=8k count=300000
    300000+0 records in
    300000+0 records out
    2457600000 bytes (2.5 GB) copied, 0.963309 s, 2.6 GB/s
    
    real	0m0.965s
    user	0m0.024s
    sys	0m0.938s
    

    Because of the test environment and other factors, these numbers do not fully reflect Ceph block storage performance. Roughly speaking, the read/write speed is about the same as that of the data disks attached from the Huawei cloud platform, which is not surprising since this Ceph cluster is built on those very disks; it should be re-tested in the real production environment. What can already be seen is that in this experiment the network is not a significant limiting factor for read/write performance. Note also that dd against the page cache overstates real throughput; see the sketch below.
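    
    For reference, a sketch of a dd run that takes the page cache out of the picture (not executed in this test):
    
    # Force writes to storage instead of the page cache
    dd if=/dev/zero of=/ceph-rbd/test.dbf bs=8k count=300000 oflag=direct
    # Or flush everything to disk before dd reports the rate
    dd if=/dev/zero of=/ceph-rbd/test.dbf bs=8k count=300000 conv=fdatasync
    # Drop the cache before the read test so it actually hits the OSDs
    echo 3 | sudo tee /proc/sys/vm/drop_caches
    dd if=/ceph-rbd/test.dbf of=/dev/null bs=8k count=300000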

    Common errors

    1. ceph-deploy new fails when creating the monitors

    # The error output
    [op@docker-rancher-server ceph]$ ceph-deploy new docker-rancher-server docker-rancher-client1
    Traceback (most recent call last):
      File "/usr/bin/ceph-deploy", line 18, in <module>
        from ceph_deploy.cli import main
    ImportError: No module named ceph_deploy.cli
    

    Cause: the default Python version on CentOS 7 had been upgraded earlier, which breaks ceph-deploy. The fix is to edit ceph-deploy so that it points to the system python2.7 interpreter.

    Fix:

    [op@docker-rancher-server ceph]$ sudo vim /usr/bin/ceph-deploy
    change  #!/usr/bin/env python
    to      #!/usr/bin/python2.7
    

    2. ceph-deploy fails because it cannot download packages from the official Ceph site

    [op@docker-rancher-server ceph]$ ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
    ......
    [docker-rancher-server][DEBUG ] 完毕!
    [docker-rancher-server][DEBUG ] Configure Yum priorities to include obsoletes
    [docker-rancher-server][WARNIN] check_obsoletes has been enabled for Yum priorities plugin
    [docker-rancher-server][INFO  ] Running command: sudo rpm --import https://download.ceph.com/keys/release.asc
    [docker-rancher-server][WARNIN] curl: (6) Could not resolve host: download.ceph.com; 未知的名称或服务
    [docker-rancher-server][WARNIN] 错误:https://download.ceph.com/keys/release.asc: import read failed(2).
    [docker-rancher-server][ERROR ] RuntimeError: command returned non-zero exit status: 1
    [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm --import https://download.ceph.com/keys/release.asc
    

    Cause: the machines sit on an internal network without Internet access, so packages cannot be downloaded from the official site.

    Fix: change the download URLs inside ceph-deploy to point at your own Ceph mirror.

    Three files need to be modified

    # 1. Edit the first file
    [op@docker-rancher-server ~]$ cd /usr/lib/python2.7/site-packages/ceph_deploy/hosts/centos/
    [op@docker-rancher-server centos]$ sudo vim install.py
     79 #            remoto.process.run(
     80 #                distro.conn,
     81 #                [
     82 #                    'rpm',
     83 #                    '-Uvh',
     84 #                    '--replacepkgs',
     85 #                    '{url}noarch/ceph-release-1-0.{dist}.noarch.rpm'.format(url=url, dist=dist),
     86 #                ],
     87 #           )
    
     # 2. Edit the second file
    [op@docker-rancher-server centos]$ cd /usr/lib/python2.7/site-packages/ceph_deploy/util/
    [op@docker-rancher-server util]$ sudo vim constants.py
    # Change this to your own keys URL
    32 gpg_key_base_url = "10.142.78.40/ceph/keys/"
    
    # 3. Edit the third file
    [op@docker-rancher-server util]$ cd /usr/lib/python2.7/site-packages/ceph_deploy/util/paths/
    [op@docker-rancher-server paths]$ sudo vim gpg.py
    # Change https to http
     3 def url(key_type, protocol="http"):
      4     return "{protocol}://{url}{key_type}.asc".format(
      5         protocol=protocol,
      6         url=constants.gpg_key_base_url,
      7         key_type=key_type
      8     )
     
    # Sync the three files to all the other nodes
    #! /bin/bash
    set -ex
    hosts="docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn"
    
    file="/usr/lib/python2.7/site-packages/ceph_deploy/hosts/centos/install.py /usr/lib/python2.7/site-packages/ceph_deploy/util/constants.py /usr/lib/python2.7/site-packages/ceph_deploy/util/paths/gpg.py"
    
    destinationDirectory="~"
    
    for i in $hosts
    do
        scp $file $i:$destinationDirectory
        ssh $i sudo mv $destinationDirectory/install.py /usr/lib/python2.7/site-packages/ceph_deploy/hosts/centos/
        ssh $i sudo mv $destinationDirectory/constants.py  /usr/lib/python2.7/site-packages/ceph_deploy/util/
        ssh $i sudo mv $destinationDirectory/gpg.py  /usr/lib/python2.7/site-packages/ceph_deploy/util/paths/
    done
    

    Reference for this modification:

    Install Ceph from a local repository

    [op@docker-rancher-server ceph]$ ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
    ......
    [docker-rancher-server][DEBUG ] ---> 软件包 spax.x86_64.0.1.5.2-13.el7 将被 安装
    [docker-rancher-server][DEBUG ] ---> 软件包 time.x86_64.0.1.7-45.el7 将被 安装
    [docker-rancher-server][DEBUG ] --> 解决依赖关系完成
    [docker-rancher-server][DEBUG ]  您可以尝试添加 --skip-broken 选项来解决该问题
    [docker-rancher-server][WARNIN] 错误:软件包:1:ceph-selinux-10.2.3-0.el7.x86_64 (ceph)
    [docker-rancher-server][WARNIN]           需要:selinux-policy-base >= 3.13.1-60.el7_2.7
    [docker-rancher-server][WARNIN]           已安装: selinux-policy-targeted-3.13.1-23.el7.noarch (@anaconda)
    [docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-23.el7
    [docker-rancher-server][WARNIN]           可用: selinux-policy-minimum-3.13.1-23.el7.noarch (base)
    [docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-23.el7
    [docker-rancher-server][WARNIN]           可用: selinux-policy-minimum-3.13.1-60.el7.noarch (updates)
    [docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-60.el7
    [docker-rancher-server][WARNIN]           可用: selinux-policy-mls-3.13.1-23.el7.noarch (base)
    [docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-23.el7
    [docker-rancher-server][WARNIN]           可用: selinux-policy-mls-3.13.1-60.el7.noarch (updates)
    [docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-60.el7
    [docker-rancher-server][WARNIN]           可用: selinux-policy-targeted-3.13.1-60.el7.noarch (updates)
    [docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-60.el7
    [docker-rancher-server][WARNIN] 错误:软件包:1:python-flask-0.10.1-3.el7.noarch (epel)
    [docker-rancher-server][WARNIN]           需要:python-itsdangerous
    [docker-rancher-server][DEBUG ]  您可以尝试执行:rpm -Va --nofiles --nodigest
    [docker-rancher-server][ERROR ] RuntimeError: command returned non-zero exit status: 1
    [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y install ceph ceph-radosgw
    

    This is caused by the selinux-policy-targeted version in the yum repository being too old; download the corresponding packages and install them locally:

    selinux-policy-3.13.1-60.el7_2.7.noarch.rpm

    selinux-policy-targeted-3.13.1-60.el7_2.7.noarch.rpm

    # Run on all nodes
    [op@docker-rancher-server ~]$ sudo yum localinstall selinux-policy-3.13.1-60.el7_2.7.noarch.rpm 
    [op@docker-rancher-server ~]$ sudo yum localinstall selinux-policy-targeted-3.13.1-60.el7_2.7.noarch.rpm
    

    3. Ceph health check shows no OSDs up

    [op@docker-rancher-server ceph]$ ceph health
    HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive
    [op@docker-rancher-server ceph]$ ceph -s
        cluster ef81681c-ee15-412e-a752-2c3e87b9e369
         health HEALTH_ERR
                64 pgs are stuck inactive for more than 300 seconds
                64 pgs stuck inactive
         monmap e1: 2 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-server=10.142.246.2:6789/0}
                election epoch 8, quorum 0,1 docker-rancher-server,docker-rancher-client1
         osdmap e9: 4 osds: 0 up, 0 in
                flags sortbitwise
          pgmap v10: 64 pgs, 1 pools, 0 bytes data, 0 objects
                0 kB used, 0 kB / 0 kB avail
                      64 creating
    

    Checking the error log:

    [root@docker-rancher-client2 data]# tail -f /var/log/ceph/ceph-osd.2.log 
    2016-11-28 14:25:03.603389 7f0ccf069800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
    2016-11-28 14:25:03.607758 7f0ccf069800  0 filestore(/var/lib/ceph/osd/ceph-2) limited size xattrs
    2016-11-28 14:25:03.608339 7f0ccf069800  1 leveldb: Recovering log #16
    2016-11-28 14:25:03.613865 7f0ccf069800  1 leveldb: Delete type=0 #16
    
    2016-11-28 14:25:03.613927 7f0ccf069800  1 leveldb: Delete type=3 #15
    
    2016-11-28 14:25:03.614165 7f0ccf069800  0 filestore(/var/lib/ceph/osd/ceph-2) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
    2016-11-28 14:25:03.614331 7f0ccf069800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
    2016-11-28 14:25:03.614341 7f0ccf069800  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 0
    2016-11-28 14:25:03.614665 7f0ccf069800  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 0
    2016-11-28 14:25:03.614947 7f0ccf069800  1 filestore(/var/lib/ceph/osd/ceph-2) upgrade
    2016-11-28 14:25:03.615114 7f0ccf069800 -1 osd.2 0 backend (filestore) is unable to support max object name[space] len
    2016-11-28 14:25:03.615140 7f0ccf069800 -1 osd.2 0    osd max object name len = 2048
    2016-11-28 14:25:03.615142 7f0ccf069800 -1 osd.2 0    osd max object namespace len = 256
    2016-11-28 14:25:03.615144 7f0ccf069800 -1 osd.2 0 (36) File name too long
    2016-11-28 14:25:03.615498 7f0ccf069800  1 journal close /var/lib/ceph/osd/ceph-2/journal
    2016-11-28 14:25:03.616473 7f0ccf069800 -1  ** ERROR: osd init failed: (36) File name too long
    

    The log says the file name is too long. After a round of googling it turned out that the data directory sits on an ext4 filesystem, while xfs is the recommended filesystem on CentOS. Since the disks cannot be reformatted, the workaround is to add parameters to the Ceph configuration that limit the object name length.

    # Note: do this on all four nodes
    [op@docker-rancher-client2 ~]$ sudo vim /etc/ceph/ceph.conf 
    osd max object name len = 256
    osd max object namespace len = 64
    
    # Then restart the osd service
    [op@docker-rancher-client2 ~]$ sudo systemctl restart  ceph-osd.target 
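    # After the restart the OSDs should report in; a quick sanity check (sketch):
    ceph osd stat
    ceph -s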
    

    References:

    The ceph OSD deamon is not activated with ext4 file system

    http://tracker.ceph.com/issues/16187

    4. rbd create fails

    After the cluster was configured, rbd create kept failing like this:

    [op@docker-rancher-server ceph]$ rbd create docker-volume --size 1024
    2016-11-29 12:22:57.485826 7faa2e05c700  0 -- 10.142.246.2:0/3639593179 >> 10.142.246.5:6800/109587 pipe(0x7faa57bb46c0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7faa57bb5980).fault
    

    But the cluster as a whole looked healthy:

    [op@docker-rancher-server ceph]$ ceph osd tree
    ID WEIGHT  TYPE NAME                       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
    -1 3.84436 root default                                                      
    -2 0.96109     host docker-rancher-server                                    
     0 0.96109         osd.0                        up  1.00000          1.00000 
    -3 0.96109     host docker-rancher-client1                                   
     1 0.96109         osd.1                        up  1.00000          1.00000 
    -4 0.96109     host docker-rancher-client2                                   
     2 0.96109         osd.2                        up  1.00000          1.00000 
    -5 0.96109     host hub                                                      
     3 0.96109         osd.3                        up  1.00000          1.00000 
    [op@docker-rancher-server ceph]$ ceph -s
        cluster ef81681c-ee15-412e-a752-2c3e87b9e369
         health HEALTH_OK
         monmap e2: 3 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-client2=10.142.246.4:6789/0,docker-rancher-server=10.142.246.2:6789/0}
                election epoch 18, quorum 0,1,2 docker-rancher-server,docker-rancher-client1,docker-rancher-client2
         osdmap e82: 4 osds: 4 up, 4 in
                flags sortbitwise
          pgmap v42631: 64 pgs, 1 pools, 0 bytes data, 0 objects
                283 GB used, 3452 GB / 3936 GB avail
                      64 active+clean
    

    Switching to another node, docker-rancher-client1, and checking its OSD log:

    [root@docker-rancher-client1 ceph]# tail -f ceph-osd.1.log 
    2016-11-29 12:26:32.770143 7ffbc37c5700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:12.770139)
    2016-11-29 12:26:33.558821 7ffbaadfe700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:13.558819)
    2016-11-29 12:26:33.770524 7ffbc37c5700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:13.770520)
    2016-11-29 12:26:34.659291 7ffbaadfe700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:14.659289)
    2016-11-29 12:26:34.770786 7ffbc37c5700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:14.770781)
    

    It turned out that heartbeats to osd.3 never got a reply; after some digging this was a firewall problem.

    [op@hub hue-metadata]$ sudo iptables -I INPUT -p tcp --dport 6789 -j ACCEPT
    [op@hub hue-metadata]$ sudo iptables -I INPUT -p tcp -m multiport --dports 6800:7100 -j ACCEPT
    # Check the iptables filter table again
    [op@hub hue-metadata]$ sudo iptables -L -n
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination         
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6800:7100
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:6789
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8080
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8888
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8001
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:443
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10050
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
    ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0           
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
    REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
    ACCEPT     tcp  --  10.142.0.0/16        0.0.0.0/0            tcp dpt:6789
    ACCEPT     tcp  --  10.142.0.0/16        0.0.0.0/0            tcp dpts:6800:7300
    ACCEPT     all  --  10.142.0.0/16        0.0.0.0/0 
    # These ports had actually been opened before; the source-IP restriction was probably the problem. With the rules above any source address is accepted, and the issue is resolved.
    

    5. rbd map fails

    After creating the rbd image, mapping it failed:

    [op@docker-rancher-server ceph]$ rbd map docker-volume --pool rbd --id admin
    modprobe: ERROR: could not insert 'rbd': Operation not permitted
    rbd: failed to load rbd kernel module (1)
    rbd: sysfs write failed
    In some cases useful info is found in syslog - try "dmesg | tail" or so.
    rbd: map failed: (2) No such file or directory
    [op@docker-rancher-server ceph]$ sudo rbd map docker-volume --pool rbd --id admin
    rbd: sysfs write failed
    RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
    In some cases useful info is found in syslog - try "dmesg | tail" or so.
    rbd: map failed: (6) No such device or address
    

    Troubleshooting:

    Ceph rbd images come in two formats: 1 and 2.
    
    format 1 - the original format for new rbd images. It is compatible with all versions of librbd and the kernel module, but does not support newer features such as cloning.
    
    format 2 - the second rbd format, supported by librbd and kernel modules since 3.11 (except for split-off modules). It adds clone support, makes future extension easier, and allows new features to be added later.

    To use the newer rbd features, format 2 was chosen, and the error above occurred during map.
    
    Looking through the official documentation turned up the following:
    
    We installed the jewel release; a new format 2 rbd image supports the features below, all enabled by default:

    layering: layering support
    
    striping: striping v2 support
    
    exclusive-lock: exclusive locking support
    
    object-map: object map support (requires exclusive-lock)
    
    fast-diff: fast diff calculation (requires object-map)
    
    deep-flatten: snapshot flatten support
    
    journaling: journaled IO support (requires exclusive-lock)

    The system here is CentOS 7.1 with kernel 3.10.0-229.el7.x86_64; from the error message it is clear that this kernel does not support some of the format 2 features. The --image-feature option can be used to enable only the features you need instead of all of them; since we only need snapshots and the like, enabling layering alone is enough.
    
    In testing, kernel 3.10 supports only this feature (layering); the other features require a newer kernel, or a kernel rebuilt with the corresponding modules. A sketch of the resulting command is shown below.
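    
    Based on that, a format 2 image that stays mappable on this 3.10 kernel can be created by enabling only layering (a sketch, reusing the pool and image names from earlier in this document):
    
    rbd create docker-volume --size 1T --pool rbd --image-format 2 --image-feature layering
    sudo rbd map docker-volume --pool rbd --id admin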

    References

    1. Ceph jewel cluster: rbd map error troubleshooting

    2. RBD – MANAGE RADOS BLOCK DEVICE (RBD) IMAGES

    References

    1. Ceph documentation (Chinese)
    2. Ceph documentation (English)
    3. Install Ceph from a local repository