  • Using MegaCLI to check disk status and replace disks

    https://my.oschina.net/adailinux/blog/2231519

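    I previously wrote an article on the procedure for replacing disks in a production server; in that case every disk in the machine was replaced. Recently some disks failed on another machine whose RAID level is 10, and inspection showed that only the failed disks needed replacing. This document supplements the earlier one.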

    Installing MegaCLI

    Installation package: download link.

    Installation steps

    # First download the installation package
    # Extract it
    $ tar -zxf MegaCli8.07.10.tar.gz
    $ cd MegaCli8.07.10/Linux/
    $ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm
    
    # Add MegaCli to the PATH
    $ ln -s /opt/MegaRAID/MegaCli/MegaCli64 /usr/local/bin/MegaCli 
    $ MegaCli -v                               
          MegaCLI SAS RAID Management Tool  Ver 8.02.21 Oct 21, 2011
    
        (c)Copyright 2011, LSI Corporation, All Rights Reserved.
    
    Exit Code: 0x00
    # Installation complete!
    
    • Handling conflicts:

      $ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm 
      Preparing...                          ################################# [100%]
      	file /opt/lsi/3rdpartylibs/x86_64/libsysfs.so.2.0.2 from install of Lib_Utils-1.00-09.noarch conflicts with file from package srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64
      
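    • Cause: Lib_Utils conflicts with the srvadmin package that ships with Dell servers. Uninstall the conflicting package, then run the installation again.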

      rpm -e srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64 --nodeps
      

    Usage guide

    Basic usage

    # Show the RAID level
    $ MegaCli -LDInfo -Lall -aALL
    
    # Show RAID controller information
    $ MegaCli -AdpAllInfo -aALL
    
    # Show disk information
    $ MegaCli -PDList -aALL
    
    # Show battery information
    $ MegaCli -AdpBbuCmd -aAll
    
    # Show the RAID controller log
    $ MegaCli -FwTermLog -Dsply -aALL
    
    # Show the number of adapters
    $ MegaCli -adpCount
    
    # Show the adapter time
    $ MegaCli -AdpGetTime -aALL
    
    # Show information for all adapters
    $ MegaCli -AdpAllInfo -aAll
    
    # Show information for all logical disk groups
    $ MegaCli -LDInfo -LALL -aAll
    
    # Show information for all physical disks
    $ MegaCli -PDList -aAll
    
    # Show the battery charging status
    $ MegaCli -AdpBbuCmd -GetBbuStatus -aALL | grep 'Charger Status'
    
    # Show BBU status
    $ MegaCli -AdpBbuCmd -GetBbuStatus -aALL
    
    # Show BBU capacity information
    $ MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL
    
    # Show BBU design parameters
    $ MegaCli -AdpBbuCmd -GetBbuDesignInfo -aALL
    
    # Show current BBU properties
    $ MegaCli -AdpBbuCmd -GetBbuProperties -aALL
    
    # Show the RAID controller model, RAID configuration, and disk information
    $ MegaCli -cfgdsply -aALL
    ## How the states change during a replacement, from pulling the failed disk to inserting the new one.
    Device           | Normal  | Damage              | Rebuild  | Normal
    Virtual Drive    | Optimal | Degraded            | Degraded | Optimal
    Physical Drive   | Online  | Failed Unconfigured | Rebuild  | Online
    
    # Check the state of a physical disk that is rebuilding (E = Enclosure Device ID, S = Slot Number):
    $ MegaCli -PDRbld -ShowProg -PhysDrv [E:S] -a0
    ## A disk that is rebuilding reports: "Firmware state: Rebuild"
    
    # Query rebuild progress:
    $ MegaCli -pdrbld -showprog -physdrv[E:S] -aALL
    ## The output looks like this:
    Rebuild Progress on Device at Enclosure 32, Slot 5 Completed 77% in 101 Minutes.
    
    # Show rebuild progress as a text progress bar:
    $ MegaCli -pdrbld -progdsply -physdrv[E:S] -aALL
    ## The screen shows something like this:
    Rebuild progress of physical drives...
    Enclosure:Slot               Percent Complete                       Time Elps
          032 :05   #######################87 %################*******  01:59:07 
    Press key to quit...
    
    # Show the RAID controller's rebuild settings:
    $ MegaCli -AdpAllinfo -aALL | grep -i rebuild
    ## The output looks like this:
    Rebuild Rate                     : 30%
    Auto Rebuild                     : Enabled
    Rebuild Rate                     : Yes
    Force Rebuild                    : Yes
    
    # Set the RAID controller's rebuild rate to 60%:
    $ MegaCli -AdpSetProp { RebuildRate -60} -aALL
    ## On success it returns:
    Adapter 0: Set rebuild rate to 60% success.
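
    The "Completed N% in M Minutes" line returned by -ShowProg above is enough for a rough estimate of the time remaining. The sketch below assumes the rebuild keeps going at its average rate so far, which real rebuilds rarely do, so treat the number as a ballpark. The function name rebuild_eta is hypothetical, and it only understands the line format shown above.

```shell
#!/bin/sh
# rebuild_eta: read MegaCli -ShowProg output on stdin and print a rough
# estimate of the minutes remaining, assuming a constant rebuild rate.
# (Hypothetical helper; it only parses "Completed N% in M Minutes".)
rebuild_eta() {
    awk '
        /Completed [0-9]+% in [0-9]+ Minutes/ {
            pct = 0; mins = 0
            for (i = 1; i < NF; i++) {
                if ($i == "Completed") { pct = $(i + 1); sub(/%/, "", pct) }
                if ($i == "in")        { mins = $(i + 1) }
            }
            if (pct + 0 > 0) {
                # remaining = elapsed * (100 - pct) / pct
                printf "%d%% done, about %d minutes remaining\n", pct, mins * (100 - pct) / pct
            }
        }
    '
}

# Example with the sample line from this section (77% after 101 minutes).
echo "Rebuild Progress on Device at Enclosure 32, Slot 5 Completed 77% in 101 Minutes." | rebuild_eta
# prints: 77% done, about 30 minutes remaining
```

    On a live system the input would come from the tool itself, for example MegaCli -PDRbld -ShowProg -PhysDrv[E:S] -aALL | rebuild_eta.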
    

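    MegaCLI usage reference: http://blog.51cto.com/daixuan/1863567

    Important parameters

    Parameter                                     | Meaning
    Firmware state                                | Disk state
    Firmware state: Online, Spun Up               | Disk is healthy
    Firmware state: Unconfigured(good), Spun Up   | Disk is installed but not in use
    Firmware state: Unconfigured(bad)             | Fault; corresponds to Non-Critical in hwcheck
    Firmware state: Failed                        | Fault; corresponds to Critical in hwcheck
    Firmware state: Rebuild                       | Rebuilding; usually seen while a disk is being replaced
    Enclosure Device ID: 32                       | Enclosure (device) ID
    Slot Number: 1                                | Slot the disk occupies in the server
    Adapter #0                                    | Adapter number; corresponds to the -a option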
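
    The Firmware state values above can be pulled out of the -PDList output one slot at a time, which is easier to scan than the full listing. A minimal sketch, assuming the field labels shown in this article ("Enclosure Device ID", "Slot Number", "Firmware state"). The function name pd_states is hypothetical, and it reads the listing from standard input so it can be tried on saved output.

```shell
#!/bin/sh
# pd_states: read `MegaCli -PDList -aAll` output on stdin and print one
# line per physical disk as "enclosure:slot<TAB>firmware state".
# (Hypothetical helper; relies on the field labels shown in this article.)
pd_states() {
    awk -F': *' '
        /^ *Enclosure Device ID:/ { enc = $2 }
        /^ *Slot Number:/         { slot = $2 }
        /^ *Firmware state:/      { printf "%s:%s\t%s\n", enc, slot, $2 }
    '
}

# Try it on a short saved sample (two disks, other fields omitted).
pd_states <<'EOF'
Enclosure Device ID: 32
Slot Number: 1
Firmware state: Online, Spun Up
Enclosure Device ID: 32
Slot Number: 2
Firmware state: Unconfigured(bad)
EOF
```

    On a live system the input would be MegaCli -PDList -aAll -NoLog | pd_states, and appending | grep -v Online leaves only the disks that need attention.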

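    Hands-on: replacing a disk in a RAID 10 environment

    Replacing a disk under RAID 10 is fairly simple. Hot swapping is supported, so the failed disk can be pulled and replaced directly. The steps are as follows.

    Environment

    Server: R720

    OS: CentOS7

    RAID type: RAID 10

    Viewing disk information

    To show the procedure as clearly as possible, the output below has not been abbreviated.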

    $ MegaCli -PDList -aAll -NoLog
                                         
    Adapter #0
    
    Enclosure Device ID: 32
    Slot Number: 0
    Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
    Enclosure position: 0
    Device Id: 0
    WWN: 5000C50076CD09B4
    Sequence Number: 1
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 28
    Last Predictive Failure Event Seq Number: 4378
    PD Type: SAS
    Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
    Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
    Coerced Size: 558.375 GB [0x45cc0000 Sectors]
    Firmware state: Unconfigured(good), Spun Up
    Device Firmware Level: ES66
    Shield Counter: 0
    Successful diagnostics completion on :  N/A
    SAS Address(0): 0x5000c50076cd09b5
    SAS Address(1): 0x0
    Connected Port Number: 5(path0) 
    Inquiry Data: SEAGATE ST3600057SS     ES666SL8SASQ            
    FDE Enable: Disable
    Secured: Unsecured
    Locked: Unlocked
    Needs EKM Attention: No
    Foreign State: Foreign 
    Foreign Secure: Drive is not secured by a foreign lock key
    Device Speed: 6.0Gb/s 
    Link Speed: 6.0Gb/s 
    Media Type: Hard Disk Device
    Drive Temperature :40C (104.00 F)
    PI Eligibility:  No 
    Drive is formatted for PI information:  No
    PI: No PI
    Drive's write cache : Disabled
    Port-0 :
    Port status: Active
    Port's Linkspeed: 6.0Gb/s 
    Port-1 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Drive has flagged a S.M.A.R.T alert : Yes
    
    
    Enclosure Device ID: 32
    Slot Number: 2
    Enclosure position: 0
    Device Id: 2
    WWN: 5000C50076CD05BC
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    PD Type: SAS
    Raw Size: 0 KB [0x0 Sectors]
    Non Coerced Size: 0 KB [0x0 Sectors]
    Coerced Size: 0 KB [0x0 Sectors]
    Firmware state: Unconfigured(bad)
    Device Firmware Level: ES66
    Shield Counter: 0
    Successful diagnostics completion on :  N/A
    SAS Address(0): 0x5000c50076cd05bd
    SAS Address(1): 0x0
    Connected Port Number: 1(path0) 
    Inquiry Data: SEAGATE ST3600057SS     ES666SL8SAVC            
    FDE Enable: Disable
    Secured: Unsecured
    Locked: Unlocked
    Needs EKM Attention: No
    Foreign State: None 
    Device Speed: Unknown 
    Link Speed: Unknown 
    Media Type: Hard Disk Device
    Drive:  Not Supported
    Drive Temperature :0C (32.00 F)
    PI Eligibility:  No 
    Drive is formatted for PI information:  No
    PI: No PI
    Drive's write cache : Disabled
    Port-0 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Port-1 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Drive has flagged a S.M.A.R.T alert : No
    
    
    Enclosure Device ID: 32
    Slot Number: 1
    Drive's postion: DiskGroup: 0, Span: 0, Arm: 1
    Enclosure position: 0
    Device Id: 1
    WWN: 5000C500983873BC
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    PD Type: SAS
    Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
    Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
    Coerced Size: 558.375 GB [0x45cc0000 Sectors]
    Firmware state: Online, Spun Up
    Device Firmware Level: VT31
    Shield Counter: 0
    Successful diagnostics completion on :  N/A
    SAS Address(0): 0x5000c500983873bd
    SAS Address(1): 0x0
    Connected Port Number: 3(path0) 
    Inquiry Data: SEAGATE ST600MP0005     VT31S7M1CSLT            
    FDE Enable: Disable
    Secured: Unsecured
    Locked: Unlocked
    Needs EKM Attention: No
    Foreign State: None 
    Device Speed: Unknown 
    Link Speed: 6.0Gb/s 
    Media Type: Hard Disk Device
    Drive Temperature :41C (105.80 F)
    PI Eligibility:  No 
    Drive is formatted for PI information:  No
    PI: No PI
    Drive's write cache : Disabled
    Port-0 :
    Port status: Active
    Port's Linkspeed: 6.0Gb/s 
    Port-1 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Drive has flagged a S.M.A.R.T alert : No
    
    
    Enclosure Device ID: 32
    Slot Number: 3
    Drive's postion: DiskGroup: 0, Span: 1, Arm: 1
    Enclosure position: 0
    Device Id: 3
    WWN: 5000C50076CE2F30
    Sequence Number: 2
    Media Error Count: 5
    Other Error Count: 71
    Predictive Failure Count: 15
    Last Predictive Failure Event Seq Number: 4379
    PD Type: SAS
    Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
    Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
    Coerced Size: 558.375 GB [0x45cc0000 Sectors]
    Firmware state: Online, Spun Up
    Device Firmware Level: ES66
    Shield Counter: 0
    Successful diagnostics completion on :  N/A
    SAS Address(0): 0x5000c50076ce2f31
    SAS Address(1): 0x0
    Connected Port Number: 2(path0) 
    Inquiry Data: SEAGATE ST3600057SS     ES666SL8SAKA            
    FDE Enable: Disable
    Secured: Unsecured
    Locked: Unlocked
    Needs EKM Attention: No
    Foreign State: None 
    Device Speed: 6.0Gb/s 
    Link Speed: 6.0Gb/s 
    Media Type: Hard Disk Device
    Drive Temperature :48C (118.40 F)
    PI Eligibility:  No 
    Drive is formatted for PI information:  No
    PI: No PI
    Drive's write cache : Disabled
    Port-0 :
    Port status: Active
    Port's Linkspeed: 6.0Gb/s 
    Port-1 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Drive has flagged a S.M.A.R.T alert : Yes
    
    
    
    Enclosure Device ID: 32
    Slot Number: 4
    Drive's postion: DiskGroup: 1, Span: 0, Arm: 0
    Enclosure position: 0
    Device Id: 4
    WWN: 5000C5007E70F0F8
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    PD Type: SAS
    Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
    Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
    Coerced Size: 558.375 GB [0x45cc0000 Sectors]
    Firmware state: Online, Spun Up
    Device Firmware Level: ES66
    Shield Counter: 0
    Successful diagnostics completion on :  N/A
    SAS Address(0): 0x5000c5007e70f0f9
    SAS Address(1): 0x0
    Connected Port Number: 0(path0) 
    Inquiry Data: SEAGATE ST3600057SS     ES666SL9F1JB            
    FDE Enable: Disable
    Secured: Unsecured
    Locked: Unlocked
    Needs EKM Attention: No
    Foreign State: None 
    Device Speed: 6.0Gb/s 
    Link Speed: 6.0Gb/s 
    Media Type: Hard Disk Device
    Drive Temperature :46C (114.80 F)
    PI Eligibility:  No 
    Drive is formatted for PI information:  No
    PI: No PI
    Drive's write cache : Disabled
    Port-0 :
    Port status: Active
    Port's Linkspeed: 6.0Gb/s 
    Port-1 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Drive has flagged a S.M.A.R.T alert : No
    
    
    
    Enclosure Device ID: 32
    Slot Number: 5
    Drive's postion: DiskGroup: 1, Span: 0, Arm: 1
    Enclosure position: 0
    Device Id: 5
    WWN: 5000C5007E708E3C
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    PD Type: SAS
    Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
    Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
    Coerced Size: 558.375 GB [0x45cc0000 Sectors]
    Firmware state: Online, Spun Up
    Device Firmware Level: ES66
    Shield Counter: 0
    Successful diagnostics completion on :  N/A
    SAS Address(0): 0x5000c5007e708e3d
    SAS Address(1): 0x0
    Connected Port Number: 4(path0) 
    Inquiry Data: SEAGATE ST3600057SS     ES666SL9F2RB            
    FDE Enable: Disable
    Secured: Unsecured
    Locked: Unlocked
    Needs EKM Attention: No
    Foreign State: None 
    Device Speed: 6.0Gb/s 
    Link Speed: 6.0Gb/s 
    Media Type: Hard Disk Device
    Drive Temperature :45C (113.00 F)
    PI Eligibility:  No 
    Drive is formatted for PI information:  No
    PI: No PI
    Drive's write cache : Disabled
    Port-0 :
    Port status: Active
    Port's Linkspeed: 6.0Gb/s 
    Port-1 :
    Port status: Active
    Port's Linkspeed: Unknown 
    Drive has flagged a S.M.A.R.T alert : No
    
    Exit Code: 0x00
    

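    The output above shows that the server has 6 disks (Device Id 0 through 5).

    Taking the failed disks offline

    $ MegaCli -PDOffline -PhysDrv[32:2] -a0
    $ MegaCli -PDOffline -PhysDrv[32:0] -a0
    

    How 32, 2, and -a0 in the first command map to the disk listing: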

    Adapter #0
    Enclosure Device ID: 32
    Slot Number: 2
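
    The mapping above can be sketched as a small lookup that turns a slot number into the matching command arguments. This is only an illustration: the function name physdrv_args is hypothetical, it assumes the "Adapter #N", "Enclosure Device ID" and "Slot Number" labels shown in the listing, and it prints the arguments without running anything.

```shell
#!/bin/sh
# physdrv_args: read `MegaCli -PDList -aAll` output on stdin and print the
# "-PhysDrv[E:S] -aN" arguments for the slot number given as $1.
# (Hypothetical helper; it only prints the arguments and runs nothing.)
physdrv_args() {
    awk -v want="$1" '
        /^ *Adapter #[0-9]+/      { sub(/^ *Adapter #/, ""); adapter = $0 + 0 }
        /^ *Enclosure Device ID:/ { enc = $NF }
        /^ *Slot Number:/ && $NF == want {
            printf "-PhysDrv[%s:%s] -a%d\n", enc, $NF, adapter
        }
    '
}

# Look up slot 2 in a short saved sample of the listing.
physdrv_args 2 <<'EOF'
Adapter #0

Enclosure Device ID: 32
Slot Number: 2
EOF
# prints: -PhysDrv[32:2] -a0
```

    Printing the arguments instead of executing them leaves the decision to take a disk offline with the operator, who can check the slot against the physical server first.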
    

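    Replacing the failed disk

    At this point the failed disk is OFFLINE. On site, the failed disk's LED blinks amber while the healthy disks show green. Pull the failed disk and insert a good one. The new disk's LED blinks green and the disk spins up quickly, which indicates that it is rebuilding. Check the state as follows: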

    $ MegaCli -PDList -aAll -NoLog
    ...
    Enclosure Device ID: 32
    Slot Number: 3
    ...
    Firmware state: Rebuild
    ...
    

    Checking rebuild progress

    $ MegaCli -PDRbld -ShowProg -PhysDrv[32:2] -aAll
    
    Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 16% in 94 Minutes.
    

    Disk replacement complete

    $ MegaCli -PDList -aAll -NoLog | grep 'Firmware state'
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
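
    The final check above can also be expressed as an exit status, which is convenient in scripts or monitoring jobs. A minimal sketch, assuming every healthy disk reports a state beginning with "Online" as in the output above. The function name all_online is hypothetical.

```shell
#!/bin/sh
# all_online: read `MegaCli -PDList -aAll` output on stdin, print a summary,
# and return 0 only if at least one disk is listed and every disk is Online.
# (Hypothetical helper for scripts or monitoring checks.)
all_online() {
    awk '
        /Firmware state:/ {
            total++
            if ($0 !~ /Firmware state: Online/) bad++
        }
        END {
            printf "%d of %d disks online\n", total - bad, total
            exit (total == 0 || bad > 0)
        }
    '
}

# Sample: one disk is still rebuilding, so the check fails.
if all_online <<'EOF'
Firmware state: Online, Spun Up
Firmware state: Rebuild
EOF
then
    echo "array healthy"
else
    echo "array needs attention"
fi
```

    On a live system the input would be MegaCli -PDList -aAll -NoLog | all_online.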

  • Original article: https://www.cnblogs.com/xiaodoujiaohome/p/11729197.html