  • NetApp Storage Solution and Health-Check Commands

    I. MCC Overview

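    Clustered MetroCluster (MCC) is the active-active storage solution provided by NetApp Data ONTAP. The original design stretched a single FAS/V-Series dual-controller system across two data centers to form a remote HA pair, with only one controller node at each site. The two sites were connected through additional FC-VI cluster adapters, and the SAS disk shelves at the two data centers were connected through FibreBridge SAS-to-FC bridges. Within 500 m, or inside the same machine room, the sites were connected directly through Fibre Channel switches; beyond 500 m (up to 100 km), they were connected through Fibre Channel and DWDM switches.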


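          MetroCluster later evolved from this architecture. Two FAS/V dual-controller arrays are deployed, one at site A and one at site B. Controller A of array A forms a cluster pair with controller A of array B, and controller B of array A forms a pair with controller B of array B. This makes full use of the data-center resources at both sites, and both sites serve storage at the same time; however, controllers A and B inside the same array are not clustered with each other. If either controller in a cross-site pair fails, hosts at the failed site must access the controller at the remote site. If both nodes of a cross-site pair fail at the same time, service is interrupted.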

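          NetApp Data ONTAP 8.3 introduced a four-controller active-active solution that supports distances of up to 200 km. A four-controller MetroCluster first builds two local clusters from two HA pairs, then combines the two clusters into a four-node configuration. The controllers keep their in-memory logs in NVRAM, and NVRAM mirrors any log entries that have not yet been written to disk. After a node failure, the partner node in the HA pair can therefore take over; after a site failure, the remote HA pair can take over. When the log reaches a certain watermark, or a system operation triggers a flush, the data being written to disk is synchronously written to both the primary and secondary sites through SyncMirror. If the disks at one site fail, the disks at the other site can still serve I/O, which allows site failover without interrupting service.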


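          MetroCluster protects data by using mirroring and two clusters at different locations. Each cluster synchronously mirrors its data and its Storage Virtual Machine (SVM) configuration to the other cluster. When a disaster strikes one site, the administrator can activate the remote SVMs and take over service at the other site. In addition, the nodes in each cluster are configured as a local HA pair, which provides local failover.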


          NetApp MetroCluster is built on NetApp SyncMirror, working together with the Cluster_remote feature and the controllers' Clustered Failover feature.

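        • Clustered Failover – provides high-availability failover between the primary storage and the disaster-recovery storage. The decision to take over is made by the administrator through a single command.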

        • SyncMirror – keeps a real-time copy of the data on the remote storage, so that after a takeover the data can be accessed from the remote storage alone.

        • ClusterRemote – provides the management mechanism used to declare a disaster and initiate takeover by the remote storage.

    II. Common MCC Health-Check Commands

    1. Check system health status

    cluster1::> system health status show
    Status
    ---------------
    ok

    2. Check cluster status

    cluster1::> cluster show              
    Node                  Health  Eligibility
    --------------------- ------- ------------
    cluster1-01           true    true
    cluster1-02           true    true
    2 entries were displayed.
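
When this check is run on a schedule, the table can be scanned by a script instead of by eye. The following is a minimal sketch, assuming the output of `cluster show` has been captured as text (for example over SSH) and has the three-column layout shown above. The function name `unhealthy_nodes` is hypothetical, and the `false` value in the sample was edited in purely for illustration.

```python
def unhealthy_nodes(cluster_show_output: str) -> list[str]:
    """Return node names whose Health or Eligibility column is not 'true'."""
    bad = []
    in_table = False
    for line in cluster_show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):
            # The dashed ruler marks the start of the data rows.
            in_table = True
            continue
        if not in_table or not stripped or stripped.endswith("displayed."):
            continue
        fields = stripped.split()
        if len(fields) != 3:
            continue  # not a "node health eligibility" row
        node, health, eligibility = fields
        if health != "true" or eligibility != "true":
            bad.append(node)
    return bad


# Sample based on the output above; "false" is a made-up value for the demo.
SAMPLE = """\
Node                  Health  Eligibility
--------------------- ------- ------------
cluster1-01           true    true
cluster1-02           false   true
2 entries were displayed.
"""

if __name__ == "__main__":
    print(unhealthy_nodes(SAMPLE))
```

An empty list means every node reported `true` in both columns.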

    3. Check cluster statistics

    cluster1::> cluster statistics show
             Counter             Value         Delta
    ---------------- ----------------- -------------
           CPU Busy:                0%             -
         Operations:
              Total:                 0             -
                NFS:                 0             -
               CIFS:                 0             -
       Data Network:
               Busy:                0%             -
           Received:            5.78GB             -
               Sent:            13.7GB             -
    Cluster Network:
               Busy:                0%             -
           Received:             967KB             -
               Sent:             979KB             -
       Storage Disk:
               Read:            6.38PB             -
              Write:            6.26PB             -

    4. View aggregate (RAID) information

    cluster1::> aggr show
                                                                          
    
    Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
    --------- -------- --------- ----- ------- ------ ---------------- ------------
    aggr0_A1   953.8GB   247.3GB   74% online       1 cluster1-01      raid4,
                                                                       mirrored,
                                                                       normal
    aggr0_A2   953.8GB   247.3GB   74% online       1 cluster1-02      raid4,
                                                                       mirrored,
                                                                       normal
    aggr_data_A1 
               68.93TB   16.04TB   77% online      32 cluster1-01      mixed_raid_
                                                                       type,
                                                                       mirrored,
                                                                       hybrid,
                                                                       normal
    aggr_data_A2 
               68.93TB   14.77TB   79% online      31 cluster1-02      mixed_raid_
                                                                       type,
                                                                       mirrored,
                                                                       hybrid,
                                                                       normal
    4 entries were displayed.
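
Long aggregate names wrap onto their own line in this output, which makes simple column splitting unreliable. The sketch below shows one way to handle the wrapping, assuming the captured text has the layout shown above. The function names and the 75% threshold are illustrative choices, not NetApp recommendations.

```python
import re


def parse_aggr_show(output: str) -> list[dict]:
    """Parse `aggr show` text, including rows whose name wrapped to the previous line."""
    rows = []
    pending_name = None  # last single-token line; may be a wrapped aggregate name
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # Data rows are the only lines containing a bare percentage such as "74%".
        pct = next((i for i, t in enumerate(tokens) if re.fullmatch(r"\d+%", t)), None)
        if pct is None:
            if len(tokens) == 1:
                pending_name = tokens[0]
            continue
        if len(tokens) < pct + 4:
            continue  # truncated row, skip it
        if pct == 3:
            name = tokens[0]  # name is on the same line
        elif pct == 2 and pending_name:
            name = pending_name  # name was on the previous line
        else:
            continue
        rows.append({
            "name": name,
            "size": tokens[pct - 2],
            "available": tokens[pct - 1],
            "used_pct": int(tokens[pct][:-1]),
            "state": tokens[pct + 1],
            "node": tokens[pct + 3],
        })
        pending_name = None
    return rows


def over_threshold(rows: list[dict], limit: int) -> list[str]:
    """Names of aggregates whose Used% is above `limit`."""
    return [r["name"] for r in rows if r["used_pct"] > limit]


SAMPLE = """\
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0_A1   953.8GB   247.3GB   74% online       1 cluster1-01      raid4,
                                                                   mirrored,
                                                                   normal
aggr_data_A1
           68.93TB   16.04TB   77% online      32 cluster1-01      mixed_raid_
                                                                   type,
                                                                   mirrored,
                                                                   hybrid,
                                                                   normal
"""

if __name__ == "__main__":
    print(over_threshold(parse_aggr_show(SAMPLE), 75))
```

The parser keys on the `NN%` token rather than on column positions, so it does not depend on how wide the terminal was when the output was captured.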

    5. View node information

    cluster1::> node show
    Node      Health Eligibility Uptime        Model       Owner    Location  
    --------- ------ ----------- ------------- ----------- -------- ---------------
    cluster1-01 
              true   true        
                                369 days 19:12 FAS8040              gz_idc
    cluster1-02 
              true   true        
                                369 days 19:23 FAS8040              gz_idc
    2 entries were displayed.

    6. View version information

    cluster1::> version
    NetApp Release 8.3.2P9: Fri Jan 06 05:54:05 UTC 2017

    7. View serial numbers (license information)

    cluster1::> system license show
    
    Serial Number: 1-80-023992
    Owner: cluster1
    Package           Type    Description           Expiration
    ----------------- ------- --------------------- --------------------
    Base              license Cluster Base License  -
    
    Serial Number: 1-81-0000000000000451515******
    Package           Type    Description           Expiration
    ----------------- ------- --------------------- --------------------
    NFS               license NFS License           -
    iSCSI             license iSCSI License         -
    
    Serial Number: 1-81-0000000000000451515******
    Owner: cluster1-02
    Package           Type    Description           Expiration
    ----------------- ------- --------------------- --------------------
    NFS               license NFS License           -
    iSCSI             license iSCSI License         -
    5 entries were displayed.

    8. View subsystem health

    cluster1::> system health subsystem show
    Subsystem         Health
    ----------------- ------------------
    SAS-connect       ok
    Environment       ok
    Memory            ok
    Service-Processor ok
    Switch-Health     ok
    CIFS-NDO          ok
    Motherboard       ok
    IO                ok
    MetroCluster      ok
    MetroCluster_Node ok
    FHM-Switch        ok
    FHM-Bridge        ok
    12 entries were displayed.

    9. View MCC cluster status and node status

    cluster1::> metrocluster show
    
    Configuration: fabric
    
    Cluster                        Configuration State    Mode
    ------------------------------ ---------------------- ------------------------
     Local: cluster1               configured             normal
    Remote: cluster1_dr            configured             normal
    
    cluster1::> metrocluster node show
    DR                               Configuration  DR
    Group Cluster Node               State          Mirroring Mode
    ----- ------- ------------------ -------------- --------- --------------------
    1     cluster1
                  cluster1-01        configured     enabled   normal
                  cluster1-02        configured     enabled   normal
          cluster1_dr
                  cluster1_dr-01     configured     enabled   normal
                  cluster1_dr-02     configured     enabled   normal
    4 entries were displayed.
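
The `metrocluster node show` table nests nodes under their cluster, so a script has to carry the current cluster name from line to line. A minimal sketch follows, assuming the layout shown above, where the DR group and cluster name appear on their own lines. The function names are hypothetical, and treating `configured` / `enabled` / `normal` as the only healthy combination is an assumption drawn from this sample.

```python
def parse_mcc_nodes(output: str) -> list[dict]:
    """Parse `metrocluster node show` text into one dict per node."""
    nodes = []
    cluster = None
    in_table = False
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("---"):
            in_table = True
            continue
        if not in_table or line.rstrip().endswith("displayed."):
            continue
        if len(tokens) >= 4:
            # Node row: name, configuration state, DR mirroring, mode (may be several words).
            nodes.append({
                "cluster": cluster,
                "node": tokens[0],
                "state": tokens[1],
                "mirroring": tokens[2],
                "mode": " ".join(tokens[3:]),
            })
        elif len(tokens) == 2 and tokens[0].isdigit():
            cluster = tokens[1]  # "<DR group> <cluster>" line
        elif len(tokens) == 1:
            cluster = tokens[0]  # cluster name on its own line
    return nodes


def abnormal_nodes(nodes: list[dict]) -> list[str]:
    """Nodes that are not configured / enabled / normal (assumed healthy combination)."""
    healthy = ("configured", "enabled", "normal")
    return [n["node"] for n in nodes
            if (n["state"], n["mirroring"], n["mode"]) != healthy]


SAMPLE = """\
DR                               Configuration  DR
Group Cluster Node               State          Mirroring Mode
----- ------- ------------------ -------------- --------- --------------------
1     cluster1
              cluster1-01        configured     enabled   normal
              cluster1-02        configured     enabled   normal
      cluster1_dr
              cluster1_dr-01     configured     enabled   normal
              cluster1_dr-02     configured     enabled   normal
4 entries were displayed.
"""

if __name__ == "__main__":
    for n in parse_mcc_nodes(SAMPLE):
        print(n["cluster"], n["node"], n["mode"])
    print(abnormal_nodes(parse_mcc_nodes(SAMPLE)))
```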

    10. View controller status

    cluster1::> system controller show
    Controller Name           System ID     Serial Number     Model    Status      
    ------------------------- ------------- ----------------- -------- ----------- 
    cluster1-01               536964819     451515******      FAS8040  ok
    cluster1-02               536961600     451515******      FAS8040  ok
    2 entries were displayed.

    11. View failed disks

    cluster1::> storage disk show -broken 
    There are no entries matching your query.

    12. View spare disks

    cluster1::> storage disk show -spare  
    Original Owner: cluster1-01                                           
      Checksum Compatibility: block
                                                                Usable Physical
        Disk            HA Shelf Bay Chan   Pool  Type    RPM     Size     Size Owner
        --------------- ------------ ---- ------ ----- ------ -------- -------- --------
        1.30.11         3a    30  11    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.30.13         3a    30  13    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.31.4          3a    31   4    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.32.20         4b    32  20    B  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.32.23         3a    32  23    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.33.0          3a    33   0    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.33.1          3a    33   1    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        1.33.10         4b    33  10    B  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
        2.42.22         3a    42  22    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
        2.42.23         4b    42  23    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
        2.43.2          4b    43   2    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
        2.43.22         3b    43  22    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
        2.43.23         4b    43  23    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
        3.11.21         4b    11  21    B  Pool0   SSD      -  372.4GB  372.6GB cluster1-01
        4.20.21         3a    20  21    A  Pool1   SSD      -  372.4GB  372.6GB cluster1-01
        4.21.14         3a    21  14    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    Original Owner: cluster1-02
      Checksum Compatibility: block
                                                                Usable Physical
        Disk            HA Shelf Bay Chan   Pool  Type    RPM     Size     Size Owner
        --------------- ------------ ---- ------ ----- ------ -------- -------- --------
        2.44.23         3b    44  23    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-02
        3.12.21         4a    12  21    B  Pool0   SSD      -  372.4GB  372.6GB cluster1-02
        4.23.21         3b    23  21    A  Pool1   SSD      -  372.4GB  372.6GB cluster1-02
        5.60.23         3b    60  23    B  Pool1   SAS  10000   1.09TB   1.09TB cluster1-02
    20 entries were displayed.
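
In a mirrored configuration it is useful to know how many spares each node has per pool and disk type, not only the total. The sketch below counts them from the captured text, assuming the 11-column row layout shown above. The function name is hypothetical and the sample is a subset of the rows above.

```python
import re
from collections import Counter


def count_spares(output: str) -> Counter:
    """Count spare disks per (owner, pool, disk type)."""
    counts = Counter()
    for line in output.splitlines():
        tokens = line.split()
        # Disk rows have 11 columns; the sixth is "Pool0" or "Pool1".
        # The header row also has 11 tokens, but its sixth token is just "Pool".
        if len(tokens) == 11 and re.fullmatch(r"Pool\d", tokens[5]):
            pool, disk_type, owner = tokens[5], tokens[6], tokens[10]
            counts[(owner, pool, disk_type)] += 1
    return counts


SAMPLE = """\
Original Owner: cluster1-01
  Checksum Compatibility: block
                                                            Usable Physical
    Disk            HA Shelf Bay Chan   Pool  Type    RPM     Size     Size Owner
    --------------- ------------ ---- ------ ----- ------ -------- -------- --------
    1.30.11         3a    30  11    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    1.30.13         3a    30  13    A  Pool0   SAS  10000   1.09TB   1.09TB cluster1-01
    2.42.22         3a    42  22    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-01
    3.11.21         4b    11  21    B  Pool0   SSD      -  372.4GB  372.6GB cluster1-01
Original Owner: cluster1-02
  Checksum Compatibility: block
                                                            Usable Physical
    Disk            HA Shelf Bay Chan   Pool  Type    RPM     Size     Size Owner
    --------------- ------------ ---- ------ ----- ------ -------- -------- --------
    2.44.23         3b    44  23    A  Pool1   SAS  10000   1.09TB   1.09TB cluster1-02
    3.12.21         4a    12  21    B  Pool0   SSD      -  372.4GB  372.6GB cluster1-02
"""

if __name__ == "__main__":
    for (owner, pool, disk_type), n in sorted(count_spares(SAMPLE).items()):
        print(f"{owner} {pool} {disk_type}: {n}")
```

A combination that is absent from the result has no spares at all. In the full listing above, for example, cluster1-02 shows SAS spares only in Pool1.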

    13. Check SAS bridge status

    cluster1::> storage bridge show
                                           Is        Monitor
    Bridge                   Symbolic Name Monitored Status  Vendor Model                 Bridge WWN
    ------------------------ ------------- --------- ------- ------ --------------------- ----------------
    ATTO_10.0.15.17          BRIDGE_B_1
                                           true      ok      Atto   FibreBridge 6500N     2000001086627bc0
    ATTO_10.0.15.18          BRIDGE_B_2
                                           true      ok      Atto   FibreBridge 6500N     2000001086630f0e
    ATTO_10.0.15.19          BRIDGE_B_3
                                           true      ok      Atto   FibreBridge 6500N     2000001086630edc
    ATTO_10.0.15.20          BRIDGE_B_4
                                           true      ok      Atto   FibreBridge 6500N     2000001086630ed2
    ATTO_10.0.15.6           BRIDGE_A_1
                                           true      ok      Atto   FibreBridge 6500N     2000001086630eb4
    ATTO_10.0.15.7           BRIDGE_A_2
                                           true      ok      Atto   FibreBridge 6500N     2000001086630efa
    ATTO_10.0.15.8           BRIDGE_A_3
                                           true      ok      Atto   FibreBridge 6500N     2000001086630f18
    ATTO_10.0.15.9           BRIDGE_A_4
                                           true      ok      Atto   FibreBridge 6500N     2000001086630ef0
    ATTO_FibreBridge6500N_10 -
                                           false     -       Atto   FibreBridge6500N      200000108663e514
    ATTO_FibreBridge6500N_11 -
                                           false     -       Atto   FibreBridge6500N      200000108663e3f2
    ATTO_FibreBridge6500N_12 -
                                           false     -       Atto   FibreBridge6500N      200000108663e488
    ATTO_FibreBridge6500N_13 -
                                           false     -       Atto   FibreBridge6500N      20000010866114ec
    ATTO_FibreBridge6500N_14 -
                                           false     -       Atto   FibreBridge6500N      2000001086627bc0
    ATTO_FibreBridge6500N_7  -
                                           false     -       Atto   FibreBridge6500N      2000001086630e96
    ATTO_FibreBridge6500N_9  -
                                           false     -       Atto   FibreBridge6500N      200000108663e4c4
    15 entries were displayed.
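
Each bridge takes two lines in this table, and the listing mixes monitored bridges with unmonitored entries. The sketch below pairs the two lines and reports unmonitored entries and WWNs that appear more than once, assuming the layout shown above. The function names are hypothetical. Reading a repeated WWN as a leftover entry for the same physical bridge is only a guess that should be confirmed on the system.

```python
from collections import defaultdict


def parse_bridges(output: str) -> list[dict]:
    """Pair each name line with the detail line that follows it."""
    bridges = []
    prev = None  # tokens of the previous non-empty line
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] in ("true", "false") and prev and len(tokens) >= 5:
            bridges.append({
                "name": prev[0],
                "symbolic": prev[1] if len(prev) > 1 else "-",
                "monitored": tokens[0] == "true",
                "status": tokens[1],
                "wwn": tokens[-1],  # the model may contain a space, so index from the end
            })
            prev = None
        else:
            prev = tokens
    return bridges


def duplicate_wwns(bridges: list[dict]) -> dict:
    """Map each WWN that occurs more than once to the bridge names using it."""
    by_wwn = defaultdict(list)
    for b in bridges:
        by_wwn[b["wwn"]].append(b["name"])
    return {wwn: names for wwn, names in by_wwn.items() if len(names) > 1}


SAMPLE = """\
                                       Is        Monitor
Bridge                   Symbolic Name Monitored Status  Vendor Model                 Bridge WWN
------------------------ ------------- --------- ------- ------ --------------------- ----------------
ATTO_10.0.15.17          BRIDGE_B_1
                                       true      ok      Atto   FibreBridge 6500N     2000001086627bc0
ATTO_10.0.15.18          BRIDGE_B_2
                                       true      ok      Atto   FibreBridge 6500N     2000001086630f0e
ATTO_FibreBridge6500N_14 -
                                       false     -       Atto   FibreBridge6500N      2000001086627bc0
"""

if __name__ == "__main__":
    bridges = parse_bridges(SAMPLE)
    print([b["name"] for b in bridges if not b["monitored"]])
    print(duplicate_wwns(bridges))
```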

    14. Check Fibre Channel switch status

    cluster1::> storage switch show
                          Symbolic                                Is        Monitor
    Switch                Name     Vendor  Model Switch WWN       Monitored Status
    --------------------- -------- ------- ----- ---------------- --------- -------
    Brocade_10.0.15.10
                          SW_A_1
                                   Brocade Brocade6505
                                                 100050eb1a88327f true      ok
    Brocade_10.0.15.11
                          SW_A_2
                                   Brocade Brocade6505
                                                 100050eb1a881582 true      ok
    Brocade_10.0.15.21
                          SW_B_3
                                   Brocade Brocade6505
                                                 100050eb1a882f69 true      ok
    Brocade_10.0.15.22
                          SW_B_4
                                   Brocade Brocade6505
                                                 100050eb1a881522 true      ok
    4 entries were displayed.

    15. View failover status

    cluster1::> storage failover show 
                                  Takeover          
    Node           Partner        Possible State Description  
    -------------- -------------- -------- -------------------------------------
    cluster1-01    cluster1-02    true     Connected to cluster1-02
    cluster1-02    cluster1-01    true     Connected to cluster1-01
    2 entries were displayed.
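
The column that matters most here is "Takeover Possible". The sketch below extracts it per node, assuming the layout shown above. It also assumes that a long state description may continue on following lines, and joins any such continuation to the previous row. The function names are hypothetical.

```python
def parse_failover(output: str) -> list[dict]:
    """Parse `storage failover show` text into one dict per node."""
    rows = []
    in_table = False
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("---"):
            in_table = True
            continue
        if not in_table or line.rstrip().endswith("displayed."):
            continue
        if len(tokens) >= 3 and tokens[2] in ("true", "false"):
            rows.append({
                "node": tokens[0],
                "partner": tokens[1],
                "takeover_possible": tokens[2] == "true",
                "state": " ".join(tokens[3:]),
            })
        elif rows:
            # Assumed continuation of the previous row's state description.
            rows[-1]["state"] += " " + " ".join(tokens)
    return rows


def takeover_blocked(rows: list[dict]) -> list[str]:
    """Nodes for which takeover is reported as not possible."""
    return [r["node"] for r in rows if not r["takeover_possible"]]


SAMPLE = """\
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
cluster1-01    cluster1-02    true     Connected to cluster1-02
cluster1-02    cluster1-01    true     Connected to cluster1-01
2 entries were displayed.
"""

if __name__ == "__main__":
    print(takeover_blocked(parse_failover(SAMPLE)))
```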

    16. View critical and error event logs

    cluster1::> event log show -severity critical 
    There are no entries matching your query.
    
    cluster1::> event log show -severity error
    Time                Node             Severity      Event
    ------------------- ---------------- ------------- ---------------------------
    3/6/2018 02:28:30   cluster1-02      ERROR         asup.post.drop: AutoSupport message (HA Group Notification from cluster1-02 (MANAGEMENT_LOG) INFO) for host (0) was not posted to NetApp. The system will drop the message.
    3/6/2018 01:28:18   cluster1-02      ERROR         asup.post.drop: AutoSupport message (HA Group Notification from cluster1-02 (PERFORMANCE DATA) INFO) for host (0) was not posted to NetApp. The system will drop the message.
    3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) cluster1, Serial Number 5589765F, Certificate Authority 'cluster1' and type server for Vserver cluster1 has expired.
    3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UC_SVM2, Serial Number 55A03966, Certificate Authority 'SVM2' and type server for Vserver SVM2 has expired.
    3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UC_SVM, Serial Number 559FFD76, Certificate Authority 'SVM' and type server for Vserver SVM has expired.
    3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UCS_SVM_DR, Serial Number 545845C16E278, Certificate Authority 'SVM_DR' and type server for Vserver SVM_DR-mc has expired.
    3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UCS_SVM2_DR, Serial Number 545845A7B01FA, Certificate Authority 'SVM2_DR' and type server for Vserver SVM2_DR-mc has expired.
    7 entries were displayed.
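
Error logs like this one tend to repeat the same event many times, so grouping by event name makes the listing easier to review. The sketch below counts entries per node, severity and event name, assuming each entry starts with a `M/D/YYYY HH:MM:SS` timestamp as shown above. The function name is hypothetical, and the sample reuses three of the lines above.

```python
import re
from collections import Counter

# timestamp, node, severity, then the event name up to the first colon
EVENT = re.compile(
    r"^\s*(\d{1,2}/\d{1,2}/\d{4} \d{2}:\d{2}:\d{2})\s+(\S+)\s+(\S+)\s+([\w.]+):"
)


def count_events(output: str) -> Counter:
    """Count log entries per (node, severity, event name)."""
    counts = Counter()
    for line in output.splitlines():
        match = EVENT.match(line)
        if match:
            _, node, severity, name = match.groups()
            counts[(node, severity, name)] += 1
    return counts


SAMPLE = """\
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
3/6/2018 02:28:30   cluster1-02      ERROR         asup.post.drop: AutoSupport message (HA Group Notification from cluster1-02 (MANAGEMENT_LOG) INFO) for host (0) was not posted to NetApp. The system will drop the message.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) cluster1, Serial Number 5589765F, Certificate Authority 'cluster1' and type server for Vserver cluster1 has expired.
3/6/2018 00:00:07   cluster1-02      ERROR         mgmtgwd.certificate.expired: A digital certificate with Fully Qualified Domain Name (FQDN) UC_SVM2, Serial Number 55A03966, Certificate Authority 'SVM2' and type server for Vserver SVM2 has expired.
"""

if __name__ == "__main__":
    for (node, severity, name), n in count_events(SAMPLE).most_common():
        print(f"{n:3d}  {node}  {severity}  {name}")
```

Applied to the full listing above, this kind of grouping would reduce the seven entries to two event names: `asup.post.drop` and `mgmtgwd.certificate.expired`.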

    17. View volume status for a given aggregate

    cluster1::> vol show -aggregate aggr_data_A1

    18. View LUN information and LUN details

    cluster1::> lun show
    cluster1::> lun show -v

    19. View igroup (mapping) information and details

    cluster1::> igroup show
    cluster1::> igroup show -v

    20. View LUN mappings

    cluster1::> lun show -m

    21. Enter the nodeshell of a node

    cluster1::> run -node cluster1-01 
    Type 'exit' or 'Ctrl-D' to return to the CLI
    cluster1-01>

    22. View spare disks from the nodeshell

    cluster1-01> vol status -s
    
    Local spares
    
    Pool1 spare disks
    
    RAID Disk       Device                  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
    ---------       ------                  ------------- ---- ---- ---- ----- --------------    --------------
    Spare disks for block checksum
    spare           SW_B_3:6.126L41         3a    21  14  FC:A   1   SAS 10000 1142352/2339537408 1144641/2344225968 (not zeroed)
    spare           SW_B_3:7.126L75         3a    42  22  FC:A   1   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_B_3:7.126L101        3b    43  22  FC:A   1   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_B_4:7.126L76         4b    42  23  FC:B   1   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_B_4:7.126L29         4b    43  2   FC:B   1   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_B_4:7.126L50         4b    43  23  FC:B   1   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_B_3:6.126L22         3a    20  21  FC:A   1   SSD   N/A 381304/780910592  381554/781422768 
    
    Pool0 spare disks
    
    RAID Disk       Device                  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
    ---------       ------                  ------------- ---- ---- ---- ----- --------------    --------------
    Spare disks for block checksum
    spare           SW_A_1:7.126L12         3a    30  11  FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_1:7.126L14         3a    30  13  FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_1:7.126L31         3a    31  4   FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_1:7.126L76         3a    32  23  FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_1:7.126L79         3a    33  0   FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_1:7.126L80         3a    33  1   FC:A   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_2:7.126L73         4b    32  20  FC:B   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_2:7.126L37         4b    33  10  FC:B   0   SAS 10000 1142352/2339537408 1144641/2344225968 
    spare           SW_A_2:6.126L74         4b    11  21  FC:B   0   SSD   N/A 381304/780910592  381554/781422768

    23. View failed disks from the nodeshell

    cluster1-01> vol status -f
    
    Broken disks (empty)

    24. Show disks that have no ownership assigned

    cluster1-01> disk show -n
    
    disk show : No unassigned disks

    25. Assign disk ownership (commonly used after a disk replacement)

    cluster1-01> disk assign all 

    26. View location (path) information for all disks

    cluster1-01> storage show disk -p 
  • Original article: https://www.cnblogs.com/cloudos/p/8515574.html