  • 3.16 Configuring and Testing Automatic Failover for HDFS HA with ZooKeeper

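    I. Overview

    The previous section built the HA architecture, but switching between active and standby still had to be done by hand.

    This section uses ZooKeeper to make the failover automatic:

    #
    When HA starts without automatic failover, both NameNodes come up in standby state. ZooKeeper's election mechanism can be used to pick one of them as active.

    #
    Monitoring
        ZKFC (ZKFailoverController): one runs alongside each NameNode, monitors its health, and takes part in the ZooKeeper election on its behalf.
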



    II. Configuration

    1. hdfs-site.xml

    # Enable automatic failover by adding the following:

    <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
    </property>
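
To confirm the setting before restarting anything, the value can be read back from the file. The following is a minimal sketch, assuming each <name> and <value> sits on its own line as in the fragment above. get_prop is a hypothetical helper, not a Hadoop command.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Hypothetical helper; assumes one tag per line, as in the fragment above.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$1"
}

# Demo on a throwaway copy of the fragment.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<configuration>
<property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
</property>
</configuration>
EOF
get_prop "$tmp" dfs.ha.automatic-failover.enabled
rm -f "$tmp"
```

On a configured node, bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled should report the value Hadoop actually resolves, which is the more reliable check once the client is installed.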


    2. core-site.xml

    # Point failover at the ZooKeeper ensemble by adding the following:

    <property>
            <name>ha.zookeeper.quorum</name>
            <value>master:2181,slave1:2181,slave2:2181</value>
    </property>
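
The quorum value is a comma-separated list of host:port pairs. As a rough sketch, it can be split and each server probed with ZooKeeper's ruok four-letter command. split_quorum and probe_quorum are hypothetical helper names, and the probe assumes nc (netcat) is installed on the node.

```shell
#!/bin/sh
# split_quorum "h1:p1,h2:p2": print one "host port" pair per line.
split_quorum() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F: 'NF == 2 { print $1, $2 }'
}

# probe_quorum QUORUM: send ZooKeeper's "ruok" command to every server.
# Assumes nc (netcat) is installed; a healthy server answers "imok".
probe_quorum() {
    split_quorum "$1" | while read -r host port; do
        reply=$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)
        echo "$host:$port ${reply:-no reply}"
    done
}

# Only the parsing step runs here; it needs no live cluster.
split_quorum "master:2181,slave1:2181,slave2:2181"
```

probe_quorum is only meaningful once ZooKeeper has been started in step 5; before that every server would show "no reply".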


    3. Stop all cluster services

    # on master
    [root@master hadoop-2.5.0]# sbin/stop-dfs.sh
    
    [root@master ~]# xcall jps
    ====== master jps ======
    18719 Jps
    ====== slave1 jps ======
    19150 Jps
    ====== slave2 jps ======
    13595 Jps
    
    # If any other services (ZooKeeper, etc.) are still running, stop them as well.


    4. Sync the configuration files

    [root@master hadoop]# pwd
    /opt/app/hadoop-2.5.0/etc/hadoop
    
    [root@master hadoop]# scp -r hdfs-site.xml core-site.xml root@slave1:/opt/app/hadoop-2.5.0/etc/hadoop/
      
    [root@master hadoop]# scp -r hdfs-site.xml core-site.xml root@slave2:/opt/app/hadoop-2.5.0/etc/hadoop/
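
With more than a couple of nodes, the copy is easier to repeat as a loop. A small sketch under the same assumptions as the commands above (same path on every node, root SSH access). The variable names and sync_conf are illustrative, and by default the function only prints the commands.

```shell
#!/bin/sh
# Hypothetical variables mirroring the paths and hosts used above.
CONF_DIR=/opt/app/hadoop-2.5.0/etc/hadoop
HOSTS="slave1 slave2"
FILES="hdfs-site.xml core-site.xml"

# sync_conf [RUNNER]: copy FILES to every host. RUNNER defaults to "echo"
# (dry run, commands are only printed); pass "" to actually run scp.
sync_conf() {
    runner=${1-echo}
    for host in $HOSTS; do
        for f in $FILES; do
            $runner scp "$CONF_DIR/$f" "root@$host:$CONF_DIR/" || return 1
        done
    done
}

sync_conf    # prints the four scp commands without running them
```

Running sync_conf "" performs the real copy and stops at the first scp that fails.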


    5. Start ZooKeeper

    # Start ZooKeeper on every node
    [root@master ~]# /opt/app/zookeeper-3.4.5/bin/zkServer.sh start
    
    [root@slave1 ~]# /opt/app/zookeeper-3.4.5/bin/zkServer.sh start
    
    [root@slave2 ~]# /opt/app/zookeeper-3.4.5/bin/zkServer.sh start
    
    # Check that it is running
    [root@master ~]# xcall jps
    ====== master jps ======
    18824 Jps
    18765 QuorumPeerMain
    ====== slave1 jps ======
    19201 QuorumPeerMain
    19263 Jps
    ====== slave2 jps ======
    13646 QuorumPeerMain
    13702 Jps
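
A running QuorumPeerMain process does not by itself show that the ensemble has formed. zkServer.sh status prints a Mode line for that. The sketch below extracts it. zk_mode is a hypothetical helper, and the sample text is illustrative rather than captured from this cluster.

```shell
#!/bin/sh
# zk_mode: read `zkServer.sh status` output on stdin and print the mode
# (leader, follower or standalone); print "unknown" if no Mode line exists.
zk_mode() {
    awk -F': *' '/^Mode:/ { print $2; ok = 1 } END { if (!ok) print "unknown" }'
}

# Illustrative sample of the status output, not captured from this cluster.
zk_mode <<'EOF'
JMX enabled by default
Using config: /opt/app/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
EOF
```

On each node this would be used as /opt/app/zookeeper-3.4.5/bin/zkServer.sh status 2>&1 | zk_mode. A healthy three-node ensemble should report one leader and two followers.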


    6. Initialize the HA state in ZooKeeper

    # on master
    [root@master hadoop-2.5.0]# bin/hdfs zkfc -formatZK
    
    #
    The result can be checked from slave1 by connecting with the ZooKeeper client:
    [root@slave1 zookeeper-3.4.5]# bin/zkCli.sh
    
    [zk: localhost:2181(CONNECTED) 1] ls /
    [zookeeper]
    
    [zk: localhost:2181(CONNECTED) 2] ls /    # after formatZK: hadoop-ha has been created
    [hadoop-ha, zookeeper]
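
The same check can be scripted instead of done interactively. A minimal sketch: has_child is a hypothetical helper that looks for a name in the bracketed child list that ls prints. It assumes no other output line happens to contain exactly that name.

```shell
#!/bin/sh
# has_child NAME: read zkCli `ls` output on stdin and succeed if the
# bracketed child list (e.g. "[hadoop-ha, zookeeper]") contains NAME.
has_child() {
    sed 's/[][ ]//g' | tr ',' '\n' | grep -qx -- "$1"
}

# Child list as shown in the session above.
if echo "[hadoop-ha, zookeeper]" | has_child hadoop-ha; then
    echo "hadoop-ha znode present"
fi
```

Against a live server the input would come from something like bin/zkCli.sh -server localhost:2181 ls / 2>/dev/null, assuming this zkCli version accepts a command on its command line.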


    7. Start the HDFS services

    # on master
    [root@master hadoop-2.5.0]# sbin/start-dfs.sh
    
    # Check what was started
    [root@master ~]# xcall jps
    ====== master jps ======
    19588 DFSZKFailoverController    # ZKFC monitoring process
    19087 NameNode
    19193 DataNode
    19393 JournalNode
    18765 QuorumPeerMain
    19662 Jps
    ====== slave1 jps ======
    19743 DFSZKFailoverController    # ZKFC monitoring process
    19201 QuorumPeerMain
    19800 Jps
    19613 JournalNode
    19521 DataNode
    19443 NameNode
    ====== slave2 jps ======
    13646 QuorumPeerMain
    13850 DataNode
    14014 Jps
    13942 JournalNode
    
    
    # Check the state of nn1 and nn2
    [root@master hadoop-2.5.0]# bin/hdfs haadmin -getServiceState nn1
    19/04/18 10:34:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    active
    
    [root@master hadoop-2.5.0]# bin/hdfs haadmin -getServiceState nn2
    19/04/18 10:34:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    standby
    
    # nn1 has been elected active automatically and nn2 is standby; the NameNode web UIs show the same.
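
For repeated checks it helps to assert that exactly one NameNode is active. A rough sketch: check_ha is a hypothetical helper that reads "<id> <state>" lines, fed here with the states from the haadmin output above.

```shell
#!/bin/sh
# check_ha: read "<namenode-id> <state>" lines on stdin and succeed only
# if exactly one NameNode is active (any other count suggests a problem,
# such as no election having happened yet).
check_ha() {
    awk '$2 == "active" { n++; id = $1 }
         END { if (n == 1) { print "active: " id; exit 0 }
               print "expected 1 active NameNode, found " (n + 0); exit 1 }'
}

# States as reported in the haadmin output above.
printf 'nn1 active\nnn2 standby\n' | check_ha
```

On the cluster the input lines could be produced with for nn in nn1 nn2; do echo "$nn $(bin/hdfs haadmin -getServiceState $nn 2>/dev/null)"; done, assuming the WARN message goes to stderr as it does with Hadoop's default log4j settings.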


    8. Test automatic failover

    Kill the NameNode that is currently active, then check whether the standby NameNode has automatically become active.
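
The kill step needs the PID of the NameNode process, which can be picked out of the jps listing. A minimal sketch: namenode_pid is a hypothetical helper, shown against the master listing from step 7.

```shell
#!/bin/sh
# namenode_pid: read `jps` output on stdin and print the PID of the
# NameNode process (exact match, so JournalNode and DataNode are skipped).
namenode_pid() {
    awk '$2 == "NameNode" { print $1 }'
}

# jps output of master as listed in step 7.
namenode_pid <<'EOF'
19588 DFSZKFailoverController
19087 NameNode
19193 DataNode
19393 JournalNode
18765 QuorumPeerMain
19662 Jps
EOF
```

On the active node the test would then be kill -9 "$(jps | namenode_pid)", followed a few seconds later by bin/hdfs haadmin -getServiceState nn2 on the other node, which should report active. This assumes the fencing method configured in the previous section works; if nn2 stays in standby, the ZKFC log on that node is the first place to look.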

  • Original article: https://www.cnblogs.com/weiyiming007/p/10728237.html