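  • [RM HA 1] Verifying ResourceManager HA in Cloudera CDH5

    Introduction: The latest Cloudera CDH5.0.0 beta release already supports ResourceManager (RM) HA, so I ran a simple verification of the feature. Later posts will analyze how this HA mechanism works and how it differs from the community RM HA.

    Cluster deployment and functional verification of RM failover

    1. Hardware preparation

      Four machines, bj1, bj3, bj4, and bj5, each with the required environment in place (mutual SSH access and a Java environment).

      Roles: bj1 is rm1, bj3 is rm2, and bj4 and bj5 are the slaves.

      ZooKeeper is deployed on bj1.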
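The prerequisites listed here (SSH access between the nodes and a Java environment on each) can be checked before going any further. The following is only a minimal sketch: the helper names `check_host` and `check_all` and the overridable `SSH` variable are assumptions made for illustration, and it assumes key-based (non-interactive) SSH login.

```shell
# Hypothetical preflight helper: confirm non-interactive SSH and a Java runtime on each node.
# SSH can be overridden for a dry run; by default the real ssh client is used.
SSH="${SSH:-ssh -o BatchMode=yes -o ConnectTimeout=5}"

check_host() {
    # $1 = hostname; succeeds only if ssh logs in without prompting and java runs there
    if $SSH "$1" 'java -version' >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: FAILED (ssh or java not available)"
        return 1
    fi
}

check_all() {
    # Check every host given as an argument; return non-zero if any of them failed
    rc=0
    for h in "$@"; do
        check_host "$h" || rc=1
    done
    return $rc
}

# Usage from any node of this cluster:
#   check_all bj1 bj3 bj4 bj5
```

Running `check_all` from each of the four machines in turn covers the "mutual" part of the SSH requirement.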

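    2. Hadoop version: download the matching CDH5 release, hadoop-2.2.0-cdh5.0.0-beta-1.tar.gz (it contains both the deployable package and the source code), from http://archive.cloudera.com/cdh5/cdh/5/ and deploy it to every node.
    3. Install ZooKeeper on bj1: download the latest ZooKeeper, unpack it, create the conf/zoo.cfg file, then start the server.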

      [yuling.sh@v125050024 ~]$ cd zookeeper-3.4.3/

      [yuling.sh@v125050024 zookeeper-3.4.3]$ cp conf/zoo_sample.cfg conf/zoo.cfg

      [yuling.sh@v125050024 zookeeper-3.4.3]$ bin/zkServer.sh start
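Before moving on it is worth confirming that ZooKeeper really came up, since the RM state store and leader election depend on it. A small sketch using `zkServer.sh status`, which on a single-node setup like this one should report `Mode: standalone`; the helper name `zk_mode` is an assumption for illustration.

```shell
# Sketch: extract the server mode from `zkServer.sh status` output read on stdin.
# zk_mode is a hypothetical helper name, not part of ZooKeeper.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Run from the zookeeper-3.4.3 directory on bj1 (skipped when zkServer.sh is not present):
if [ -x bin/zkServer.sh ]; then
    bin/zkServer.sh status 2>&1 | zk_mode
fi
```

If nothing is printed, the server is probably not running; conf/zoo.cfg and the ZooKeeper log are the first places to look.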

    4. Configure Hadoop on every node (see the configuration reference in Appendix 1).

    5. Start HDFS first

      bin/hadoop namenode -format

      sbin/start-dfs.sh

      Check the NameNode in a browser: http://bj1:50070/dfshealth.jsp
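Besides the web page, HDFS readiness can be checked from the command line. The sketch below relies on `hdfs dfsadmin -safemode get`, which prints `Safe mode is OFF` once the NameNode accepts writes. The helper name `hdfs_ready` and the overridable `HDFS` variable are assumptions for illustration.

```shell
# Sketch: report whether the NameNode has left safe mode.
# HDFS can be overridden to point at a different hdfs launcher.
HDFS="${HDFS:-bin/hdfs}"

hdfs_ready() {
    # Ask the NameNode for its safe-mode state; treat any failure as "no response"
    out=$($HDFS dfsadmin -safemode get 2>/dev/null) || out=""
    case "$out" in
        *"Safe mode is OFF"*) echo "HDFS is out of safe mode"; return 0 ;;
        *) echo "HDFS not ready: ${out:-no response}"; return 1 ;;
    esac
}

# Usage from the Hadoop install directory on bj1:
#   hdfs_ready
```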

    6. Start YARN

      Start the ResourceManager on rm1

      sbin/yarn-daemon.sh start resourcemanager

      Start the ResourceManager on rm2

      sbin/yarn-daemon.sh start resourcemanager

      Start the NodeManagers on the slaves

          sbin/yarn-daemons.sh start nodemanager

      Open the web pages of rm1 and rm2: http://bj1:23188/cluster and http://bj3:23188/cluster. Only the active RM serves its web page; the standby RM's page cannot be viewed.

      Note: if yarn.resourcemanager.ha.automatic-failover.enabled is set to false, one of the RMs must be switched to active manually; otherwise both RMs stay in standby.
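For the manual case described in the note, the switch can be done from the command line. This is a sketch under one important assumption: that the `yarn` CLI in this CDH5 beta build offers the same HA admin subcommands as Apache Hadoop (`rmadmin -transitionToActive` and `rmadmin -getServiceState`). I have not verified this against the beta, so treat it as illustrative. The helper name `activate_rm` and the `YARN` variable are hypothetical; `rm1` and `rm2` are the ids from yarn.resourcemanager.ha.rm-ids.

```shell
# Sketch: promote one RM to active by hand and print its resulting state.
# Assumes rmadmin supports -transitionToActive / -getServiceState in this build.
YARN="${YARN:-bin/yarn}"

activate_rm() {
    # $1 = RM id to promote (e.g. rm1)
    $YARN rmadmin -transitionToActive "$1" &&
    $YARN rmadmin -getServiceState "$1"
}

# Usage from the Hadoop install directory:
#   activate_rm rm1
```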

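    7. Submit a sleep job as a test

      bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-cdh5.0.0-beta-1.jar sleep -m 1000

      The job's progress can then be followed on the web page.

    8. While the job is running, kill the active RM process. Opening the web page of the former standby RM then shows that the job submitted above keeps running.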

      [yuling.sh@v125050024 hadoop-2.2.0-cdh5.0.0-beta-1]$ jps

      31333 ResourceManager

      31671 Jps

      29502 NameNode

      25375 QuorumPeerMain

      [yuling.sh@v125050024 hadoop-2.2.0-cdh5.0.0-beta-1]$ kill 31333
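The takeover can also be watched from a shell instead of the browser. The sketch below polls the surviving RM until it reports `active`. As before, it assumes the `yarn rmadmin -getServiceState` subcommand is available in this build; the helper name `wait_for_active` and the `YARN` variable are hypothetical.

```shell
# Sketch: after killing the active RM, wait until another RM reports "active".
YARN="${YARN:-bin/yarn}"

wait_for_active() {
    # $1 = RM id expected to take over, $2 = max attempts (default 30), one second apart
    id="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # A failed query (RM still starting, connection refused) counts as "not active yet"
        state=$($YARN rmadmin -getServiceState "$id" 2>/dev/null) || state=""
        if [ "$state" = "active" ]; then
            echo "$id is active"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$id did not become active" >&2
    return 1
}

# Usage, assuming the RM killed above was rm1 on bj1:
#   wait_for_active rm2
```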

    Conclusion: the steps above give a simple verification of Cloudera's RM automatic failover.

    Appendix 1

    Configuration reference

    1. etc/hadoop/slaves

      bj4

      bj5

    2. etc/hadoop/hdfs-site.xml

    <property>

    <name>fs.default.name</name>

    <value>hdfs://bj1:9000</value>

    </property>

    3. etc/hadoop/mapred-site.xml

      <property>

      <name>mapreduce.framework.name</name>

      <value>yarn</value>

      </property>

    4. etc/hadoop/yarn-site.xml is configured as follows

    Apart from yarn.resourcemanager.ha.id, which must be adjusted on each RM, the configuration can be identical on every node.

    <!-- Resource Manager Configs -->

    <property>

    <name>yarn.resourcemanager.connect.retry-interval.ms</name>

    <value>2000</value>

    </property>

    <property>

    <name>yarn.resourcemanager.ha.enabled</name>

    <value>true</value>

    </property>

    <property>

    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>

    <value>true</value>

    </property>

    <property>

    <name>yarn.resourcemanager.ha.rm-ids</name>

    <value>rm1,rm2</value>

    </property>

    <property>

    <name>yarn.resourcemanager.ha.id</name>

    <value>rm2</value> <!-- Note: set to rm1 on rm1 and to rm2 on rm2 -->

    </property>

    <property>

    <name>yarn.resourcemanager.store.class</name>

    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

    </property>

    <property>

    <name>yarn.resourcemanager.zk.state-store.address</name>

    <value>bj1:2181</value>

    </property>

    <property>

    <name>ha.zookeeper.quorum</name>

    <value>bj1:2181</value>

    </property>

    <property>

    <name>yarn.resourcemanager.recovery.enabled</name>

    <value>true</value>

    </property>

    <property>

    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>

    <value>5000</value>

    </property>

    <!-- RM1 configs -->

    <property>

    <name>yarn.resourcemanager.address.rm1</name>

    <value>bj1:23140</value>

    </property>

    <property>

    <name>yarn.resourcemanager.scheduler.address.rm1</name>

    <value>bj1:23130</value>

    </property>

    <property>

    <name>yarn.resourcemanager.webapp.address.rm1</name>

    <value>bj1:23188</value>

    </property>

    <property>

    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>

    <value>bj1:23125</value>

    </property>

    <property>

    <name>yarn.resourcemanager.admin.address.rm1</name>

    <value>bj1:23141</value>

    </property>

    <property>

    <name>yarn.resourcemanager.ha.admin.address.rm1</name>

    <value>bj1:23142</value>

    </property>

    <!-- RM2 configs -->

    <property>

    <name>yarn.resourcemanager.address.rm2</name>

    <value>bj3:23140</value>

    </property>

    <property>

    <name>yarn.resourcemanager.scheduler.address.rm2</name>

    <value>bj3:23130</value>

    </property>

    <property>

    <name>yarn.resourcemanager.webapp.address.rm2</name>

    <value>bj3:23188</value>

    </property>

    <property>

    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>

    <value>bj3:23125</value>

    </property>

    <property>

    <name>yarn.resourcemanager.admin.address.rm2</name>

    <value>bj3:23141</value>

    </property>

    <property>

    <name>yarn.resourcemanager.ha.admin.address.rm2</name>

    <value>bj3:23142</value>

    </property>

    <!-- Node Manager Configs -->

    <property>

    <description>Address where the localizer IPC is.</description>

    <name>yarn.nodemanager.localizer.address</name>

    <value>0.0.0.0:23344</value>

    </property>

    <property>

    <description>NM Webapp address.</description>

    <name>yarn.nodemanager.webapp.address</name>

    <value>0.0.0.0:23999</value>

    </property>

    <property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

    </property>

    <property>

    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

    <value>org.apache.hadoop.mapred.ShuffleHandler</value>

    </property>

    <property>

    <name>yarn.nodemanager.local-dirs</name>

    <value>/tmp/pseudo-dist/yarn/local</value>

    </property>

    <property>

    <name>yarn.nodemanager.log-dirs</name>

    <value>/tmp/pseudo-dist/yarn/log</value>

    </property>

    <property>

    <name>mapreduce.shuffle.port</name>

    <value>23080</value>

    </property>
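Since yarn.resourcemanager.ha.id is the only value that differs between the two RM hosts, one way to manage it is to distribute a single yarn-site.xml and patch that one value on each host. The following is a minimal sketch, assuming the file keeps each element on its own line as in the listing above; the helper name `set_ha_id` is hypothetical.

```shell
# Sketch: set yarn.resourcemanager.ha.id in a yarn-site.xml laid out one element per line.
set_ha_id() {
    # $1 = path to yarn-site.xml, $2 = RM id for this host (rm1 or rm2)
    file="$1"
    id="$2"
    # Find the ha.id <name> line, then rewrite the first <value> line that follows it.
    awk -v id="$id" '
        /<name>yarn\.resourcemanager\.ha\.id<\/name>/ { found = 1; print; next }
        found && /<value>/ { sub(/<value>[^<]*<\/value>/, "<value>" id "</value>"); found = 0 }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage:
#   on bj1:  set_ha_id etc/hadoop/yarn-site.xml rm1
#   on bj3:  set_ha_id etc/hadoop/yarn-site.xml rm2
```

Only the `<value>` element is rewritten, so a trailing comment on the same line is preserved, and yarn.resourcemanager.ha.rm-ids is left untouched.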

  • Original article: https://www.cnblogs.com/shenh062326/p/3529267.html