Hadoop HA configuration + ZooKeeper service registration

    Test environment: 6 machines, CentOS 6.7 x64

    Masters (NameNode / cluster):
    10.10.100.101    namenode1
    10.10.100.105    namenode2

    ResourceManager:
    manager

    DataNodes (DataNode, NodeManager, JournalNode, QuorumPeerMain):
    10.10.100.102    datanode1 + zk1
    10.10.100.103    datanode2 + zk2
    10.10.100.104    datanode3 + zk3
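
    All of the configuration below refers to these hostnames, so each node needs matching name resolution; a minimal /etc/hosts sketch based on the plan above (add the manager node's IP as well):

    10.10.100.101    namenode1
    10.10.100.105    namenode2
    10.10.100.102    datanode1
    10.10.100.103    datanode2
    10.10.100.104    datanode3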

    Node service deployment:

    Create the node services:
    in the zookeeper/data/ directory on each ZooKeeper node, create a myid file containing that node's id, as shown below:
    1 on datanode1
    2 on datanode2
    3 on datanode3
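
    For example, the myid files can be created with echo; the data directory path here is only an assumed example and must match the dataDir setting in zoo.cfg:

    echo 1 > /opt/zookeeper/data/myid    # on datanode1
    echo 2 > /opt/zookeeper/data/myid    # on datanode2
    echo 3 > /opt/zookeeper/data/myid    # on datanode3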

    Configuration files:

    Configure core-site.xml

    core-site.xml needs the nameservice, the Hadoop file storage location, and the ZooKeeper cluster that handles switching between the NameNodes. After editing, the content is as follows:

    <configuration>
    
    <!-- Set the HDFS nameservice to master -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://master</value>
        </property>
    
    <!-- Hadoop temporary directory -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/opt/hadoop/tmp</value>
        </property>
    
    <!-- ZooKeeper quorum addresses -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>datanode1:2181,datanode2:2181,datanode3:2181</value>
        </property>
    
        <property>   
            <name>ha.zookeeper.session-timeout.ms</name>   
            <value>300000</value>   
        </property>
    
    </configuration>
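
    The ha.zookeeper.quorum setting assumes the ZooKeeper ensemble on datanode1, datanode2 and datanode3 is already running; a quick way to confirm is to check each node (the install path below is an assumed example):

    /opt/zookeeper-3.4.6/bin/zkServer.sh status    # should report "leader" on one node and "follower" on the other two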

    Configure hdfs-site.xml

    hdfs-site.xml mainly configures NameNode high availability:

    <configuration>
    <!-- Set the HDFS nameservice to master; must match core-site.xml -->
        <property>
            <name>dfs.nameservices</name>
            <value>master</value>
        </property>
    <!-- The master nameservice contains two NameNodes: namenode1 and namenode2 -->
        <property>
            <name>dfs.ha.namenodes.master</name>
            <value>namenode1,namenode2</value>
        </property>
    <!-- RPC address of namenode1 -->
        <property>
            <name>dfs.namenode.rpc-address.master.namenode1</name>
            <value>namenode1:9000</value>
        </property>
    <!-- HTTP address of namenode1 -->
        <property>
            <name>dfs.namenode.http-address.master.namenode1</name>
            <value>namenode1:50070</value>
        </property>
    <!-- RPC address of namenode2 -->
        <property>
            <name>dfs.namenode.rpc-address.master.namenode2</name>
            <value>namenode2:9000</value>
        </property>
    <!-- HTTP address of namenode2 -->
        <property>
            <name>dfs.namenode.http-address.master.namenode2</name>
            <value>namenode2:50070</value>
        </property>
    <!-- Where the NameNode metadata (shared edits) is stored on the JournalNodes -->
        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://datanode1:8485;datanode2:8485;datanode3:8485/master</value>
        </property>
    <!-- Local directory where the JournalNodes store their data (working directory) -->
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/opt/hadoop/journal</value>
        </property>
    <!-- Enable automatic NameNode failover -->
        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>
    <!-- Failover proxy provider used by clients to locate the active NameNode -->
        <property>
            <name>dfs.client.failover.proxy.provider.master</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
    <!-- Use sshfence as the fencing method -->
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>sshfence</value>
        </property>
    <!-- sshfence requires passwordless SSH between the NameNodes -->
    
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/root/.ssh/id_rsa</value>
        </property>
    
    
        <property>   
            <name>dfs.namenode.name.dir</name>     
            <value>/opt/hadoop/hdfs/name</value>   
        </property>   
    
        <property>   
            <name>dfs.datanode.data.dir</name>   
            <value>/opt/hadoop/hdfs/data</value>   
        </property> 
    
        <property>   
            <name>dfs.replication</name>   
            <value>3</value>   
        </property>   
    
        <property>   
            <name>dfs.webhdfs.enabled</name>   
            <value>true</value>   
        </property>
    
    
    </configuration>
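
    Two practical notes on the settings above: sshfence requires passwordless SSH between the two NameNodes using the key named in dfs.ha.fencing.ssh.private-key-files, and the local directories referenced by dfs.journalnode.edits.dir, dfs.namenode.name.dir and dfs.datanode.data.dir must already exist on the relevant nodes. A minimal sketch, assuming the daemons run as root:

    # On namenode1 (repeat in the opposite direction from namenode2)
    ssh-keygen -t rsa -f /root/.ssh/id_rsa -N ""
    ssh-copy-id root@namenode2

    # On the NameNodes, DataNodes and JournalNode hosts, create the local data directories they use
    mkdir -p /opt/hadoop/journal /opt/hadoop/hdfs/name /opt/hadoop/hdfs/data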

    Configure mapred-site.xml

    By default there is no mapred-site.xml; the configuration directory ships with a template instead, which needs to be renamed to mapred-site.xml:
    mv mapred-site.xml.template mapred-site.xml
    The configuration below simply declares that MapReduce runs on top of YARN.

    <configuration>
    <!-- Run the MapReduce framework on YARN -->
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    
        <property> 
            <name>mapreduce.job.maps</name> 
            <value>12</value> 
        </property> 
    
        <property> 
            <name>mapreduce.job.reduces</name> 
            <value>12</value> 
        </property> 
    
    </configuration>
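
    Once the whole cluster is up, the bundled example job is a quick way to confirm that MapReduce jobs really run on YARN; the jar path below assumes the stock Hadoop 2.7.1 layout and is run from the Hadoop installation directory:

    # The job should appear in the ResourceManager web UI (port 8088) while it runs
    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 4 100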

    Configure yarn-site.xml

    In the cluster plan, YARN's ResourceManager runs on the manager node; the configuration is as follows:

    <configuration>
    <!-- ResourceManager hostname -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>manager</value>
        </property>
    
    <!-- Auxiliary service the NodeManager loads at startup: the MapReduce shuffle server -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    
    
    <!-- The following ResourceManager HA settings are kept for reference but left commented out (not used in this deployment):
        
      <property>   
            <name>yarn.log-aggregation-enable</name>   
            <value>true</value>   
        </property>   
    
        <property>   
            <name>yarn.log-aggregation.retain-seconds</name>   
            <value>259200</value>   
        </property>   
    
        <property>   
            <name>yarn.resourcemanager.zk-address</name>   
            <value>datanode1:2181,datanode3:2181,datanode2:2181</value>   
        </property>   
    
        <property>   
            <name>yarn.resourcemanager.cluster-id</name>   
            <value>cluster-yarn</value>   
        </property>   
    
        <property>   
            <name>yarn.resourcemanager.ha.enabled</name>   
            <value>true</value>   
        </property>   
    
        <property>   
            <name>yarn.resourcemanager.ha.rm-ids</name>   
            <value>namenode1,namenode2</value>   
        </property>   
    
        <property>   
            <name>yarn.resourcemanager.hostname.namenode1</name>   
            <value>namenode1</value>   
        </property>       
    
        <property>   
            <name>yarn.resourcemanager.hostname.namenode2</name>   
            <value>namenode2</value>   
        </property>   
    
    <property>
           <name>yarn.resourcemanager.scheduler.address.namenode1</name>
            <value>namenode1:8030</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.scheduler.address.namenode2</name>
           <value>namenode2:8030</value>
        </property>
    
        <property>
            <name>yarn.resourcemanager.resource-tracker.address.namenode1</name>
            <value>namenode1:8031</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.resource-tracker.address.namenode2</name>
           <value>namenode2:8031</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.address.namenode1</name>
           <value>namenode1:8032</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.address.namenode2</name>
           <value>namenode2:8032</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.admin.address.namenode1</name>
           <value>namenode1:8033</value>
        </property>
    
        <property>
            <name>yarn.resourcemanager.admin.address.namenode2</name>
            <value>namenode2:8033</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.webapp.address.namenode1</name>
           <value>namenode1:8088</value>
        </property>
    
        <property>
           <name>yarn.resourcemanager.webapp.address.namenode2</name>
           <value>namenode2:8088</value>
        </property> 
    
        <property> 
            <name>yarn.resourcemanager.ha.automatic-failover.enabled</name> 
            <value>true</value> 
        </property> 
    
        <property> 
            <name>yarn.resourcemanager.ha.automatic-failover.embedded</name> 
            <value>true</value> 
        </property> 
    
        <property> 
            <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name> 
            <value>/yarn-leader-election</value> 
        </property> 
    
        -->
    
        <property>   
            <name>yarn.resourcemanager.recovery.enabled</name>   
            <value>true</value>   
        </property>   
    
        <property> 
            <name>yarn.resourcemanager.store.class</name> 
            <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value> 
        </property> 
    
        <property>   
            <name>yarn.nodemanager.aux-services</name>   
            <value>mapreduce_shuffle</value>   
        </property>   
    
        <property>   
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>   
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>   
        </property>       
    
    </configuration>
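
    Once the ResourceManager and the NodeManagers have been started later on, a quick sanity check of this configuration is to list the registered nodes from the manager host:

    yarn node -list    # should show datanode1, datanode2 and datanode3 in RUNNING state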

    Start the JournalNodes from the Hadoop master:

    [root@hadoop01 hadoop-2.7.1]# sbin/hadoop-daemons.sh start journalnode
    The JournalNode hosts are datanode1, datanode2 and datanode3; after this command a JournalNode process should appear on each of the three machines:
    [root@hadoop04 zookeeper-3.4.6]# jps
    1532 JournalNode
    1796 Jps
    1470 QuorumPeerMain
    Format HDFS on the first NameNode (namenode1):
    
    hadoop namenode -format
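
    In an HA pair only the first NameNode is formatted; the second one is normally synchronized from it after the first NameNode has been started, for example:

    # On namenode2, once namenode1 is formatted and running
    hdfs namenode -bootstrapStandby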
    
    Format ZooKeeper for HA (run on a NameNode):

    hdfs zkfc -formatZK
    -- this creates (formats) the cluster HA parent znode in ZooKeeper
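
    To confirm the znode was created, you can check from any ZooKeeper node with the bundled client; the install path below is an assumed example, and /hadoop-ha is the default parent znode used by the ZKFC:

    /opt/zookeeper-3.4.6/bin/zkCli.sh -server datanode1:2181 ls /hadoop-ha
    # the output should list the nameservice, e.g. [master]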