  • hadoop-2.3.0-cdh5.1.0: fully distributed cluster setup with HA

    I. Before you install
    Operating system: CentOS 6.5, 64-bit
    Environment: JDK 1.7.0_45 or later; this guide uses jdk-7u55-linux-x64.tar.gz
    master01: 10.10.2.57 NameNode
    master02: 10.10.2.58 NameNode
    slave01: 10.10.2.173 DataNode
    slave02: 10.10.2.59 DataNode
    slave03: 10.10.2.60 DataNode
    Note: Hadoop 2.0 and later is run on JDK 1.7 here. Uninstall the JDK that ships with Linux and install a fresh one.
    Download: http://www.oracle.com/technetwork/java/javase/downloads/index.html
    Software versions: hadoop-2.3.0-cdh5.1.0.tar.gz, zookeeper-3.4.5-cdh5.1.0.tar.gz
    Download: http://archive.cloudera.com/cdh5/cdh/5/
    Installation steps:
    II. Installing the JDK
    1. Check whether a JDK is already installed
    rpm -qa | grep jdk
    java-1.6.0-openjdk-1.6.0.0-1.45.1.11.1.el6.i686 
    2. Uninstall the bundled JDK
    yum -y remove java-1.6.0-openjdk-1.6.0.0-1.45.1.11.1.el6.i686
    3. Install jdk-7u55-linux-x64.tar.gz
    Create a java directory under /usr, then run tar -zxvf jdk-7u55-linux-x64.tar.gz inside it
    to extract the JDK into the java directory
    [root@master01 java]# ls
    jdk1.7.0_55
    III. Configuring environment variables
    Run vi /etc/profile and add the export lines shown below
    # /etc/profile
    # System wide environment and startup programs, for login setup
    # Functions and aliases go in /etc/bashrc
    export JAVA_HOME=/usr/java/jdk1.7.0_55
    export JRE_HOME=/usr/java/jdk1.7.0_55/jre
    export CLASSPATH=/usr/java/jdk1.7.0_55/lib
    export PATH=$JAVA_HOME/bin:$PATH
    Save the file, then run source /etc/profile to reload the environment variables
    Run java -version
    [root@master01 java]# java -version
    java version "1.7.0_55"
    Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
    The JDK is configured correctly.
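The same check can be scripted when the setup is repeated on all five machines. The following is a minimal sketch and not part of the original procedure: the helper name `java_release` is hypothetical, and it only parses a version line in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract "major.minor" (e.g. 1.7) from the first
# line of `java -version` output, passed in as the first argument.
java_release() {
    printf '%s\n' "$1" | sed -n 's/^java version "\([0-9]*\.[0-9]*\).*/\1/p'
}

# Example with the version line shown above.
line='java version "1.7.0_55"'
if [ "$(java_release "$line")" = "1.7" ]; then
    echo "JDK 1.7 detected"
else
    echo "unexpected JDK: $line"
fi
```

On a live machine the line can be captured with `java -version 2>&1 | head -n 1`, since `java -version` writes to standard error.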
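    IV. System configuration
    Prepare 5 machines in advance and configure their IP addresses
    Turn off the firewall
    chkconfig iptables off (permanent: the firewall no longer starts at boot; run service iptables stop to stop it immediately)
    Configure the hostname and the hosts file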
    [root@master01 java]# vi /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    10.10.2.57 master01
    10.10.2.58 master02
    10.10.2.173 slave01
    10.10.2.59 slave02
    10.10.2.60 slave03
    Set the hostname on each machine to match its IP address as listed above
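Because the same five entries have to be present on every machine, it can help to add only the ones that are missing. The sketch below is an illustration under assumptions: the helper name `missing_host_entries` and the `CLUSTER_HOSTS` variable are hypothetical, and the match is a simple whole-word search for the hostname.

```shell
#!/bin/sh
# Host table from this article: "IP hostname" pairs, one per line.
CLUSTER_HOSTS='10.10.2.57 master01
10.10.2.58 master02
10.10.2.173 slave01
10.10.2.59 slave02
10.10.2.60 slave03'

# Hypothetical helper: print the entries from CLUSTER_HOSTS whose hostname
# does not yet appear in the given hosts file (default: /etc/hosts).
missing_host_entries() {
    hosts_file=${1:-/etc/hosts}
    printf '%s\n' "$CLUSTER_HOSTS" | while read -r ip name; do
        # -w matches the hostname as a whole word.
        grep -qw -- "$name" "$hosts_file" 2>/dev/null || echo "$ip $name"
    done
}

# Dry run: show what would be appended. To apply it (as root):
#   missing_host_entries >> /etc/hosts
missing_host_entries
```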
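    3. Passwordless SSH
    Hadoop has to manage its daemons remotely: the NameNode connects to each DataNode over SSH (Secure Shell) to start or stop its processes, so those connections must work without a password. Configure passwordless SSH from the NameNode hosts to the DataNode hosts, and likewise from the DataNode hosts back to the NameNode hosts.
    On every machine:
    Open /etc/ssh/sshd_config with vi and set
    RSAAuthentication yes      # enable RSA authentication
    PubkeyAuthentication yes   # enable public/private key authentication
    On master01, run ssh-keygen -t rsa -P "" and press Enter without typing a passphrase
    The keys are written to /root/.ssh by default.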
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    [root@master01 .ssh]# ls
    authorized_keys  id_rsa  id_rsa.pub  known_hosts
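    Do the same on slave01. Then append master01's /root/.ssh/id_rsa.pub to authorized_keys in the same directory on slave01, so that slave01 holds master01's public key, and run ssh slave01 from master01 to confirm that it connects without a password. Next append slave01's id_rsa.pub to authorized_keys on master01, and run ssh master01 from slave01 to confirm that it connects to master01 directly.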
    [root@master01 ~]# ssh slave01
    Last login: Tue Aug 19 14:28:15 2014 from master01
    [root@slave01 ~]# 
    Master01-master02
    Master01-slave01
    Master01-slave02
    Master01-slave03
    Master02-slave01
    Master02-slave02
    Master02-slave03
    Repeat the same steps for each of the pairs above.
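Copying keys by hand for seven host pairs is error-prone, so the key exchange can be generated as a list of commands. This is a minimal sketch, assuming `ssh-copy-id` is available on the hosts (it is an assumption that it is installed; the original steps use `cat` and manual copying). The helper name `copy_id_commands` and the `TARGETS` variable are hypothetical.

```shell
#!/bin/sh
# Hosts from this article; run the helper on master01 and again on master02.
TARGETS="master01 master02 slave01 slave02 slave03"

# Hypothetical helper: print one ssh-copy-id command per target host,
# skipping the host given as the first argument (the local machine).
copy_id_commands() {
    self=$1
    for host in $TARGETS; do
        [ "$host" = "$self" ] && continue
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host"
    done
}

# Dry run: print the commands for master01. Pipe the output to `sh` to
# execute them; each one asks for the remote root password once.
copy_id_commands master01
```

Printing the commands first keeps the step reviewable; nothing touches a remote host until the output is piped to a shell.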
      
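    V. Installing Hadoop
    Create the directory /usr/local/cloud. Inside it, create a data folder for data and log files, and place the unpacked Hadoop and ZooKeeper distributions alongside it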
    [root@slave01 cloud]# ls
    data  hadoop  tar  zookeeper
    5.1 Configure hadoop-env.sh
    Change to the /usr/local/cloud/hadoop/etc/hadoop directory
    Edit hadoop-env.sh (vi hadoop-env.sh), which sets up the Hadoop runtime environment
    export JAVA_HOME=/usr/java/jdk1.7.0_55
    5.2 Configure core-site.xml
    <!-- hadoop.tmp.dir: many Hadoop paths are derived from it. Do not delete this directory on a NameNode, or the NameNode will have to be reformatted -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/cloud/data/hadoop/tmp</value>
    </property>
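    <!-- URL of the cluster's NameNode. With HA this is the logical nameservice name rather than a single host. Every DataNode needs to know the NameNode address before its data can be used -->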
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://zzg</value>
    </property>
    <!-- Addresses and ports of the ZooKeeper ensemble; keep an odd number of servers, at least 3 -->
     <property>
        <name>ha.zookeeper.quorum</name>
        <value>master01:2181,slave01:2181,slave02:2181</value>
    </property>
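When several files have to agree on the same names (for example the `zzg` nameservice), it is handy to read a value back out of a config file. The following is a minimal sketch under assumptions: `get_prop` is a hypothetical helper, it expects `<name>` and `<value>` on separate lines as in the snippets here, and it is not a general XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. Usage: get_prop FILE PROPERTY
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$1"
}

# Example: write a small core-site.xml to a temp file and read it back.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://zzg</value>
</property>
</configuration>
EOF
get_prop "$conf" fs.defaultFS    # prints hdfs://zzg
rm -f "$conf"
```

On a node where Hadoop is already installed, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.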
      
    (2) Configure hdfs-site.xml
    <!-- Directory where the NameNode stores its data; it applies only to the NameNode and holds the file system metadata -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/local/cloud/data/hadoop/dfs/nn</value>
    </property>
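    <!-- Local path where a DataNode stores its data. It does not have to be identical on every machine, but keeping it the same makes administration easier -->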
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/local/cloud/data/hadoop/dfs/dn</value>
    </property>
    <!-- Number of replicas kept for each file; the default is 3 -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Set dfs.webhdfs.enabled to true; otherwise some operations are unavailable, such as the WebHDFS LISTSTATUS call -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
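    <!-- Optional: disable permission checking to avoid some unnecessary trouble (dfs.permissions is the older name of dfs.permissions.enabled) -->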
    <property>
         <name>dfs.permissions</name>
         <value>false</value>
    </property>
    <!-- Optional: disable permission checking to avoid some unnecessary trouble -->
    <property>
         <name>dfs.permissions.enabled</name>
         <value>false</value>
    </property>
    <!-- HA configuration -->
    <!-- Logical name of the cluster (the nameservice ID) -->
    <property>
        <name>dfs.nameservices</name>
        <value>zzg</value>
    </property>
    <!-- Logical IDs of the NameNodes that belong to nameservice zzg -->
    <property>
        <name>dfs.ha.namenodes.zzg</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of each logical NameNode; put simply, RPC carries the serialized calls used when uploading and reading files -->
    <property>
        <name>dfs.namenode.rpc-address.zzg.nn1</name>
        <value>master01:9000</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.zzg.nn2</name>
        <value>master02:9000</value>
    </property>
    <!-- Port of the NameNode web UI -->
    <property>
        <name>dfs.namenode.http-address.zzg.nn1</name>
        <value>master01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.zzg.nn2</name>
        <value>master02:50070</value>
    </property>
    <!-- Service RPC address that DataNodes and other HDFS services use to talk to the NameNode -->
    <property>
        <name>dfs.namenode.servicerpc-address.zzg.nn1</name>
        <value>master01:53310</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.zzg.nn2</name>
        <value>master02:53310</value>
    </property>
    <!-- JournalNode quorum that holds the shared edit log -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master01:8485;slave01:8485;slave02:8485/zzg</value>
    </property>
    <!-- Local directory where each JournalNode stores the edits shared between the NameNodes -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/local/cloud/data/hadoop/ha/journal</value>
    </property>
    <!-- Failover proxy provider class -->
    <property>
        <name>dfs.client.failover.proxy.provider.zzg</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Enable automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
            <name>ha.zookeeper.quorum</name>
            <value>master01:2181,slave01:2181,slave02:2181</value>
    </property>
    <!-- Use SSH fencing during failover -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- Location of the private key used for SSH fencing -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
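The value of `dfs.namenode.shared.edits.dir` has a fixed shape: the JournalNode `host:port` pairs separated by semicolons, followed by the nameservice ID as the journal name. The sketch below only illustrates that format; `qjournal_uri` is a hypothetical helper, and port 8485 is taken from the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: build the dfs.namenode.shared.edits.dir value from a
# nameservice ID followed by the JournalNode hostnames.
qjournal_uri() {
    ns=$1; shift
    uri=""
    for host in "$@"; do
        # Prefix a semicolon only when uri already holds an earlier host.
        uri="${uri:+$uri;}$host:8485"
    done
    echo "qjournal://$uri/$ns"
}

qjournal_uri zzg master01 slave01 slave02
# prints qjournal://master01:8485;slave01:8485;slave02:8485/zzg
```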
    5.3 Configure mapred-site.xml
    <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
    </property>
    5.4 Configure YARN HA
    Set the Java environment in yarn-env.sh
    # some Java parameters
      export JAVA_HOME=/usr/java/jdk1.7.0_55
    5.5 Configure yarn-site.xml
            <!-- Interval between attempts to reconnect to the RM after losing contact (ms) -->
            <property>
                    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
                    <value>2000</value>
            </property>
            <!-- Enable ResourceManager HA; the default is false -->
             <property>
                    <name>yarn.resourcemanager.ha.enabled</name>
                    <value>true</value>
            </property>
            <!-- Enable automatic failover -->
            <property>
                    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
                    <value>true</value>
            </property>
            <!-- Logical IDs of the ResourceManagers -->
            <property>
                    <name>yarn.resourcemanager.ha.rm-ids</name>
                    <value>rm1,rm2</value>
            </property>
            <!-- Set this to rm1 on master01 and to rm2 on master02 -->
            <property>
                    <name>yarn.resourcemanager.ha.id</name>
                    <value>rm1</value>
                   <description>If we want to launch more than one RM in single node, we need this configuration</description>
             </property>
            <!-- Enable automatic recovery -->
             <property>
                    <name>yarn.resourcemanager.recovery.enabled</name>
                     <value>true</value>
            </property>
            <!-- ZooKeeper connection address -->
            <property>
                    <name>yarn.resourcemanager.zk-state-store.address</name>
                    <value>localhost:2181</value>
            </property>
      
            <property>
                    <name>yarn.resourcemanager.store.class</name>
                    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.zk-address</name>
                    <value>localhost:2181</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.cluster-id</name>
                    <value>yarn-cluster</value>
            </property>
            <!-- How long to wait before reconnecting to the scheduler after losing contact (ms) -->
             <property>
                    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
                    <value>5000</value>
            </property>
            <!-- Addresses for rm1 -->
            <property>
                    <name>yarn.resourcemanager.address.rm1</name>
                    <value>master01:23140</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.scheduler.address.rm1</name>
                    <value>master01:23130</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.webapp.address.rm1</name>
                    <value>master01:23188</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
                    <value>master01:23125</value>
            </property>
             <property>
                    <name>yarn.resourcemanager.admin.address.rm1</name>
                    <value>master01:23141</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.ha.admin.address.rm1</name>
                    <value>master01:23142</value>
            </property>
            <!-- Addresses for rm2 -->
             <property>
                    <name>yarn.resourcemanager.address.rm2</name>
                    <value>master02:23140</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.scheduler.address.rm2</name>
                    <value>master02:23130</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.webapp.address.rm2</name>
                    <value>master02:23188</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
                    <value>master02:23125</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.admin.address.rm2</name>
                    <value>master02:23141</value>
            </property>
            <property>
                    <name>yarn.resourcemanager.ha.admin.address.rm2</name>
                    <value>master02:23142</value>
            </property>
            <!-- NodeManager configuration -->
            <property>
                    <description>Address where the localizer IPC is.</description>
                    <name>yarn.nodemanager.localizer.address</name>
                    <value>0.0.0.0:23344</value>
            </property>
            <!-- NodeManager HTTP (web UI) port -->
             <property>
                    <description>NM Webapp address.</description>
                    <name>yarn.nodemanager.webapp.address</name>
                    <value>0.0.0.0:23999</value>
            </property>
            <property>
                    <name>yarn.nodemanager.aux-services</name>
                    <value>mapreduce_shuffle</value>
            </property>
            <property>
                    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
            </property>
            <property>
                    <name>yarn.nodemanager.local-dirs</name>
                    <value>/usr/local/cloud/data/hadoop/yarn/local</value>
            </property>
            <property>
                    <name>yarn.nodemanager.log-dirs</name>
                    <value>/usr/local/cloud/data/logs/hadoop</value>
            </property>
            <property>
                    <name>mapreduce.shuffle.port</name>
                    <value>23080</value>
            </property>
            <!-- Failover proxy provider class -->
             <property>
                    <name>yarn.client.failover-proxy-provider</name>
                     <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
             </property>
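Since `yarn.resourcemanager.ha.id` is the one value that differs between master01 and master02, a common approach is to copy the same yarn-site.xml to both hosts and patch that single value on master02. This is a minimal sketch under assumptions: `set_rm_id` is a hypothetical helper, and it expects the `<value>` line to come directly after the `<name>` line, as in the snippet above.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the value of yarn.resourcemanager.ha.id in a
# yarn-site.xml file. Usage: set_rm_id FILE ID
# A temp file is used instead of `sed -i` to stay portable.
set_rm_id() {
    file=$1 id=$2
    tmp=$(mktemp) &&
    sed '/<name>yarn\.resourcemanager\.ha\.id<\/name>/{
n
s|<value>[^<]*</value>|<value>'"$id"'</value>|
}' "$file" > "$tmp" &&
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example, on master02 after copying the config from master01:
#   set_rm_id /usr/local/cloud/hadoop/etc/hadoop/yarn-site.xml rm2
```

The pattern requires `</name>` right after `ha.id`, so the neighbouring `yarn.resourcemanager.ha.rm-ids` property is left untouched.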
    VI. Configuring the ZooKeeper cluster
    Create a data directory and a logs directory under the zookeeper directory.
    Edit zoo.cfg:
    dataDir=/usr/local/cloud/zookeeper/data
    dataLogDir=/usr/local/cloud/zookeeper/logs
    # the port at which the clients will connect
    clientPort=2181
    server.1=master01:2888:3888
    server.2=master02:2888:3888
    server.3=slave01:2888:3888
    server.4=slave02:2888:3888
    server.5=slave03:2888:3888
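    Create a myid file in the data directory on each machine and write that machine's server number into it. With the configuration above, master01 is server.1, so its myid contains 1;
    the myid in master02's data directory contains 2, and so on for the other machines.
    On every machine, run zkServer.sh start from the bin directory under the zookeeper directory.
    Then run zkServer.sh status. If it reports leader or follower, the cluster is configured correctly.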
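The myid value can be derived from zoo.cfg instead of being typed by hand on each machine. The sketch below is an illustration under assumptions: `myid_for` is a hypothetical helper, the location of zoo.cfg under `conf/` is assumed rather than stated in this article, and `hostname` is assumed to return the short name used in the `server.N` lines.

```shell
#!/bin/sh
# Hypothetical helper: look up a host's server number in zoo.cfg
# (lines of the form server.N=host:2888:3888) and print N.
# Usage: myid_for ZOO_CFG HOSTNAME
myid_for() {
    cfg=$1 host=$2
    sed -n "s/^server\.\([0-9]*\)=$host:.*/\1/p" "$cfg"
}

# On each node, something like the following would write the myid file
# (the conf/zoo.cfg location is an assumption):
#   cfg=/usr/local/cloud/zookeeper/conf/zoo.cfg
#   myid_for "$cfg" "$(hostname)" > /usr/local/cloud/zookeeper/data/myid
```

Nothing is printed for a host that has no `server.N` line, so an empty myid file is a sign that the hostname and zoo.cfg disagree.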
      
    All configuration files are now in place.
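    VII. Starting the Hadoop cluster: follow this order strictly (first start only)
    (1) Start ZooKeeper on every node: zookeeper/bin/zkServer.sh start
    (2) Format the HA state in ZooKeeper to create the namespace: hadoop/bin/hdfs zkfc -formatZK
    (3) Start a JournalNode on each node configured for it (master01, slave01, slave02):
       hadoop/sbin/hadoop-daemon.sh start journalnode
    (4) Format the primary NameNode:
    ./bin/hadoop namenode -format zzg
     Then start the NameNode on that machine:
     hadoop/sbin/hadoop-daemon.sh start namenode
    (5) On the standby NameNode, copy over the metadata formatted on the primary NameNode, then start it:
    hadoop/bin/hdfs namenode -bootstrapStandby
    hadoop/sbin/hadoop-daemon.sh start namenode
    (6) On both NameNode hosts, run:
    ./sbin/hadoop-daemon.sh start zkfc
    (7) On every DataNode host, start the DataNode:
    ./sbin/hadoop-daemon.sh start datanode
    (8) On the primary NameNode host, start YARN by running start-yarn.sh
    jps should then show the following.
    NameNode host: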
    [root@master01 ~]# jps
    38972 JournalNode
    38758 NameNode
    39166 DFSZKFailoverController
    37473 QuorumPeerMain
    39778 ResourceManager
    42620 Jps
    DataNode host:
    [root@slave01 ~]# jps
    33440 DataNode
    35277 Jps
    32681 QuorumPeerMain
    33568 JournalNode
    34231 NodeManager
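Comparing jps listings by eye across five machines is tedious, so the check can be reduced to a list of missing daemons. This is a minimal sketch, not part of the original procedure: `missing_daemons` is a hypothetical helper that reads jps output on standard input and takes the expected daemon names as arguments.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output ("PID Name" per line) on stdin and
# print every expected daemon name (given as arguments) that is absent.
missing_daemons() {
    out=$(cat)
    for d in "$@"; do
        printf '%s\n' "$out" |
            awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }' ||
            echo "$d"
    done
}

# Example with a listing in which the NodeManager has not started.
missing_daemons DataNode NodeManager JournalNode <<'EOF'
33440 DataNode
35277 Jps
32681 QuorumPeerMain
33568 JournalNode
EOF
# prints NodeManager
```

On a live node the input would come from jps itself, for example `jps | missing_daemons NameNode DFSZKFailoverController ResourceManager` on a NameNode host.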
  • Original article: https://www.cnblogs.com/jamesf/p/4751512.html