  • HA distributed cluster, part 1: hadoop + zookeeper

    I: Advantages of an HA distributed configuration:

      1. Prevents the scenario where the whole cluster fails because a single NameNode goes down.

      2. Meets the demands of industrial production.

    II: HA installation steps:

    1. Install the virtual machines

     1. Software: VMware_workstation_full_12.5.0.11529.exe; Linux image: CentOS-7-x86_64-DVD-1611.iso

      Notes:

      1. Bridged networking was chosen (it keeps the route from changing constantly); on a desktop or server it is best to give the host machine a static IP address.

      2. The infrastructure profile (infras...) was chosen during installation (it reduces memory consumption while still providing a basic environment).

      3. Username: root, password: root
      4. The network was configured manually with a fixed IPv4 address (static IP).

     2. Basic Linux environment configuration (all operations performed as root):

      1. Verify the network: ping <host IP> from the VM, ping <VM IP> from the host, and ping an external site such as www.baidu.com; all verified OK.

        Back up the NIC config: cp /etc/sysconfig/network-scripts/ifcfg-ens33 /etc/sysconfig/network-scripts/ifcfg-ens33.bak
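
        For reference, a minimal static-IP ifcfg-ens33 for ha1 might look like the sketch below; GATEWAY and DNS1 are assumptions, so adjust them to your own network:

    TYPE=Ethernet
    BOOTPROTO=static
    DEVICE=ens33
    ONBOOT=yes
    IPADDR=192.168.1.116
    NETMASK=255.255.255.0
    GATEWAY=192.168.1.1    # assumed gateway
    DNS1=192.168.1.1       # assumed DNS server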

      2. Firewall: stop and disable the firewall

       Stop the firewall: systemctl stop firewalld.service (CentOS 7 uses firewalld instead of the iptables service of earlier releases)

          Disable the firewall: systemctl disable firewalld.service

          Check the firewall status: firewall-cmd --state

      3. Configure hosts, hostname, and network

    vim /etc/hostname
    ha1
    
    vim /etc/hosts
    192.168.1.116    ha1
    192.168.1.117    ha2
    192.168.1.118    ha3
    192.168.1.119    ha4
    
    vim /etc/sysconfig/network
    NETWORKING=yes
    HOSTNAME=ha1

      4. Install some necessary packages (the list is not necessarily exhaustive):

    yum install -y chkconfig
    yum install -y python
    yum install -y bind-utils
    yum install -y psmisc
    yum install -y libxslt
    yum install -y zlib
    yum install -y sqlite
    yum install -y cyrus-sasl-plain
    yum install -y cyrus-sasl-gssapi
    yum install -y fuse
    yum install -y portmap
    yum install -y fuse-libs
    yum install -y redhat-lsb

      5. Install Java and Scala

    Java version: jdk-8u111-linux-x64.tar.gz
    Scala version: scala-2.11.6.tgz
    
    # check whether Java is already installed:
    rpm -qa | grep java    # no output
    
    tar -zxf jdk-8u111-linux-x64.tar.gz
    tar -zxf scala-2.11.6.tgz
    mv jdk1.8.0_111 /usr/java
    mv scala-2.11.6 /usr/scala
    
    # configure environment variables:
    vim /etc/profile
    export JAVA_HOME=/usr/java
    export SCALA_HOME=/usr/scala
    export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

      6. Reboot and verify that everything above took effect: OK. Take a VM snapshot named "init ok: java, scala, hostname, firewall, ip".
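
      A quick post-reboot check (all standard commands):

    source /etc/profile
    java -version          # expect 1.8.0_111
    scala -version         # expect 2.11.6
    hostname               # expect ha1
    firewall-cmd --state   # expect "not running"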

     3. Hadoop + ZooKeeper cluster configuration

      1. Prepare the cluster machines

        Linked clones: clone ha1 to produce ha2, ha3, and ha4.

        On ha2, ha3, and ha4, change the IP address, hostname, network file, and firewall:
        vim /etc/sysconfig/network-scripts/ifcfg-ens33    # change 116 to 117/118/119
        service network restart
        vim /etc/hostname                                 # ha2/ha3/ha4
        vim /etc/sysconfig/network                        # HOSTNAME=ha2/ha3/ha4
        systemctl disable firewalld.service

        Reboot ha2, ha3, and ha4 and verify the IP, network, and firewall; then snapshot each of the three machines, named "init ok: java, scala, hostname, firewall, ip".

      2. Cluster layout (RM = ResourceManager; the DM column corresponds to the NodeManager processes)

    Machine | NameNode | DataNode | Zookeeper | ZkFC | JournalNode | RM | DM
    ha1     |    1     |          |     1     |  1   |             | 1  |
    ha2     |    1     |    1     |     1     |  1   |      1      |    | 1
    ha3     |          |    1     |     1     |      |      1      |    | 1
    ha4     |          |    1     |           |      |      1      |    | 1
       3. SSH communication (once verified OK, take a snapshot named "ssh ok")

    # on all four machines:
    ssh-keygen -t rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    
    # on ha1 (this overwrites the other nodes' keys, so all four machines share ha1's keypair):
    scp ~/.ssh/* root@ha2:~/.ssh/
    scp ~/.ssh/* root@ha3:~/.ssh/
    scp ~/.ssh/* root@ha4:~/.ssh/
    
    # verify:
    ssh ha2    # likewise ha3 and ha4
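
      As an alternative to copying the entire ~/.ssh directory, ssh-copy-id appends only the public key to each remote authorized_keys. A sketch, run on every node for each peer:

    for h in ha1 ha2 ha3 ha4; do
        ssh-copy-id root@$h
    done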

      4. ZooKeeper cluster configuration:

       1. Configure the environment variables

    # install zookeeper:
    tar -zxf zookeeper-3.4.8.tar.gz
    mv zookeeper-3.4.8 /usr/zookeeper-3.4.8
    # edit /etc/profile and add:
    export ZK_HOME=/usr/zookeeper-3.4.8
    # distribute and reload:
    scp /etc/profile root@ha2:/etc/
    scp /etc/profile root@ha3:/etc/
    source /etc/profile

       2. zoo.cfg configuration (the lines changed from the sample are dataDir, dataLogDir, and the server.N entries):

    cd /usr/zookeeper-3.4.8/conf
    cp zoo_sample.cfg zoo.cfg
    # contents:
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just
    # example sakes.
    dataDir=/opt/zookeeper/datas
    dataLogDir=/opt/zookeeper/logs
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
    server.1=ha1:2888:3888
    server.2=ha2:2888:3888
    server.3=ha3:2888:3888

       3. Start the ZooKeeper cluster:

    # on the three machines (ha1, ha2, ha3)
    # create the data and log directories:
    mkdir -p /opt/zookeeper/datas
    mkdir -p /opt/zookeeper/logs
    cd /opt/zookeeper/datas
    vim myid    # write 1/2/3 respectively, matching server.N in zoo.cfg
    
    # distribute to ha2 and ha3 (note: ha4 is not needed)
    cd /usr
    scp -r zookeeper-3.4.8 root@ha2:/usr
    scp -r zookeeper-3.4.8 root@ha3:/usr
    # start (on all three machines)
    cd $ZK_HOME/bin
    zkServer.sh start
    zkServer.sh status    # expect one leader and two followers
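
       For reference, zkServer.sh status prints something like the lines below on the leader; the other two nodes report "Mode: follower":

    ZooKeeper JMX enabled by default
    Using config: /usr/zookeeper-3.4.8/bin/../conf/zoo.cfg
    Mode: leader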

      5. Hadoop cluster configuration

       1. Configure the environment variables:

    # version: hadoop-2.7.3.tar.gz
    
    tar -zxf hadoop-2.7.3.tar.gz
    mv hadoop-2.7.3 /usr/hadoop-2.7.3
    
    # add to /etc/profile:
    export JAVA_HOME=/usr/java
    export SCALA_HOME=/usr/scala
    export HADOOP_HOME=/usr/hadoop-2.7.3
    export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    source /etc/profile

       2. hadoop-env.sh configuration:

    # in $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
    export JAVA_HOME=/usr/java
    source hadoop-env.sh
    hadoop version    # verify OK

       3. hdfs-site.xml configuration (after any later change, redistribute it, e.g. scp hdfs-site.xml root@ha4:/usr/hadoop-2.7.3/etc/hadoop/):

    vim hdfs-site.xml
    <configuration>
        <property>
                 <name>dfs.nameservices</name>
                 <value>mycluster</value>
        </property>
         <property>
                 <name>dfs.ha.namenodes.mycluster</name>
                 <value>nn1,nn2</value>
        </property>
         <property>
                 <name>dfs.namenode.rpc-address.mycluster.nn1</name>
                 <value>ha1:9000</value>
        </property>
         <property>
                 <name>dfs.namenode.rpc-address.mycluster.nn2</name>
                 <value>ha2:9000</value>
        </property>
         <property>
                 <name>dfs.namenode.http-address.mycluster.nn1</name>
                 <value>ha1:50070</value>
        </property>
         <property>
                 <name>dfs.namenode.http-address.mycluster.nn2</name>
                 <value>ha2:50070</value>
        </property>
         <property>
                 <name>dfs.namenode.shared.edits.dir</name>
                 <value>qjournal://ha2:8485;ha3:8485;ha4:8485/mycluster</value>
        </property>
         <property>
                 <name>dfs.client.failover.proxy.provider.mycluster</name>
                 <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
         <property>
                 <name>dfs.ha.fencing.methods</name>
                 <value>sshfence</value>
        </property>
         <property>
                 <name>dfs.ha.fencing.ssh.private-key-files</name>
                 <value>/root/.ssh/id_rsa</value>
        </property>
         <property>
                 <name>dfs.journalnode.edits.dir</name>
                 <value>/opt/jn/data</value>
        </property>
         <property>
                 <name>dfs.ha.automatic-failover.enabled</name>
                 <value>true</value>
        </property>
    </configuration>
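
       A quick sanity check that the nameservice is picked up, using the standard hdfs getconf tool (run after $HADOOP_HOME/bin is on the PATH):

    hdfs getconf -confKey dfs.nameservices    # should print: mycluster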

       4. core-site.xml configuration

    <configuration>
           <property>
                    <name>fs.defaultFS</name>
                    <value>hdfs://mycluster</value>
           </property>
           <property>
                    <name>ha.zookeeper.quorum</name>
                    <value>ha1:2181,ha2:2181,ha3:2181</value>
            </property>
           <property>
                   <name>hadoop.tmp.dir</name>
                   <value>/opt/hadoop2</value>
                   <description>A base for other temporary   directories.</description>
           </property>
    </configuration>

       5. yarn-site.xml configuration

    vim yarn-site.xml
    <configuration>
    
    <!-- Site specific YARN configuration properties -->
            <property>
                   <name>yarn.nodemanager.aux-services</name>
                   <value>mapreduce_shuffle</value>
            </property>
            <property>                                                               
                    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
            </property>
            <property>
                   <name>yarn.resourcemanager.hostname</name>
                   <value>ha1</value>
           </property>
    </configuration>

       6. mapred-site.xml configuration

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>

       7. slaves configuration:

    vim slaves 
    ha2
    ha3
    ha4

       8. Distribute and start:

    # distribute (from /usr on ha1)
    scp -r hadoop-2.7.3 root@ha2:/usr/
    scp -r hadoop-2.7.3 root@ha3:/usr/
    scp -r hadoop-2.7.3 root@ha4:/usr/
    # start the JournalNodes (on ha2, ha3, ha4)
    cd sbin
    ./hadoop-daemon.sh start journalnode
    
    [root@ha2 sbin]# jps
    2646 JournalNode
    2695 Jps
    2287 QuorumPeerMain    # the ZooKeeper process
    
    # ha1: format the NameNode
    cd bin
    ./hdfs namenode -format
    # format ZK
    ./hdfs zkfc -formatZK
    # inspect /opt/hadoop2 to check that the metadata was formatted correctly
    
    # ha2: bootstrap the standby NameNode
    # 1. start the NameNode on ha1 first:
    ./hadoop-daemon.sh start namenode
    # 2. then on ha2:
    ./hdfs namenode -bootstrapStandby

       9. Verification: open http://192.168.1.116:50070/ in a browser; OK. Take a snapshot named "hadoop+zookeeper installed ok in HA mode".

    # hdfs cluster verification
    [root@ha1 sbin]# ./stop-dfs.sh
    Stopping namenodes on [ha1 ha2]
    ha2: no namenode to stop
    ha1: stopping namenode
    ha2: no datanode to stop
    ha3: no datanode to stop
    ha4: no datanode to stop
    Stopping journal nodes [ha2 ha3 ha4]
    ha3: stopping journalnode
    ha4: stopping journalnode
    ha2: stopping journalnode
    Stopping ZK Failover Controllers on NN hosts [ha1 ha2]
    ha2: no zkfc to stop
    ha1: no zkfc to stop
    [root@ha1 sbin]# ./start-dfs.sh
    # on ha1:
    [root@ha1 sbin]# jps
    3506 Jps
    3140 NameNode
    2255 QuorumPeerMain
    3439 DFSZKFailoverController
    [root@ha2 dfs]# jps
    31264 NameNode
    31556 DFSZKFailoverController
    31638 Jps
    31336 DataNode
    31436 JournalNode
    2287 QuorumPeerMain
    [root@ha3 sbin]# jps
    2290 QuorumPeerMain
    31074 DataNode
    31173 JournalNode
    31242 Jps
    [root@ha4 sbin]# jps
    31153 Jps
    30986 DataNode
    31084 JournalNode
    After configuring yarn and mapred as above, start YARN and check the processes on each machine:
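    YARN is presumably started with the standard Hadoop 2.x script on ha1; a sketch (start-yarn.sh launches the ResourceManager locally and a NodeManager on each host listed in slaves):
    cd $HADOOP_HOME/sbin
    ./start-yarn.sh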
    [root@ha1 sbin]# jps
    4614 NameNode
    4920 DFSZKFailoverController
    5356 Jps
    2255 QuorumPeerMain
    5103 ResourceManager
    [root@ha2 hadoop]# jps
    32320 DataNode
    32243 NameNode
    32548 DFSZKFailoverController
    32423 JournalNode
    32713 NodeManager
    32813 Jps
    2287 QuorumPeerMain
    [root@ha3 ~]# jps
    2290 QuorumPeerMain
    31495 DataNode
    31736 NodeManager
    31885 Jps
    31598 JournalNode
    [root@ha4 ~]# jps
    31505 JournalNode
    31641 NodeManager
    31404 DataNode
    31790 Jps
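
      To exercise failover itself, one option is the standard hdfs haadmin tool; a sketch, where nn1 and nn2 are the NameNode IDs defined in hdfs-site.xml:

    # check which NameNode is currently active
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2
    # kill the active NameNode's process (find its pid with jps),
    # then confirm the other NameNode reports "active"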