  • Hadoop study notes (2): installing Hadoop 2.7.0 on a three-node CentOS 7 cluster

    Environment: Windows 7 + VMware 10 + CentOS 7

    1. Create three 64-bit CentOS 7 virtual machines

    master   192.168.137.100   root/123456
    node1   192.168.137.101   root/123456
    node2   192.168.137.102  root/123456

    2. Turn off the firewall on all three virtual machines; run the following on each one:

    systemctl stop firewalld.service
    systemctl disable firewalld.service
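
    To confirm the firewall is really off, a standard systemd check can be used (it should report inactive):

    systemctl is-active firewalld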

    3. Add the following three lines to /etc/hosts on all three virtual machines:

    192.168.137.100     master
    192.168.137.101     node1
    192.168.137.102     node2
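
    For example, the entries can be appended on each node like this (assuming they are not already present):

    cat >> /etc/hosts <<'EOF'
    192.168.137.100     master
    192.168.137.101     node1
    192.168.137.102     node2
    EOF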

    4. Set up passwordless SSH login for the three machines

      1) CentOS 7 does not enable public-key (passwordless) SSH login by default. Uncomment the following line in /etc/ssh/sshd_config; do this on every server:

    #PubkeyAuthentication yes

     Then restart the SSH service:

    systemctl restart sshd

    2) In /root on the master machine, run ssh-keygen -t rsa and press Enter at every prompt. Do this on all three machines.

    [root@master ~]# ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa): 
    Created directory '/root/.ssh'.
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    SHA256:aMUO8b/EkylqTMb9+71ePnQv0CWQohsaMeAbMH+t87M root@master
    The key's randomart image is:
    +---[RSA 2048]----+
    |  o ...      .   |
    |   =  o=  . o    |
    |    + oo=. . .   |
    |     =.Boo o  . .|
    |    . OoSoB  . o |
    |     =.+.+ o. ...|
    |      + o o  .. +|
    |     .   o . ..+.|
    |        E ....+oo|
    +----[SHA256]-----+

      3) On master, append the public key to the authorized_keys file

    [root@master ~]# cd /root/.ssh/
    [root@master .ssh]# ll
    total 8
    -rw-------. 1 root root 1679 Apr 19 11:10 id_rsa
    -rw-r--r--. 1 root root  393 Apr 19 11:10 id_rsa.pub
    [root@master .ssh]# cat id_rsa.pub>> authorized_keys

      4) Copy master's authorized_keys to the node1 and node2 nodes

    scp /root/.ssh/authorized_keys root@192.168.137.101:/root/.ssh/
    scp /root/.ssh/authorized_keys root@192.168.137.102:/root/.ssh/
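
    Alternatively, ssh-copy-id (part of OpenSSH) appends master's public key to each node's authorized_keys and fixes the permissions in one step; it prompts for each node's root password once:

    ssh-copy-id root@192.168.137.101
    ssh-copy-id root@192.168.137.102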

      5) Test:

    [root@master ~]# ssh root@192.168.137.101
    Last login: Thu Apr 19 11:41:23 2018 from 192.168.137.100
    [root@node1 ~]# 
    [root@master ~]# ssh root@192.168.137.102
    Last login: Mon Apr 23 10:40:38 2018 from 192.168.137.1
    [root@node2 ~]# 

    5. Install the JDK on all three machines

    1) JDK download link: https://pan.baidu.com/s/1-fhy_zbGbEXR1SBK8V7aNQ

    2) Create the directory /home/java:

    mkdir -p /home/java

    3) Put the downloaded jdk-7u79-linux-x64.tar.gz under /home/java and run the following commands:

    tar -zxf jdk-7u79-linux-x64.tar.gz
    rm -f jdk-7u79-linux-x64.tar.gz
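
    The same JDK is needed on node1 and node2. One option (a sketch, assuming the passwordless SSH set up above is working) is to copy the extracted directory from master instead of repeating the download; the /etc/profile changes in the next step still have to be made on each node:

    scp -r /home/java root@node1:/home/
    scp -r /home/java root@node2:/home/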

    4) Configure the environment variables:

     Edit /etc/profile with vi and add the following lines:

    export JAVA_HOME=/home/java/jdk1.7.0_79
    export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export PATH=$PATH:$JAVA_HOME/bin

    Then run: source /etc/profile

    Verify:

    [root@master jdk1.7.0_79]# java -version
    java version "1.7.0_79"
    Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
    [root@master jdk1.7.0_79]#

    6. Install Hadoop 2.7.0 (extract it on the master server only, then copy it to the slave servers)

     1) Create the /home/hadoop directory:

    mkdir -p /home/hadoop

    2) Put hadoop-2.7.0.tar.gz under /home/hadoop and extract it:

    tar -zxf hadoop-2.7.0.tar.gz

    3) Under /home/hadoop, create the data directories tmp, hdfs/data, and hdfs/name:

    [root@master hadoop]# mkdir tmp
    [root@master hadoop]# mkdir -p hdfs/data
    [root@master hadoop]# mkdir -p hdfs/name
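
    The same three directories can also be created with a single command:

    mkdir -p /home/hadoop/{tmp,hdfs/data,hdfs/name}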

    4) Configure core-site.xml in the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
        <property>
           <!-- Default HDFS URI. When more than one HDFS cluster is available and the client does not name one explicitly, this is the one that gets used. -->
            <name>fs.defaultFS</name>
            <value>hdfs://192.168.137.100:9000</value>
        </property>
        <property>
           <name>fs.default.name</name>
           <value>hdfs://192.168.137.100:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/home/hadoop/tmp</value>
        </property>
        <property>
           <!-- Read/write buffer size for stream files -->
            <name>io.file.buffer.size</name>
            <value>131702</value>
        </property>
    </configuration>

    5) Configure hdfs-site.xml in the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <configuration>
        <!-- HTTP address of the SecondaryNameNode -->
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>192.168.137.100:9001</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/home/hadoop/hdfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/home/hadoop/hdfs/data</value>
        </property>
    <!-- Number of block replicas kept by the DataNodes -->
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
    <!-- The nameservice is simply a logical name (alias) for this HDFS cluster -->
        <property>
            <name>dfs.nameservices</name>
            <value>mycluster</value>
        </property>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>
    </configuration>

    6) In /home/hadoop/hadoop-2.7.0/etc/hadoop, copy mapred-site.xml.template to a new file named mapred-site.xml and configure it as shown below.
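
    For example:

    cd /home/hadoop/hadoop-2.7.0/etc/hadoop
    cp mapred-site.xml.template mapred-site.xml

    Then set its contents to: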

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>192.168.137.100:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>192.168.137.100:19888</value>
        </property>
    </configuration>
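
    Note that start-all.sh does not start the JobHistory server that serves the two jobhistory addresses above; if it is needed, it can be started separately on master (run from the Hadoop install directory):

    sbin/mr-jobhistory-daemon.sh start historyserver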

    7) Configure yarn-site.xml in the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:

    <?xml version="1.0"?>
    <configuration>
    <!-- Auxiliary service run on the NodeManager. It must be set to mapreduce_shuffle for MapReduce jobs to run. Default: "" -->
       <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
    <!-- Address the ResourceManager exposes to clients, used to submit and kill applications.
         Default: ${yarn.resourcemanager.hostname}:8032 -->
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>192.168.137.100:8032</value>
        </property>
    <!-- Address the ResourceManager exposes to ApplicationMasters, used to request and release resources.
         Default: ${yarn.resourcemanager.hostname}:8030 -->
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>192.168.137.100:8030</value>
        </property>
    <!-- Address the ResourceManager exposes to NodeManagers, used for heartbeats and task assignment.
         Default: ${yarn.resourcemanager.hostname}:8031 -->
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>192.168.137.100:8031</value>
        </property>
    <!-- Address the ResourceManager exposes to administrators, used for admin commands.
         Default: ${yarn.resourcemanager.hostname}:8033 -->
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>192.168.137.100:8033</value>
        </property>
    <!-- ResourceManager web UI address; cluster information can be viewed here in a browser.
         Default: ${yarn.resourcemanager.hostname}:8088 -->
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>192.168.137.100:8088</value>
        </property>
    <!-- Total physical memory available to the NodeManager. It cannot be changed while the NodeManager is running. The default is 8192 MB, and YARN will assume that much memory even if the machine has less, so be sure to set this value explicitly. -->
        <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>768</value>
        </property>
    </configuration>

    8) Set JAVA_HOME in hadoop-env.sh and yarn-env.sh under /home/hadoop/hadoop-2.7.0/etc/hadoop; without it, Hadoop will not start:

    export JAVA_HOME=/home/java/jdk1.7.0_79

    9) Edit the slaves file in /home/hadoop/hadoop-2.7.0/etc/hadoop: delete the default localhost entry and add the two slave nodes:

    192.168.137.101
    192.168.137.102
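
    For example, the file can be overwritten like this:

    cat > /home/hadoop/hadoop-2.7.0/etc/hadoop/slaves <<'EOF'
    192.168.137.101
    192.168.137.102
    EOF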

    10) Copy the configured Hadoop directory to the corresponding location on each node:

    scp -r /home/hadoop 192.168.137.101:/home/
    scp -r /home/hadoop 192.168.137.102:/home/

    11) Start Hadoop on the master server; the slave nodes will be started automatically. First change into /home/hadoop/hadoop-2.7.0.

      1) Format the NameNode: bin/hdfs namenode -format

      2) Start everything with sbin/start-all.sh, or start the pieces separately with sbin/start-dfs.sh and sbin/start-yarn.sh

    [root@master hadoop-2.7.0]# sbin/start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    Starting namenodes on [master]
    master: starting namenode, logging to /home/hadoop/hadoop-2.7.0/logs/hadoop-root-namenode-master.out
    192.168.137.101: starting datanode, logging to /home/hadoop/hadoop-2.7.0/logs/hadoop-root-datanode-node1.out
    192.168.137.102: starting datanode, logging to /home/hadoop/hadoop-2.7.0/logs/hadoop-root-datanode-node2.out
    starting yarn daemons
    starting resourcemanager, logging to /home/hadoop/hadoop-2.7.0/logs/yarn-root-resourcemanager-master.out
    192.168.137.101: starting nodemanager, logging to /home/hadoop/hadoop-2.7.0/logs/yarn-root-nodemanager-node1.out
    192.168.137.102: starting nodemanager, logging to /home/hadoop/hadoop-2.7.0/logs/yarn-root-nodemanager-node2.out

     3) To stop everything, run sbin/stop-all.sh

     4) Run jps to see the running daemons:

    [root@master hadoop-2.7.0]# jps
    1765 ResourceManager
    2025 Jps
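
    Once the cluster is up, a quick way to confirm that both DataNodes have registered with the NameNode (run from /home/hadoop/hadoop-2.7.0 on master) is:

    bin/hdfs dfsadmin -report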

    12) View in a browser

      1) The ResourceManager web UI (yarn.resourcemanager.webapp.address, i.e. http://192.168.137.100:8088)

     

     2) The NameNode web UI (http://192.168.137.100:50070 by default)

  • Original post: https://www.cnblogs.com/boshen-hzb/p/8886959.html