  • Setting Up a Hadoop Environment on a Mac (2): Downloading and Installing Hadoop

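    Introduction

    In "Setting Up a Hadoop Environment on a Mac (1): Installing the Virtual Machines and Configuring Passwordless SSH", we set up the basic network environment between the nodes. All that remains is to download and install Hadoop on the master (the MBP).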

    Downloading Hadoop

    Go to the official Apache Hadoop website and download the version you want. Here I chose the binary release of hadoop-2.7.7.
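
Before unpacking, it can be worth comparing the archive against the SHA-512 digest that Apache publishes next to each release. The helper below is only a sketch: the function names `sha512_of` and `verify_sha512` are made up for this article, and the expected digest has to be copied by hand from the download page.

```shell
# Print the SHA-512 digest of a file, using whichever tool is installed
# (sha512sum on most Linux systems, shasum on macOS).
sha512_of() {
  if command -v sha512sum >/dev/null 2>&1; then
    sha512sum "$1" | awk '{print $1}'
  else
    shasum -a 512 "$1" | awk '{print $1}'
  fi
}

# Succeed only if the file's digest matches the expected one.
# $1 = file, $2 = expected digest (upper-case letters and spaces are tolerated)
verify_sha512() {
  actual=$(sha512_of "$1")
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f' | tr -d ' \n')
  [ "$actual" = "$expected" ]
}

# Example (commented out; paste the real digest from the download page):
# verify_sha512 ~/Downloads/hadoop-2.7.7.tar.gz "<published digest>" && echo OK
```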

    Installing Hadoop

    Once hadoop-2.7.7.tar.gz has finished downloading, extract it into the directory where you want to install it. I extracted it under /opt.

    sudo tar -C /opt -xvf ~/Downloads/hadoop-2.7.7.tar.gz;
    

    Then rename the directory and create the required subdirectories:

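    sudo mv /opt/hadoop-2.7.7 /opt/hadoop
    # Assumption: after `sudo tar` the extracted files may not belong to your user;
    # take ownership so the commands below (and the Hadoop daemons) can write here.
    sudo chown -R "$(whoami)" /opt/hadoop
    mkdir -p /opt/hadoop/dfs/name /opt/hadoop/dfs/data
    mkdir -p /opt/hadoop/tmp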
    

    Configuring Hadoop

    hadoop-env.sh

    Edit /opt/hadoop/etc/hadoop/hadoop-env.sh and set JAVA_HOME as follows:

    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home
    

    yarn-env.sh

    Edit /opt/hadoop/etc/hadoop/yarn-env.sh and set JAVA_HOME as follows:

    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home
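
A wrong JAVA_HOME in either file tends to show up only when the daemons fail to start, so a quick check can save time. The sketch below assumes the value is written as a literal, unquoted path, as in this article; the function names are invented for illustration.

```shell
# Print the JAVA_HOME value exported by a Hadoop env script (the last one wins).
java_home_from() {
  sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=//p' "$1" | tail -n 1
}

# Succeed only if the configured JAVA_HOME contains an executable bin/java.
check_java_home() {
  jh=$(java_home_from "$1")
  [ -n "$jh" ] && [ -x "$jh/bin/java" ]
}

# Example (commented out; needs the files from this article):
# check_java_home /opt/hadoop/etc/hadoop/hadoop-env.sh || echo "JAVA_HOME looks wrong"
```

On macOS, `/usr/libexec/java_home -v 1.8` prints the home directory of an installed JDK 8, which is one way to find the value to put in these files.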
    

    slaves

    Edit /opt/hadoop/etc/hadoop/slaves and add the hostnames of the slave nodes set up earlier:

    slave1
    slave2
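
Since the start scripts log in to every host listed in this file, it may help to confirm that each one is reachable before going further. A minimal sketch, assuming one hostname per line; `list_slaves` is a name made up here.

```shell
# Print the hostnames in a slaves file, skipping blank lines and # comments.
list_slaves() {
  sed -e 's/#.*//' -e 's/[[:space:]]//g' -e '/^$/d' "$1"
}

# Example (commented out; needs the passwordless SSH set up in part 1):
# for h in $(list_slaves /opt/hadoop/etc/hadoop/slaves); do
#   ssh "$h" hostname || echo "cannot reach $h"
# done
```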
    

    Updating the *.xml configuration files

    core-site.xml

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
      <property>
        <name>hadoop.proxyuser.lestat.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.lestat.groups</name>
        <value>*</value>
      </property>
    </configuration>
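
After hand-editing a file like this, it is handy to read a single property back and confirm it holds what you intended. The helper below is a rough sketch, not an XML parser: it assumes each `<name>` and `<value>` sits on its own line, as in the listings in this article, and `get_prop` is a hypothetical name.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = path to the file, $2 = property name
get_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example (commented out; needs the file from this article):
# get_prop /opt/hadoop/etc/hadoop/core-site.xml fs.defaultFS
```

On a machine where Hadoop is already on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more authoritative check.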
    

    hdfs-site.xml

    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hadoop/dfs/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
    </configuration>
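
One thing to watch in this file: HDFS stores each replica of a block on a different DataNode, so with only two slave nodes a dfs.replication of 3 cannot be fully satisfied. Writes still succeed, but the blocks are reported as under-replicated. The check below is a small sketch of that comparison; `check_replication` is a made-up name.

```shell
# Compare the configured replication factor with the number of DataNodes.
# $1 = dfs.replication value, $2 = number of DataNodes
check_replication() {
  if [ "$1" -gt "$2" ]; then
    echo "dfs.replication=$1 exceeds $2 DataNode(s); blocks will be under-replicated"
    return 1
  fi
  echo "dfs.replication=$1 fits $2 DataNode(s)"
}

# For the cluster in this article (replication 3, two slaves):
check_replication 3 2 || true
```

Setting dfs.replication to 2 would match the two DataNodes used here.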
    
    
    

    mapred-site.xml

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
      </property>
    </configuration>
    
    

    yarn-site.xml

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
      </property>
    </configuration>
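
With four XML files edited by hand, a dropped closing tag is an easy mistake, and it makes the daemons fail at startup. The function below is a crude sanity check rather than a real validator: it only counts opening and closing tags, and `check_balanced` is a name invented for this sketch.

```shell
# Check that the tags used in Hadoop *-site.xml files are opened and closed
# the same number of times. $1 = path to the file
check_balanced() {
  for tag in configuration property name value; do
    opens=$(($(grep -o "<$tag>" "$1" | wc -l)))
    closes=$(($(grep -o "</$tag>" "$1" | wc -l)))
    if [ "$opens" -ne "$closes" ]; then
      echo "$1: <$tag> opened $opens time(s), closed $closes time(s)"
      return 1
    fi
  done
}

# Example (commented out; needs the files from this article):
# for f in /opt/hadoop/etc/hadoop/*-site.xml; do check_balanced "$f"; done
```

If `xmllint` is installed, `xmllint --noout FILE` performs a proper well-formedness check.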
    

    Copying Hadoop to slave1 and slave2

    scp -r /opt/hadoop parallels@slave1:~/
    scp -r /opt/hadoop parallels@slave2:~/
    

    Then run the following on each slave node:

    sudo mv ~/hadoop /opt/hadoop;
    sudo chown -R parallels:parallels /opt/hadoop
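
With more than a couple of slaves, the copy and the follow-up commands can be wrapped in a loop. The sketch below prints the commands by default and only executes them when `DRY_RUN=0` is set; the function names and the `DRY_RUN` variable are assumptions of this example, not part of Hadoop.

```shell
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Copy /opt/hadoop to each slave, then move it into place and fix ownership.
# $1 = remote user; remaining arguments = slave hostnames
distribute_hadoop() {
  user=$1
  shift
  for host in "$@"; do
    run scp -r /opt/hadoop "$user@$host:~/"
    # -t allocates a terminal so that sudo can prompt for a password
    run ssh -t "$user@$host" "sudo mv ~/hadoop /opt/hadoop && sudo chown -R $user:$user /opt/hadoop"
  done
}

# Show what would be run for the two slaves used in this article:
distribute_hadoop parallels slave1 slave2
```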
    

    On the slaves, change the JAVA_HOME value in hadoop-env.sh and yarn-env.sh:

    export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.151-1.b12.el7_4.x86_64
    

    Adding Hadoop to the PATH

    export PATH=$PATH:/opt/hadoop/bin:/opt/hadoop/sbin
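
Typed at a prompt, this export only lasts for the current shell session. To make it permanent, the line can be appended to a shell startup file. The sketch below does so without adding duplicates; which startup file to use (for example ~/.bash_profile) is an assumption that depends on your shell, and `add_hadoop_path` is a made-up name.

```shell
# The exact line to persist (single quotes keep $PATH unexpanded in the file).
HADOOP_PATH_LINE='export PATH=$PATH:/opt/hadoop/bin:/opt/hadoop/sbin'

# Append the line to a startup file unless an identical line is already there.
# $1 = shell startup file to update
add_hadoop_path() {
  grep -qxF "$HADOOP_PATH_LINE" "$1" 2>/dev/null || printf '%s\n' "$HADOOP_PATH_LINE" >> "$1"
}

# Example (commented out so nothing is modified by accident):
# add_hadoop_path ~/.bash_profile
```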
    

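    Starting Hadoop

    Before the first start, run hadoop namenode -format to format the NameNode (Hadoop 2.x marks this form as deprecated in favor of hdfs namenode -format, which does the same thing).
    Then run start-all.sh to start the Hadoop cluster. Below is my startup log; the daemons on the master were already running when I captured it, hence the "Stop it first" messages.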

    Lestats-MBP:~ lestat$ start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    19/03/24 09:57:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [master]
    master: namenode running as process 45208. Stop it first.
    slave1: starting datanode, logging to /opt/hadoop/logs/hadoop-parallels-datanode-slave1.out
    slave2: starting datanode, logging to /opt/hadoop/logs/hadoop-parallels-datanode-slave2.out
    Starting secondary namenodes [master]
    master: secondarynamenode running as process 45331. Stop it first.
    19/03/24 09:57:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    starting yarn daemons
    resourcemanager running as process 45435. Stop it first.
    slave1: starting nodemanager, logging to /opt/hadoop/logs/yarn-parallels-nodemanager-slave1.out
    slave2: starting nodemanager, logging to /opt/hadoop/logs/yarn-parallels-nodemanager-slave2.out
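
Formatting is a one-time step: running it again on a NameNode that already holds metadata discards that metadata. A small guard can prevent that. The sketch below relies on the fact that a formatted name directory contains a current/VERSION file; `needs_format` is a hypothetical helper name.

```shell
# Succeed if the NameNode directory has not been formatted yet.
# $1 = dfs.namenode.name.dir as a plain path (without the file: prefix)
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}

# Typical use on the master (commented out because it needs a Hadoop install).
# The two start scripts are the replacements that the deprecation notice in
# the log above recommends instead of start-all.sh.
# needs_format /opt/hadoop/dfs/name && hdfs namenode -format
# start-dfs.sh
# start-yarn.sh
```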
    

    You can then run jps on the master to see which JVMs have started.

    Lestats-MBP:~ lestat$ jps
    45331 SecondaryNameNode
    45846 Jps
    45208 NameNode
    45435 ResourceManager
    

    Run jps on the slaves to see the JVMs started there.
    If the jps command cannot be found, install it with sudo yum install java-1.8.0-openjdk-devel.

    [parallels@slave1 ~]$ jps
    28832 Jps
    26454 DataNode
    26590 NodeManager
    
    [parallels@slave2 ~]$ jps
    26034 DataNode
    26180 NodeManager
    27931 Jps
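
Reading jps output by eye works for three nodes, but the comparison is easy to script. The sketch below takes jps output on standard input and prints any expected daemon that is absent. It compares whole names, so SecondaryNameNode is not mistaken for NameNode. `missing_daemons` is a name made up for this example.

```shell
# Print the expected daemons that do not appear in `jps` output.
# stdin: output of `jps`; arguments: daemon names that must be present
missing_daemons() {
  out=$(cat)
  missing=""
  for d in "$@"; do
    printf '%s\n' "$out" \
      | awk -v d="$d" '$2 == d { f = 1 } END { exit !f }' \
      || missing="$missing $d"
  done
  printf '%s\n' "${missing# }"
}

# Example (commented out; needs a JDK with jps and the running cluster):
# jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# ssh parallels@slave1 jps | missing_daemons DataNode NodeManager
```

An empty line of output means every expected daemon was found.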
    

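    That completes the installation of this "pseudo-distributed" Hadoop cluster. All of its nodes live on one physical machine, although in Hadoop's own terminology a master with two separate slave nodes counts as a fully distributed setup.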

    Testing

    For example, we can run a few hadoop fs commands:

    Lestats-MBP:~ lestat$ hadoop fs -ls /
    19/03/24 10:09:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Lestats-MBP:~ lestat$ hadoop fs -mkdir /input
    19/03/24 10:09:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Lestats-MBP:~ lestat$ hadoop fs -ls /
    19/03/24 10:09:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 1 items
    drwxr-xr-x   - lestat supergroup          0 2019-03-24 10:09 /input
    
  • Original article: https://www.cnblogs.com/lestatzhang/p/10611298.html