    Hadoop 2.7.3 Pseudo-Distributed Installation

    A pseudo-distributed Hadoop deployment needs only a single server, which makes it very practical for testing and development, so the setup process is worth writing down: the palest ink beats the best memory.
    Versions used:
    Hadoop 2.7.3
    JDK 1.8.0_91

    Download the Hadoop binary package from the official Apache website.

    cd /home/fuxin.zhao/soft
    tar -xzvf hadoop-2.7.3.tar.gz
    cd hadoop-2.7.3
    cd etc/hadoop/
    pwd
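
    To confirm the distribution unpacked correctly (assuming JAVA_HOME is already set in the current shell; it is configured permanently in .bashrc below), the bundled hadoop script can print its version:

    cd /home/fuxin.zhao/soft/hadoop-2.7.3
    ./bin/hadoop version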

    1. Set up passwordless SSH from this machine to itself

    ssh-keygen -t rsa -P ""
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh localhost
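
    If ssh localhost still prompts for a password, the usual cause is file permissions: sshd ignores authorized_keys files that are group- or world-writable. Tightening them is safe in any case:

    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys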
    

    2. Modify the Hadoop configuration files

    Modify the following five files under $HADOOP_HOME/etc/hadoop: slaves, core-site.xml,
    hdfs-site.xml, mapred-site.xml, and yarn-site.xml, and set JAVA_HOME in the two env scripts first.

    vi etc/hadoop/yarn-env.sh

    export JAVA_HOME=/usr/local/jdk
    

    vi etc/hadoop/hadoop-env.sh

    export JAVA_HOME=/usr/local/jdk
    

    vi slaves

    ## Add this machine's hostname; start-dfs.sh and start-yarn.sh start the
    ## DataNode/NodeManager daemons on every host listed in this file.
    ubuntuServer01
    

    vi core-site.xml

    <configuration>
     <property>
       <name>fs.defaultFS</name>
       <value>hdfs://ubuntuServer01:9000</value>
     </property>
     <property>
       <name>hadoop.tmp.dir</name>
       <value>file:/home/fuxin.zhao/hadoop/tmp</value>
       <description>Abase for other temporary directories.</description>
     </property>
    </configuration>
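
    The fs.defaultFS host must match the hostname in the slaves file, and hadoop.tmp.dir is the base directory under which the HDFS name and data directories below are placed. Hadoop creates missing subdirectories itself, but creating the base directory up front avoids permission surprises:

    mkdir -p /home/fuxin.zhao/hadoop/tmp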
    

    vi hdfs-site.xml

    <configuration>
        <property>
             <name>dfs.replication</name>
             <value>1</value>
        </property>
        <property>
             <name>dfs.namenode.name.dir</name>
             <value>file:/home/fuxin.zhao/hadoop/tmp/dfs/name</value>
        </property>
        <property>
             <name>dfs.datanode.data.dir</name>
             <value>file:/home/fuxin.zhao/hadoop/tmp/dfs/data</value>
        </property>
        <property>
             <name>dfs.block.size</name>
             <value>67108864</value>
        </property>
    </configuration>
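
    dfs.block.size is the pre-2.x name of this key; Hadoop 2.7 still honors it but logs a deprecation warning in favor of dfs.blocksize. The value 67108864 bytes is 64 MB, the old 1.x default (2.x defaults to 128 MB). The effective value can be read back with getconf:

    hdfs getconf -confKey dfs.blocksize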
    

    vi yarn-site.xml

    <configuration>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>512</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-vcores</name>
      <value>1</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>2</value>
    </property>
    </configuration>
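
    With these bounds the scheduler hands out containers between 512 MB and 2048 MB and, with the default resource calculator, rounds each request up to a multiple of the 512 MB minimum: a 1000 MB request becomes a 1024 MB container, for example. The 512 MB AM and task sizes in mapred-site.xml below are chosen to fit these limits.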
    

    vi mapred-site.xml

    <configuration>
    <property>
    	<name>mapreduce.framework.name</name>
    	<value>yarn</value>
    </property>
    <property>
    	<name>yarn.app.mapreduce.am.resource.mb</name>
    	<value>512</value>
    </property>
    <property>
    	<name>mapreduce.map.memory.mb</name>
    	<value>512</value>
    </property>
    <property>
    	<name>mapreduce.reduce.memory.mb</name>
    	<value>512</value>
    </property>
    </configuration>
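
    The memory values above are container sizes, bounding the task JVM plus its overhead, so the JVM heap itself is usually capped a bit lower; a common rule of thumb is about 80% of the container size. A minimal optional sketch for this 512 MB layout (the -Xmx410m value is an assumption, not part of the original setup):

    <property>
    	<name>mapreduce.map.java.opts</name>
    	<value>-Xmx410m</value>
    </property>
    <property>
    	<name>mapreduce.reduce.java.opts</name>
    	<value>-Xmx410m</value>
    </property>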
    

    vi .bashrc

    export JAVA_HOME=/usr/local/jdk
    export HADOOP_HOME=/home/fuxin.zhao/soft/hadoop-2.7.3
    export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
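
    Reload the shell configuration and confirm that the hadoop command resolves from the new PATH:

    source ~/.bashrc
    hadoop version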
    

    After the configuration is complete, format the NameNode and start the daemons:

    ./bin/hdfs namenode -format
    ./sbin/start-dfs.sh
    ./sbin/start-yarn.sh
    ./sbin/mr-jobhistory-daemon.sh start historyserver
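
    The start scripts run their daemons in the background; jps (shipped with the JDK) gives a quick check that everything came up. On this single node the list should contain roughly the following processes (PIDs omitted):

    jps
    # NameNode
    # DataNode
    # SecondaryNameNode
    # ResourceManager
    # NodeManager
    # JobHistoryServer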

    Check the web UIs (HDFS NameNode on port 50070, YARN ResourceManager on port 8088):
    http://ubuntuserver01:50070/
    http://ubuntuserver01:8088/

    hadoop fs -ls /
    hadoop fs -mkdir /user
    hadoop fs -mkdir /user/fuxin.zhao
    hadoop fs -touchz textFile
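
    The last command uses a relative path, which HDFS resolves against the current user's home directory, i.e. the /user/fuxin.zhao created above, so the empty file can be verified with:

    hadoop fs -ls /user/fuxin.zhao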

    Run the example jobs that ship with Hadoop (teragen and terasort):

    # Generate 1,000,000 100-byte rows (~100 MB) in /tmp/terasort/1000000-input
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teragen 1000000 /tmp/terasort/1000000-input
    
    # Sort the generated data, writing the output to /tmp/terasort/1000000-output
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar terasort /tmp/terasort/1000000-input /tmp/terasort/1000000-output
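    
    # Optionally verify that the output really is globally sorted before cleaning
    # up; teravalidate ships in the same examples jar (the -validate directory
    # name here is just an illustrative choice)
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teravalidate /tmp/terasort/1000000-output /tmp/terasort/1000000-validate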
    
    # Remove the temporary data
    hadoop fs -rm -r /tmp/terasort/1000000-input
    hadoop fs -rm -r /tmp/terasort/1000000-output
    
    
    Original post: https://www.cnblogs.com/honeybee/p/6400709.html