  • Setting up a single-node Hadoop test environment on Ubuntu Kylin 14.04

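    Hadoop version used: 2.2.0

    I'll skip downloading and installing and start straight from configuration.

    The following files need to be configured; all of them live in $HADOOP_HOME/etc/hadoop/:

    hadoop-env.sh: find the JAVA_HOME line in it and point it at your JDK installation directory.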
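
    If you are not sure where the JDK lives, the sketch below shows one way to derive a candidate from the path of the java binary. The function name java_home_from_bin is made up for this example, and it assumes the binary sits in a bin/ directory directly under the Java home.

```shell
# Derive a JAVA_HOME candidate from the full path of a java binary
# by stripping the trailing /bin/java component.
java_home_from_bin() {
    printf '%s\n' "${1%/bin/java}"
}

# Typical use (assumes java is on PATH and readlink supports -f, as on Ubuntu):
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
# Then put the printed directory into hadoop-env.sh, for example:
#   export JAVA_HOME=/path/printed/above
```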

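    core-site.xml: insert the following inside the <configuration> element (fs.default.name still works in 2.2.0, but it is the deprecated name of fs.defaultFS):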

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/home/username/kit/hadoop/data/temp/</value>
    </property>

    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
      <final>true</final>
    </property>

    hdfs-site.xml: add the following:

    <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/username/kit/hadoop/namenode/</value>
    <final>true</final>
    </property>
    
    <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/username/kit/hadoop/datanode/</value>
    <final>true</final>
    </property>
    
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    </property>
    
    <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
    </property>
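
    The settings in core-site.xml and hdfs-site.xml point at three local directories. Creating them up front, as the user that will run the daemons, makes permission problems show up early. This is only a sketch: the helper name make_hadoop_dirs is made up here, and the layout simply mirrors the example paths in the XML.

```shell
# Create the temp, NameNode and DataNode directories under a base directory.
# $1 is the base, e.g. /home/username/kit/hadoop (the example path in the XML).
make_hadoop_dirs() {
    mkdir -p "$1/data/temp" "$1/namenode" "$1/datanode"
}

# Usage:
#   make_hadoop_dirs /home/username/kit/hadoop
```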

    mapred-site.xml: this file does not exist by default. Make a copy of mapred-site.xml.template, name the copy mapred-site.xml, and add the following to it:

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
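
    The copy step for mapred-site.xml can be sketched as a small helper that never overwrites a file you have already edited. The function name init_mapred_site is made up for this example; the directory argument is assumed to be $HADOOP_HOME/etc/hadoop.

```shell
# Copy mapred-site.xml.template to mapred-site.xml inside the given config
# directory, but leave an existing mapred-site.xml untouched.
init_mapred_site() {
    conf_dir=$1
    if [ ! -e "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

# Usage:
#   init_mapred_site "$HADOOP_HOME/etc/hadoop"
```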

    yarn-site.xml: this one is longer. I copied it from elsewhere and some of the properties may be unnecessary, but I added all of them:

      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
        <description>hostname of RM</description>
      </property>
    
    
        <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>localhost:5274</value>
        <description>host is the hostname of the resource manager and 
        port is the port on which the NodeManagers contact the Resource Manager.
        </description>
      </property>
    
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>localhost:5273</value>
        <description>host is the hostname of the resourcemanager and port is the port
        on which the Applications in the cluster talk to the Resource Manager.
        </description>
      </property>
    
      <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
        <description>In case you do not want to use the default scheduler</description>
      </property>
    
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>localhost:5271</value>
        <description>the host is the hostname of the ResourceManager and the port is the port on
        which the clients can talk to the Resource Manager. </description>
      </property>
    
      <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value></value>
        <description>the local directories used by the nodemanager</description>
      </property>
    
      <property>
        <name>yarn.nodemanager.address</name>
        <value>localhost:5272</value>
        <description>the nodemanagers bind to this port</description>
      </property>  
    
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>10240</value>
        <description>the amount of memory on the NodeManager in MB</description>
      </property>
     
      <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/app-logs</value>
        <description>directory on hdfs where the application logs are moved to </description>
      </property>
    
       <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value></value>
        <description>the directories used by Nodemanagers as log directories</description>
      </property>
    
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>shuffle service that needs to be set for Map Reduce to run </description>
      </property>
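
    As its name indicates, yarn.nodemanager.resource.memory-mb is expressed in megabytes, so the 10240 above corresponds to 10 GB. If you think in gigabytes, a one-line conversion like the following avoids slips (gb_to_mb is a made-up helper name; pick a value that fits your machine's RAM):

```shell
# Convert whole gigabytes to the megabyte value expected by
# yarn.nodemanager.resource.memory-mb (1 GB = 1024 MB).
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

# Usage:
#   gb_to_mb 10    # prints 10240
```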

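    Once these files are configured, the job is mostly done.

    If your system is 64-bit, the files in $HADOOP_HOME/lib/native/ need to be replaced with 64-bit builds. You can download the source and compile them yourself (search online for the details), or use prebuilt files that others have published.

    Next, install ssh. The system already ships with openssh-client, so installing openssh-server is enough.

    ssh can be set up for passwordless login, which saves a great deal of trouble. The setup below only applies to a single machine: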

    $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

    $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

    Note that the '' after -P in the first command is two single quotes (an empty passphrase), not one double quote!
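
    If ssh localhost still asks for a password after this, overly permissive modes on ~/.ssh are a common cause, because sshd's StrictModes option (on by default) checks them. A minimal sketch for tightening them, with fix_ssh_perms being a made-up helper name:

```shell
# Tighten permissions on an .ssh directory and its authorized_keys file.
# $1 is the directory, normally ~/.ssh.
fix_ssh_perms() {
    chmod 700 "$1"
    chmod 600 "$1/authorized_keys"
}

# Usage:
#   fix_ssh_perms ~/.ssh
#   ssh localhost    # should now log in without asking for a password
```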

    Then add the following lines to /etc/profile (adjust the path to your own installation directory):

    export HADOOP_HOME=/home/shizhida/kit/hadoop-2.2.0
    export PATH=$HADOOP_HOME/bin:$PATH

    Putting Hadoop's bin directory on the PATH saves a lot of typing later on.
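
    Because /etc/profile can be read more than once (for example by nested login shells), a guarded variant that only prepends a directory when it is not already on the PATH keeps the variable from growing. This is a sketch, and path_prepend is a made-up helper name. In Hadoop 2.2.0 the start and stop scripts live in $HADOOP_HOME/sbin, so you may want to add that directory the same way.

```shell
# Prepend a directory to PATH unless it is already one of its entries.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

# Usage in /etc/profile, after HADOOP_HOME has been exported:
#   path_prepend "$HADOOP_HOME/bin"
#   path_prepend "$HADOOP_HOME/sbin"
#   export PATH
```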

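    At this point the installation is basically complete. Reboot (or log out and back in, so that /etc/profile is re-read), then run:

    $ hadoop namenode -format

    to perform the initial format of HDFS. In Hadoop 2.x this form still works but prints a deprecation notice; hdfs namenode -format is the current equivalent. After that, you are free to use the environment as you like.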
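
    A natural next step, not covered in this post, is to start the daemons with $HADOOP_HOME/sbin/start-dfs.sh and $HADOOP_HOME/sbin/start-yarn.sh and check with jps that they came up. The sketch below reads jps output and lists which of the five daemons expected on a single-node setup are missing; missing_daemons is a made-up helper name.

```shell
# Read `jps` output on stdin and print the name of each expected daemon
# that does not appear in it. Prints nothing when all five are running.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -w matches whole words, so "NameNode" is not satisfied by "SecondaryNameNode"
        printf '%s\n' "$jps_output" | grep -qw "$daemon" || echo "$daemon"
    done
}

# Usage:
#   jps | missing_daemons
```

    If a daemon is reported as missing, its log file under $HADOOP_HOME/logs is usually the first place to look.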

  • Original article: https://www.cnblogs.com/Ayanami-Blob/p/3675561.html