Hadoop 2.2.0 (YARN) Build and Deployment Guide

    Created on 2014-3-30
    URL : http://www.cnblogs.com/zhxfl/p/3633919.html
    @author: zhxfl

     

    Building Hadoop 2.2

    Hadoop 2.2 was released with 32-bit native binaries only, so on a 64-bit operating system you need to rebuild it from source.
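    To confirm whether you actually need the rebuild, check the machine architecture first; this is a quick sketch (the `file` command shown in the comment assumes the stock release layout):

    ```shell
    # x86_64 means a 64-bit OS (rebuild needed); i686/i386 means 32-bit
    arch=$(uname -m)
    echo "detected architecture: $arch"
    # You can also inspect the prebuilt native library directly, e.g.:
    #   file hadoop-2.2.0/lib/native/libhadoop.so.1.0.0
    # which reports "ELF 32-bit" for the stock 2.2.0 binaries
    ```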

    Installing Maven

    Install Maven and locate its installation directory:

    sudo apt-get install maven
    find /usr -name "*maven*"

    Add the environment variables according to the install location:

    export M2_HOME=/usr/share/maven
    export PATH=$PATH:$M2_HOME/bin
    export MAVEN_OPTS="-Xms256m -Xmx512m"

    Installing Google protobuf

    wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
    tar -xzf protobuf-2.5.0.tar.gz
    cd protobuf-2.5.0
    ./configure --prefix=/usr/local/protobuf
    make
    sudo make install
    sudo vim /etc/ld.so.conf   [add the line /usr/local/protobuf/lib]
    sudo ldconfig

    Installing CMake

    sudo apt-get install cmake

    Installing dependency libraries

    sudo apt-get install libglib2.0-dev libssl-dev

    pom.xml has a bug; applying the patch below fixes it.

    See https://issues.apache.org/jira/browse/HADOOP-10110

    Index: hadoop-common-project/hadoop-auth/pom.xml
    ===================================================================
    --- hadoop-common-project/hadoop-auth/pom.xml   (revision 1543124)
    +++ hadoop-common-project/hadoop-auth/pom.xml   (working copy)
    @@ -54,6 +54,11 @@
         </dependency>
         <dependency>
           <groupId>org.mortbay.jetty</groupId>
    +      <artifactId>jetty-util</artifactId>
    +      <scope>test</scope>
    +    </dependency>
    +    <dependency>
    +      <groupId>org.mortbay.jetty</groupId>
           <artifactId>jetty</artifactId>
           <scope>test</scope>
         </dependency>
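    To apply the diff, save it under some name (jetty-util.patch below is just an illustrative choice) at the top of the hadoop-2.2.0-src tree and run `patch -p0 < jetty-util.patch`. A self-contained demo of the same mechanism on scratch files, not the real pom.xml:

    ```shell
    # Apply a unified diff with patch -p0, entirely inside a temp directory
    workdir=$(mktemp -d)
    printf 'line1\nline2\n' > "$workdir/sample.txt"
    cat > "$workdir/sample.patch" <<'EOF'
    --- sample.txt
    +++ sample.txt
    @@ -1,2 +1,3 @@
     line1
    +inserted
     line2
    EOF
    (cd "$workdir" && patch -p0 < sample.patch)   # rewrites sample.txt in place
    cat "$workdir/sample.txt"                     # now contains the "inserted" line
    ```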

    Start the build:

    mvn package -Pdist,native -DskipTests -Dtar 

    Common errors

    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-hdfs: An Ant BuildException has occured: exec returned: 1 -> [Help 1]

    Fix: install libglib2.0-dev.

    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1 -> [Help 1]

    Fix: install libssl-dev.

    [ERROR] /home/yarn/hadoop-2.2.0-src/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[86,13] cannot access org.mortbay.component.AbstractLifeCycle

    Fix: apply the pom.xml patch above; see https://issues.apache.org/jira/browse/HADOOP-10110

    When the build finishes, the directory ~/hadoop-2.2.0-src/hadoop-dist/target contains a hadoop-2.2.0 directory; that is the compiled distribution.

     

    Hadoop 2.2 Environment Configuration

    Adding a user

    Add a yarn user on every node.

    Add the user:

    sudo adduser yarn

    Add the user to the hadoop group (if there is no hadoop group yet, create it first):

    sudo gpasswd -a yarn hadoop

    Give the yarn user sudo privileges:

    sudo vim /etc/sudoers

    Add the following line:

    yarn ALL=(ALL:ALL) ALL

    SSH configuration

    On the master:

    sudo apt-get install openssh-server

    ssh-keygen   (press Enter at every prompt)

    This creates id_rsa (the private key) and id_rsa.pub (the public key) in ~/.ssh.

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

    Run ssh localhost to confirm that no password is required.
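    The key-generation and append steps can be rehearsed safely in a scratch directory before touching the real ~/.ssh (a sketch; the real setup operates on ~/.ssh itself):

    ```shell
    # Rehearse the master key setup in a temp dir instead of ~/.ssh
    sshdir=$(mktemp -d)
    ssh-keygen -t rsa -N '' -f "$sshdir/id_rsa" -q     # no passphrase, no prompts
    cat "$sshdir/id_rsa.pub" >> "$sshdir/authorized_keys"
    chmod 600 "$sshdir/authorized_keys"                # sshd rejects group/world-writable files
    ls "$sshdir"                                       # id_rsa  id_rsa.pub  authorized_keys
    ```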

    Copy authorized_keys to the slave1 through slave3 nodes:

    scp authorized_keys yarn@slave1:~/.ssh/

    scp authorized_keys yarn@slave2:~/.ssh/

    scp authorized_keys yarn@slave3:~/.ssh/


    On the slaves

    Run ssh-keygen on every slave as well.

    Finally, from the master, test with ssh slave1, ssh slave2, and so on.

    Other

    Add the following IP entries on every node:

    sudo vim /etc/hosts

    219.219.216.48 master

    219.219.216.47 slave1

    219.219.216.45 slave2

    219.219.216.46 slave3
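    The four entries can be kept in one block and appended in a single step on each node; a minimal sketch (IPs and hostnames taken from above, with a scratch file standing in for the real /etc/hosts):

    ```shell
    # Append the cluster host entries to a (scratch) hosts file
    hosts_block='219.219.216.48 master
    219.219.216.47 slave1
    219.219.216.45 slave2
    219.219.216.46 slave3'
    hosts_copy=$(mktemp)                 # stands in for /etc/hosts in this demo
    printf '%s\n' "$hosts_block" >> "$hosts_copy"
    grep slave2 "$hosts_copy"            # prints: 219.219.216.45 slave2
    ```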

    Reference:

    http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-install/

    Configuration files

    hadoop-env.sh

    Add the JAVA_HOME environment variable:

    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/

    core-site.xml

    <configuration>
            <property>
                    <name>fs.defaultFS</name>
                    <value>hdfs://master:8020</value>
                    <final>true</final>
            </property>
            <property>
                    <name>dfs.replication</name>
                    <value>2</value>
            </property>
            <property>
                    <name>hadoop.tmp.dir</name>
                    <value>/home/yarn/hadoop-files/tmp</value>
            </property>
    </configuration>

    hdfs-site.xml

    <configuration>
            <property>
                    <name>dfs.namenode.name.dir</name>
                    <value>/home/yarn/hadoop-files/name</value>
            </property>
            <property>
                    <name>dfs.datanode.data.dir</name>
                    <value>/home/yarn/hadoop-files/dfs/data</value>
            </property>
            <property>
                    <name>dfs.replication</name>
                    <value>2</value>
            </property>
            <property>
                    <name>hadoop.tmp.dir</name>
                    <value>/home/yarn/hadoop-files/tmp/</value>
                    <description>A base for other temporary directories.</description>
            </property>
    </configuration>

    Adjust the directory values above to match your actual setup.

    mapred-site.xml

    <configuration>
            <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
            </property>
            <property>
                    <name>mapred.child.java.opts</name>
                    <value>-Xmx1024m</value>
            </property>
    </configuration>

    yarn-site.xml

    <configuration>
       <property>
         <name>yarn.nodemanager.aux-services</name>
         <value>mapreduce_shuffle</value>
      </property>
      <property>
         <name>yarn.resourcemanager.address</name>
         <value>master:8032</value>
      </property>
      <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>master:8031</value>
      </property>
      <property>
          <name>yarn.resourcemanager.admin.address</name>
          <value>master:8033</value>
      </property>
      <property>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>master:8030</value>
      </property>
    
      <property>
          <name>yarn.web-proxy.address</name>
          <value>master:8888</value>
      </property>
      <property>
         <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
         <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
         <name>yarn.nodemanager.local-dirs</name>
         <value>/home/yarn/hadoop-2.2.0/hadoop-files/hadoop-local-dirs/</value>
         <final>true</final>
      </property>
    </configuration>

    Other environment variables you may need to set

    export M2_HOME=/usr/share/maven
    export PATH=$PATH:$M2_HOME/bin:~/hadoop-2.2.0/bin
    export MAVEN_OPTS="-Xms2048m -Xmx2048m"
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/
    export HADOOP_HOME="/home/yarn/hadoop-2.2.0"
    export HADOOP_PREFIX="/home/yarn/hadoop-2.2.0"
    export YARN_HOME=$HADOOP_HOME
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop/
    export YARN_CONF_DIR=$HADOOP_CONF_DIR
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    export SCALA_HOME=/usr/share/scala/
    export PATH=$SCALA_HOME/bin/:$PATH
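    Before starting any daemons it is worth a quick sanity check that the key variables resolve to non-empty values; a minimal sketch using the paths assumed throughout this guide:

    ```shell
    # Export the variables (values as used in this guide) and verify none are empty
    export HADOOP_HOME="/home/yarn/hadoop-2.2.0"
    export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop/"
    export YARN_CONF_DIR="$HADOOP_CONF_DIR"
    for pair in "HADOOP_HOME=$HADOOP_HOME" \
                "HADOOP_CONF_DIR=$HADOOP_CONF_DIR" \
                "YARN_CONF_DIR=$YARN_CONF_DIR"; do
      case "$pair" in
        *=) echo "missing: ${pair%=}" >&2 ;;   # empty value after the '='
        *)  echo "ok: $pair" ;;
      esac
    done
    ```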

    Testing

    Change the file group ownership:

    ./hdfs dfs -mkdir /yarn
    ./hdfs dfs -chgrp -R yarn /yarn

    This makes the /yarn directory belong to the yarn group.

    For build and job-submission errors, see:

    http://www.cnblogs.com/lucius/p/3435296.html
