  • Compiling and Installing Hadoop 2.2.0 on CentOS 6.4: A Detailed Walkthrough

    Environment: CentOS 6.4, 64-bit (x86_64)

    1. Install the JDK

    I'm on a 64-bit machine, so download the matching 64-bit JDK from http://www.oracle.com/technetwork/cn/java/javase/downloads/jdk7-downloads-1880260-zhs.html, pick the appropriate version, unpack it, and then configure the environment variables:

     vim /etc/profile    

    Note: some people prefer to configure this in the current user's profile; here I'm configuring it globally.

    export JAVA_HOME=/opt/jdk1.7  
    export PATH=$PATH:$JAVA_HOME/bin   

     source /etc/profile  

    Verify that the JDK installed correctly: java -version

    java version "1.7.0_45"  
    Java(TM) SE Runtime Environment (build 1.7.0_45-b18)  
    Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)   
    

      

    2. Preparation Before Building (Maven)

    Download Maven from the official site. You can build it from source, but here we just grab the pre-built binary:

     
    wget http://mirror.bit.edu.cn/apache/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.zip  

    After unpacking, configure the environment variables in /etc/profile as before:

    export MAVEN_HOME=/opt/maven3.1.1  
    export PATH=$PATH:$MAVEN_HOME/bin  

    Verify the configuration: mvn -version

    Apache Maven 3.1.1 (0728685237757ffbf44136acec0402957f723d9a; 2013-09-17 23:22:22+0800)  
    Maven home: /opt/maven3.1.1  
    Java version: 1.7.0_45, vendor: Oracle Corporation  
    Java home: /opt/jdk1.7/jre  
    Default locale: en_US, platform encoding: UTF-8  
    OS name: "linux", version: "2.6.32-358.el6.x86_64", arch: "amd64", family: "unix"  


    3. Compiling Hadoop

    This is where you'll run into all sorts of headaches.

    First, download the Hadoop source from the official mirrors:

    wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz  

    If you're on a 32-bit machine you can just download the officially pre-built package; that package won't run properly on 64-bit machines, which is why we build from source here.

    Since Maven's overseas servers may be unreachable, first configure a domestic mirror for Maven. In conf/settings.xml under the Maven directory, add the following inside <mirrors></mirrors>, leaving the existing entries alone:

    <mirror>  
         <id>nexus-osc</id>  
         <mirrorOf>*</mirrorOf>  
         <name>Nexusosc</name>  
         <url>http://maven.oschina.net/content/groups/public</url>
    </mirror>

    Likewise, add a new profile inside <profiles></profiles>:

    <profile>  
        <id>jdk-1.7</id>  
           <activation>  
             <jdk>1.7</jdk>  
           </activation>  
           <repositories>  
             <repository>  
               <id>nexus</id>  
               <name>local private nexus</name>  
               <url>http://maven.oschina.net/content/groups/public/</url>  
               <releases>  
                 <enabled>true</enabled>  
               </releases>  
               <snapshots>  
                 <enabled>false</enabled>  
               </snapshots>  
             </repository>  
           </repositories>  
           <pluginRepositories>  
             <pluginRepository>  
               <id>nexus</id>  
              <name>local private nexus</name>  
               <url>http://maven.oschina.net/content/groups/public/</url>  
               <releases>  
                 <enabled>true</enabled>  
               </releases>  
               <snapshots>  
                 <enabled>false</enabled>  
               </snapshots>  
             </pluginRepository>  
           </pluginRepositories>  
         </profile>  

    Run a clean build first:

    cd hadoop-2.2.0-src  
    mvn clean install -DskipTests  


    You'll hit an error:
     

    [ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.2.0:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: 'protoc --version' did not return a version -> [Help 1]  
    [ERROR]   
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.  
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.  
    [ERROR]   
    [ERROR] For more information about the errors and possible solutions, please read the following articles:  
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException  
    [ERROR]   
    [ERROR] After correcting the problems, you can resume the build with the command  
    [ERROR]   mvn <goals> -rf :hadoop-common  


    Hadoop 2.2.0 needs protoc 2.5.0 to build, so you also have to download protobuf from https://code.google.com/p/protobuf/downloads/list; make sure you grab version 2.5.0.

    Before building and installing protoc, install a few dependencies: gcc, gcc-c++, and make. Skip any that are already installed.

    yum install gcc  
    yum install gcc-c++  
    yum install make  

    Build and install protoc:

    tar -xvf protobuf-2.5.0.tar.bz2  
    cd protobuf-2.5.0  
    ./configure --prefix=/opt/protoc/  
    make && make install  

    After installation, add protoc to your environment variables just like before; a minimal sketch follows.
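    For example, the protoc environment setup in /etc/profile, assuming the /opt/protoc prefix used above (the PROTOC_HOME variable name is just a convention here, not anything Hadoop requires):

    export PROTOC_HOME=/opt/protoc
    export PATH=$PATH:$PROTOC_HOME/bin

    After running source /etc/profile, the command protoc --version should print libprotoc 2.5.0.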

    Don't rush into the Hadoop build just yet, or you'll hit more errors; you also need to install the cmake, openssl-devel, and ncurses-devel dependencies:

    yum install cmake  
    yum install openssl-devel  
    yum install ncurses-devel  

    The current 2.2.0 source tarball has a bug that needs a patch before it will build; otherwise compiling hadoop-auth fails with the following error:

    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
    [ERROR] /home/chuan/trunk/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[84,13] cannot access org.mortbay.component.AbstractLifeCycle
    [ERROR] class file for org.mortbay.component.AbstractLifeCycle not found

    Patch: https://issues.apache.org/jira/browse/HADOOP-10110 (apply it as shown in the sketch below).
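    A sketch of applying the fix, assuming you have downloaded the patch file attached to HADOOP-10110 and saved it locally as HADOOP-10110.patch (the patch essentially adds a missing jetty-util test dependency to hadoop-common-project/hadoop-auth/pom.xml):

    # run from the top of the extracted source tree
    cd hadoop-2.2.0-src
    # -p0 matches svn-style patches; use -p1 instead if the paths carry a/ and b/ prefixes
    patch -p0 < /path/to/HADOOP-10110.patch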


    OK, now you can start the actual build:

    mvn package -Pdist,native -DskipTests -Dtar  


    Now you can pull out your phone and play a game for a while; this takes a long time, so be patient.

    [INFO] ------------------------------------------------------------------------  
    [INFO] Reactor Summary:  
    [INFO]   
    [INFO] Apache Hadoop Main ................................ SUCCESS [3.709s]  
    [INFO] Apache Hadoop Project POM ......................... SUCCESS [2.229s]  
    [INFO] Apache Hadoop Annotations ......................... SUCCESS [5.270s]  
    [INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.388s]  
    [INFO] Apache Hadoop Project Dist POM .................... SUCCESS [3.485s]  
    [INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [8.655s]  
    [INFO] Apache Hadoop Auth ................................ SUCCESS [7.782s]  
    [INFO] Apache Hadoop Auth Examples ....................... SUCCESS [5.731s]  
    [INFO] Apache Hadoop Common .............................. SUCCESS [1:52.476s]  
    [INFO] Apache Hadoop NFS ................................. SUCCESS [9.935s]  
    [INFO] Apache Hadoop Common Project ...................... SUCCESS [0.110s]  
    [INFO] Apache Hadoop HDFS ................................ SUCCESS [1:58.347s]  
    [INFO] Apache Hadoop HttpFS .............................. SUCCESS [26.915s]  
    [INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [17.002s]  
    [INFO] Apache Hadoop HDFS-NFS ............................ SUCCESS [5.292s]  
    [INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.073s]  
    [INFO] hadoop-yarn ....................................... SUCCESS [0.335s]  
    [INFO] hadoop-yarn-api ................................... SUCCESS [54.478s]  
    [INFO] hadoop-yarn-common ................................ SUCCESS [39.215s]  
    [INFO] hadoop-yarn-server ................................ SUCCESS [0.241s]  
    [INFO] hadoop-yarn-server-common ......................... SUCCESS [15.601s]  
    [INFO] hadoop-yarn-server-nodemanager .................... SUCCESS [21.566s]  
    [INFO] hadoop-yarn-server-web-proxy ...................... SUCCESS [4.754s]  
    [INFO] hadoop-yarn-server-resourcemanager ................ SUCCESS [20.625s]  
    [INFO] hadoop-yarn-server-tests .......................... SUCCESS [0.755s]  
    [INFO] hadoop-yarn-client ................................ SUCCESS [6.748s]  
    [INFO] hadoop-yarn-applications .......................... SUCCESS [0.155s]  
    [INFO] hadoop-yarn-applications-distributedshell ......... SUCCESS [4.661s]  
    [INFO] hadoop-mapreduce-client ........................... SUCCESS [0.160s]  
    [INFO] hadoop-mapreduce-client-core ...................... SUCCESS [36.090s]  
    [INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS [2.753s]  
    [INFO] hadoop-yarn-site .................................. SUCCESS [0.151s]  
    [INFO] hadoop-yarn-project ............................... SUCCESS [4.771s]  
    [INFO] hadoop-mapreduce-client-common .................... SUCCESS [24.870s]  
    [INFO] hadoop-mapreduce-client-shuffle ................... SUCCESS [3.812s]  
    [INFO] hadoop-mapreduce-client-app ....................... SUCCESS [15.759s]  
    [INFO] hadoop-mapreduce-client-hs ........................ SUCCESS [6.831s]  
    [INFO] hadoop-mapreduce-client-jobclient ................. SUCCESS [8.126s]  
    [INFO] hadoop-mapreduce-client-hs-plugins ................ SUCCESS [2.320s]  
    [INFO] Apache Hadoop MapReduce Examples .................. SUCCESS [9.596s]  
    [INFO] hadoop-mapreduce .................................. SUCCESS [3.905s]  
    [INFO] Apache Hadoop MapReduce Streaming ................. SUCCESS [7.118s]  
    [INFO] Apache Hadoop Distributed Copy .................... SUCCESS [11.651s]  
    [INFO] Apache Hadoop Archives ............................ SUCCESS [2.671s]  
    [INFO] Apache Hadoop Rumen ............................... SUCCESS [10.038s]  
    [INFO] Apache Hadoop Gridmix ............................. SUCCESS [6.062s]  
    [INFO] Apache Hadoop Data Join ........................... SUCCESS [4.104s]  
    [INFO] Apache Hadoop Extras .............................. SUCCESS [4.210s]  
    [INFO] Apache Hadoop Pipes ............................... SUCCESS [9.419s]  
    [INFO] Apache Hadoop Tools Dist .......................... SUCCESS [2.306s]  
    [INFO] Apache Hadoop Tools ............................... SUCCESS [0.037s]  
    [INFO] Apache Hadoop Distribution ........................ SUCCESS [21.579s]  
    [INFO] Apache Hadoop Client .............................. SUCCESS [7.299s]  
    [INFO] Apache Hadoop Mini-Cluster ........................ SUCCESS [7.347s]  
    [INFO] ------------------------------------------------------------------------  
    [INFO] BUILD SUCCESS  
    [INFO] ------------------------------------------------------------------------  
    [INFO] Total time: 11:53.144s  
    [INFO] Finished at: Fri Nov 22 16:58:32 CST 2013  
    [INFO] Final Memory: 70M/239M  
    [INFO] ------------------------------------------------------------------------  
    

      

     

    Once you see the output above, the build is complete.

    The compiled distribution ends up in: hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0

    [root@localhost bin]# ./hadoop version
    Hadoop 2.2.0
    Subversion Unknown -r Unknown
    Compiled by root on 2013-11-22T08:47Z
    Compiled with protoc 2.5.0
    From source with checksum 79e53ce7994d1628b240f09af91e1af4
    This command was run using /data/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar

    The banner shows the Hadoop version. You can also confirm that the native libraries were built for 64-bit:

    [root@localhost hadoop-2.2.0]# file lib//native/*
    lib//native/libhadoop.a: current ar archive
    lib//native/libhadooppipes.a: current ar archive
    lib//native/libhadoop.so: symbolic link to `libhadoop.so.1.0.0'
    lib//native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
    lib//native/libhadooputils.a: current ar archive
    lib//native/libhdfs.a: current ar archive
    lib//native/libhdfs.so: symbolic link to `libhdfs.so.0.0.0'
    lib//native/libhdfs.so.0.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped

    Hadoop compiled successfully; now we can deploy the cluster.

    4. Preparing to Deploy the Cluster

         You need two or more machines; change the hostnames, set up passwordless SSH, turn off the firewall, and so on (a sketch for disabling the firewall follows below).
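    A minimal sketch of turning off the firewall on CentOS 6.x, run as root on every node; depending on your environment you may prefer to open only the Hadoop ports instead of disabling iptables entirely:

    service iptables stop      # stop the firewall for the current session
    chkconfig iptables off     # keep it from coming back after a reboot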

    4.1 Create a new user and group

    groupadd hadoop                  # create the group
    useradd -g hadoop hadoop         # create the user and put it in that group
    

    Note that some of the operations below require root privileges.

    4.2 Change the hostname

     vim /etc/sysconfig/network     # set HOSTNAME=master (see the sketch below)
    hostname master                 # apply the new hostname to the current session
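    For reference, a sketch of what /etc/sysconfig/network should look like on the master after the edit (the NETWORKING line is normally already present):

    NETWORKING=yes
    HOSTNAME=master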

    Log out of the system and log back in:

    [root@master ~]#     

    The prompt now shows master, so the change took effect.

    4.3 Edit the hosts file

    vim /etc/hosts   # add the IP and hostname of every node  
      
    192.168.0.14  master  
    192.168.0.21  slave1
    192.168.0.24  slave2  
    

      

    4.4 Passwordless SSH

    Check which SSH packages are installed:

    [root@localhost data]# rpm -qa|grep ssh  
    libssh2-1.4.2-1.el6.x86_64  
    openssh-5.3p1-84.1.el6.x86_64  
    openssh-server-5.3p1-84.1.el6.x86_64  

    openssh-clients is missing, so install it:

    yum install openssh-clients    

          

    Edit /etc/ssh/sshd_config:

    RSAAuthentication yes
    
    PubkeyAuthentication yes
    
    AuthorizedKeysFile      .ssh/authorized_keys

    Uncomment these three lines and save the file.

    Then restart the SSH daemon: service sshd restart

    Now configure passwordless login:

    [hadoop@master ~]$ cd /home/hadoop/  
    [hadoop@master ~]$ ssh-keygen -t rsa  

    Press Enter at every prompt.

    [hadoop@master ~]$ cd .ssh/  
    [hadoop@master .ssh]$ cp id_rsa.pub authorized_keys  
    [hadoop@master .ssh]$ chmod 600 authorized_keys   

    Copy authorized_keys to the other machines that should accept passwordless login:

    [hadoop@master .ssh]$ scp authorized_keys root@192.168.10.11:/home/hadoop/.ssh/  
     Note that you have to scp as root here, or you'll get a permission error.

    Normally that is enough for passwordless login, but I was still being prompted for a password. After some searching I found this is a CentOS 6.4 quirk (see my note "On failed passwordless SSH logins on CentOS"); the usual culprits are sketched below.
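    A sketch of the usual fixes, run as the hadoop user on the target machine; sshd silently ignores keys when the home or .ssh permissions are too open, and SELinux contexts can also get in the way (the restorecon step assumes SELinux is enabled):

    chmod 700 ~                        # home directory must not be group/world writable
    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys
    restorecon -R ~/.ssh               # fix SELinux labels if SELinux is enforcing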

    [hadoop@master .ssh]$ ssh slave1  
    Last login: Mon Nov 25 14:49:25 2013 from master  
    [hadoop@slave1 ~]$    

    The prompt changed to slave1, so passwordless login works.

    5. Cluster Configuration
     

    Before configuring, create three directories under the home directory to hold the Hadoop files, data, and logs:

    [hadoop@master ~]$mkdir -p dfs/name  
    [hadoop@master ~]$mkdir -p dfs/data  
    [hadoop@master ~]$mkdir -p tmp    

    Move the distribution you just compiled into the hadoop directory, and watch out for directory ownership and permissions.

    Now on to the configuration files.

    5.1  hadoop-env.sh

    Find JAVA_HOME and change it to the actual JDK path.

    5.2  yarn-env.sh: same change as 5.1 (see the sketch below)
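    A minimal sketch of the change, assuming the JDK path /opt/jdk1.7 from earlier and that the configuration files live under $HADOOP_HOME/etc/hadoop:

    # in etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh
    export JAVA_HOME=/opt/jdk1.7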

    5.3  slaves: list all slave nodes, one hostname per line (see the example below)
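    For the two slaves used in this walkthrough, etc/hadoop/slaves would simply contain:

    slave1
    slave2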

    5.4 core-site.xml

    <property>
         <name>fs.defaultFS</name>
          <value>hdfs://master:9000</value>   <!-- URI of the default (distributed) filesystem -->
    </property>
    <property>
         <name>io.file.buffer.size</name>
          <value>131072</value>
    </property>
    <property>
           <name>hadoop.tmp.dir</name>
            <value>file:/home/hadoop/tmp</value>
    </property>
    <property>
           <name>hadoop.proxyuser.hadoop.hosts</name>
            <value>*</value>
    </property>
    <property>
         <name>hadoop.proxyuser.hadoop.groups</name>
         <value>*</value>
    </property>
    

     Note that fs.defaultFS is the new property name in 2.2.0, replacing the old fs.default.name.

    5.5 hdfs-site.xml: configure the local directories for the NameNode and DataNode

    <property>
           <name>dfs.namenode.secondary.http-address</name>
           <value>master:9001</value>
    </property>
    
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/dfs/name</value>
    </property>
    
    <property>
          <name>dfs.datanode.data.dir</name>
          <value>/home/hadoop/dfs/data</value>
    </property>
    
    <property>
          <name>dfs.replication</name>
           <value>2</value>           <!-- should match the number of cluster (DataNode) nodes -->
    </property>
    
    <property>
           <name>dfs.webhdfs.enabled</name>
           <value>true</value>
    </property>
    

    New: dfs.namenode.name.dir (old: dfs.name.dir); new: dfs.datanode.data.dir (old: dfs.data.dir).

    5.6 mapred-site.xml: configure MapReduce to run on the YARN framework. You first need to copy mapred-site.xml.template to mapred-site.xml (a short sketch follows), then add the properties below.
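    Assuming the configuration files live under $HADOOP_HOME/etc/hadoop:

    cd $HADOOP_HOME/etc/hadoop
    cp mapred-site.xml.template mapred-site.xml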

    <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
    </property>
    
    <property>
           <name>mapreduce.jobhistory.address</name>
            <value>master:10020</value>
    </property>
    
    <property>
           <name>mapreduce.jobhistory.webapp.address</name>
           <value>master:19888</value>
    </property>
    

    The new computation framework does away with the standalone JobTracker, so you no longer specify mapreduce.jobtracker.address; instead you specify a framework, here yarn. Note: Hadoop 2.2 also supports third-party computation frameworks, but I haven't looked into them.
    Once the configuration is done, copy everything under $HADOOP_HOME, including the hadoop directory, to the other two nodes (see the sketch below). All of the operations above were performed on the master node.
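    A minimal sketch of pushing the whole installation to the slaves, assuming it sits at /home/hadoop/hadoop-2.2.0 on every node and passwordless SSH is already working:

    for node in slave1 slave2; do
        scp -r /home/hadoop/hadoop-2.2.0 hadoop@$node:/home/hadoop/
    done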

    5.7 yarn-site.xml: configure the ResourceManager and NodeManager RPC ports, web UI ports, and so on

    <property>
         <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
    </property>
    
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
         <value>master:8030</value>
    </property>
    
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
         <value>master:8031</value>
    </property>
    
    <property>
         <name>yarn.resourcemanager.admin.address</name>
         <value>master:8033</value>
    </property>
    
    <property>
         <name>yarn.resourcemanager.webapp.address</name>
          <value>master:8088</value>
    </property>
    
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>   <!-- memory (MB) available for containers on this node -->
        <value>15360</value>
    </property>
    
    6. Start Hadoop. You can also set up the Hadoop environment variables at this point (a sketch follows below).
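    A minimal sketch of the environment variables, assuming the compiled distribution was placed at /home/hadoop/hadoop-2.2.0 (adjust the path to wherever you put it); add this to /etc/profile or the hadoop user's ~/.bashrc and source it:

    export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin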

     6.1 Format the NameNode

    [hadoop@master bin]$ ./hdfs namenode  -format 
    

     6.2 Start HDFS and YARN

    start-dfs.sh && start-yarn.sh
    

    At this point, running jps on master should show the NameNode, SecondaryNameNode, and ResourceManager processes; on the slaves you should see DataNode and NodeManager.

    Check cluster status: ./bin/hdfs dfsadmin -report

    Check file block layout: ./bin/hdfs fsck / -files -blocks

    View node status in the NameNode web UI: http://192.168.0.14:50070 (or use the hostname master)

    View the cluster status on the ResourceManager web UI: http://192.168.0.14:8088 (or use the hostname master)

    If everything starts up normally, Hadoop was configured and installed successfully, and you can move on to running the example tests (covered in a separate article).
