  • Installing and Configuring Hadoop 2.2.0 on Ubuntu 13.10

    Prerequisites:

    1. JDK already installed

    Add Hadoop Group and User

    $ sudo addgroup hadoop
    $ sudo adduser --ingroup hadoop hduser
    $ sudo adduser hduser sudo
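
    A quick check that the account and its groups took effect (a sketch):

    $ id hduser

    The output should list hadoop as the primary group and include sudo.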
     
     
    Switch to the hduser account for all of the following steps.
     
     
    Install SSH Server
    $ sudo apt-get install openssh-server
     

    Setup SSH Key-Based Login

     
    $ ssh-keygen -t rsa -P ''
    ...
    Your identification has been saved in /home/hduser/.ssh/id_rsa.
    Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
    ...
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ ssh localhost
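
    To confirm that key-based login works without prompting, a quick
    non-interactive check (a sketch; BatchMode makes ssh fail instead of
    asking for a password):

    $ ssh -o BatchMode=yes localhost 'echo key-based login OK'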
     
     
     
    Install Hadoop
     

    Download Hadoop 2.2.0

     
    $ cd ~
    $ wget http://www.trieuvan.com/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
    $ sudo tar vxzf hadoop-2.2.0.tar.gz -C /usr/local
    $ cd /usr/local
    $ sudo mv hadoop-2.2.0 hadoop
    $ sudo chown -R hduser:hadoop hadoop
     

    Setup Hadoop Environment Variables

    $ cd ~
    $ sudo gedit .bashrc
     
    Paste the following at the end of the file:
     
    #Hadoop variables
    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
    export HADOOP_INSTALL=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_INSTALL/bin
    export PATH=$PATH:$HADOOP_INSTALL/sbin
    export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
    export HADOOP_COMMON_HOME=$HADOOP_INSTALL
    export HADOOP_HDFS_HOME=$HADOOP_INSTALL
    export YARN_HOME=$HADOOP_INSTALL
    ###end of paste
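
    To verify the variables took effect, reload the file and check a couple
    of them (a sketch; the expected paths assume the layout above):

    $ source ~/.bashrc
    $ echo $HADOOP_INSTALL   # expect /usr/local/hadoop
    $ which hadoop           # expect /usr/local/hadoop/bin/hadoop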
     
    $ cd /usr/local/hadoop/etc/hadoop
    $ vi hadoop-env.sh
     
    #modify JAVA_HOME
    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
     
    Log back into Ubuntu as hduser and check the Hadoop version:
    $ hadoop version
    Hadoop 2.2.0
    Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
    Compiled by hortonmu on 2013-10-07T06:28Z
    Compiled with protoc 2.5.0
    From source with checksum 79e53ce7994d1628b240f09af91e1af4
    This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.2.0.jar
     
    At this point, Hadoop is installed.
     
     
    Based on the steps above, the whole installation can be scripted as follows:

    #!/bin/bash

    # This script installs Hadoop 2.2.0 on a single node


    cd ~
    sudo apt-get update

    #### HADOOP INSTALL ####

    # Download java jdk
    #if [ ! -f jdk-7u45-linux-i586.tar.gz ]; then
    #    wget http://uni-smr.ac.ru/archive/dev/java/SDKs/sun/j2se/7/jdk-7u45-linux-i586.tar.gz
    #fi
    #sudo mkdir /usr/lib/jvm
    #sudo tar zxvf jdk-7u45-linux-i586.tar.gz  -C /usr/lib/jvm

    # Then set the environment variables
    #sudo sh -c 'echo export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45 >> ~/.bashrc'
    #sudo sh -c 'echo export JRE_HOME=${JAVA_HOME}/jre >> ~/.bashrc'
    #sudo sh -c 'echo export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib >> ~/.bashrc'
    #sudo sh -c 'echo export PATH=${JAVA_HOME}/bin:$PATH >> ~/.bashrc'

    # Apply the changes immediately
    #source ~/.bashrc

    # Configure the default JDK version
    #sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jdk1.7.0_45/bin/java 300
    #sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.7.0_45/bin/javac 300
    #sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/jdk1.7.0_45/bin/jar 300
    #sudo update-alternatives --install /usr/bin/javah javah /usr/lib/jvm/jdk1.7.0_45/bin/javah 300
    #sudo update-alternatives --install /usr/bin/javap javap /usr/lib/jvm/jdk1.7.0_45/bin/javap 300
    # Verify that the setting succeeded
    #sudo update-alternatives --config java

    # Check the JDK version to confirm the install succeeded
    #java -version

    # Add a dedicated hadoop user for Hadoop jobs
    #sudo addgroup hadoop
    #sudo adduser --ingroup hadoop hduser
    #sudo adduser hduser sudo

    # Install openssh-server (-y avoids the interactive confirmation prompt)
    sudo apt-get install -y openssh-server


    # Generate an SSH key and append it to the authorized keys file
    # (-f avoids the interactive filename prompt; run as hduser so the files
    # in ~/.ssh stay owned by hduser, which sshd requires)
    sudo -u hduser ssh-keygen -t rsa -P '' -f /home/hduser/.ssh/id_rsa
    sudo -u hduser sh -c 'cat /home/hduser/.ssh/id_rsa.pub >> /home/hduser/.ssh/authorized_keys'
    #ssh localhost

    # Download and install Hadoop, then fix the directory ownership
    cd ~
    if [ ! -f hadoop-2.2.0.tar.gz ]; then
        #wget http://apache.osuosl.org/hadoop/common/hadoop-2.2.0.tar.gz
        wget http://xx.xx.xx.xx/downloads/hadoop-2.2.0.tar.gz
    fi
    sudo tar vxzf hadoop-2.2.0.tar.gz -C /usr/local
    cd /usr/local
    sudo mv hadoop-2.2.0 hadoop
    sudo chown -R hduser:hadoop hadoop

    # Add the Hadoop environment variables to hduser's .bashrc
    cd ~
    sudo sh -c "echo export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45 >> /home/hduser/.bashrc"
    sudo sh -c 'echo export HADOOP_INSTALL=/usr/local/hadoop >> /home/hduser/.bashrc'
    # (\$ keeps the variables literal, so they are expanded when .bashrc is
    # sourced at login rather than when this script runs)
    sudo sh -c 'echo export PATH=\$PATH:\$JAVA_HOME/bin >> /home/hduser/.bashrc'
    sudo sh -c 'echo export PATH=\$PATH:\$HADOOP_INSTALL/bin >> /home/hduser/.bashrc'
    sudo sh -c 'echo export PATH=\$PATH:\$HADOOP_INSTALL/sbin >> /home/hduser/.bashrc'
    sudo sh -c 'echo export HADOOP_MAPRED_HOME=\$HADOOP_INSTALL >> /home/hduser/.bashrc'
    sudo sh -c 'echo export HADOOP_COMMON_HOME=\$HADOOP_INSTALL >> /home/hduser/.bashrc'
    sudo sh -c 'echo export HADOOP_HDFS_HOME=\$HADOOP_INSTALL >> /home/hduser/.bashrc'
    sudo sh -c 'echo export YARN_HOME=\$HADOOP_INSTALL >> /home/hduser/.bashrc'
    sudo sh -c 'echo export HADOOP_COMMON_LIB_NATIVE_DIR=\${HADOOP_INSTALL}/lib/native >> /home/hduser/.bashrc'
    sudo sh -c 'echo export HADOOP_OPTS=\"-Djava.library.path=\$HADOOP_INSTALL/lib\" >> /home/hduser/.bashrc'


    # Set JAVA_HOME in hadoop-env.sh
    # (quote the sed expression so the shell passes ${JAVA_HOME} through
    # literally instead of expanding it)
    cd /usr/local/hadoop/etc/hadoop
    sudo -u hduser sed -i.bak 's=${JAVA_HOME}=/usr/lib/jvm/jdk1.7.0_45=g' hadoop-env.sh
    pwd

    # Check that Hadoop is installed
    /usr/local/hadoop/bin/hadoop version

    # Edit the configuration files. Each sed replaces the opening <configuration>
    # tag with itself plus the new properties; '=' is used as the sed delimiter
    # so the '/' characters in the values need no escaping.
    sudo -u hduser sed -i.bak 's=<configuration>=<configuration><property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>=g' core-site.xml
    sudo -u hduser sed -i.bak 's=<configuration>=<configuration><property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property><property><name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>=g' yarn-site.xml
     
    sudo -u hduser cp mapred-site.xml.template mapred-site.xml
    sudo -u hduser sed -i.bak 's=<configuration>=<configuration><property><name>mapreduce.framework.name</name><value>yarn</value></property>=g' mapred-site.xml
     
    cd ~
    # Use absolute paths so the directories match the hdfs-site.xml values
    # regardless of which user runs this script
    sudo mkdir -p /home/hduser/mydata/hdfs/namenode
    sudo mkdir -p /home/hduser/mydata/hdfs/datanode
    sudo chown -R hduser:hadoop /home/hduser/mydata

    cd /usr/local/hadoop/etc/hadoop
    sudo -u hduser sed -i.bak 's=<configuration>=<configuration><property><name>dfs.replication</name><value>1</value></property><property><name>dfs.namenode.name.dir</name><value>file:/home/hduser/mydata/hdfs/namenode</value></property><property><name>dfs.datanode.data.dir</name><value>file:/home/hduser/mydata/hdfs/datanode</value></property>=g' hdfs-site.xml
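
    # Sanity-check that the sed edits left well-formed XML (a sketch; assumes
    # xmllint is installed, Ubuntu package libxml2-utils)
    xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml && echo "config XML OK"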




    ### Testing Hadoop

    ## Run the following commands as hduser to start and test hadoop
    #sudo su hduser
    # Format Namenode
    #hdfs namenode -format
    # Start Hadoop Service
    #start-dfs.sh
    #start-yarn.sh
    # Check status
    #jps
    # Example
    #cd /usr/local/hadoop
    #hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
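    # To shut the single-node cluster down again (a sketch using the same
    # sbin scripts)
    #stop-yarn.sh
    #stop-dfs.sh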

    Cluster Setup:

    Disable IPv6

    Edit /etc/sysctl.conf and append the following at the end:

    net.ipv6.conf.all.disable_ipv6 = 1

    net.ipv6.conf.default.disable_ipv6 = 1

    net.ipv6.conf.lo.disable_ipv6 = 1

    Reboot for the settings to take effect, then verify:

    cat /proc/sys/net/ipv6/conf/all/disable_ipv6

    A return value of 1 means IPv6 has been disabled; 0 means it is still enabled.
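
    If you would rather not reboot, the settings can usually be applied in
    place, since sysctl -p reloads /etc/sysctl.conf (a sketch):

    sudo sysctl -p
    cat /proc/sys/net/ipv6/conf/all/disable_ipv6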

    Configure the hosts file

    Add the following entries to /etc/hosts on the master and on every slave:

    xxx.xxx.xxx.xx0 master
    xxx.xxx.xxx.xx1 b-w01
    xxx.xxx.xxx.xx2 b-w02
    xxx.xxx.xxx.xx3 b-w03
    xxx.xxx.xxx.xx4 b-w04

    Configure SSH so that the master can connect to every slave without a password:

    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@slaveXXXX

    Enter the slave's password when prompted.

    Then test with ssh <slave>; you should log in without being asked for a password.
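
    With the slave hostnames from /etc/hosts above, the key can be pushed to
    all slaves in one loop (a sketch):

    for host in b-w01 b-w02 b-w03 b-w04; do
        ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@$host
    done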

    Configuration file: slaves

    b-w01
    b-w02
    b-w03
    b-w04

    Configuration file: core-site.xml

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hduser/mydata/hadoop_temp</value>
      </property>
    </configuration>

    Configuration file: hdfs-site.xml


    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hduser/mydata/hdfs/namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hduser/mydata/hdfs/datanode</value>
      </property>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
    </configuration>

    Configuration file: mapred-site.xml

     
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
      </property>
    </configuration>

    Configuration file: yarn-site.xml

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
      </property>
    </configuration>

    Finally, push the configuration files above to every slave with a script:

    #!/bin/bash
    # Copy the cluster configuration files to every slave
    for host in b-w01 b-w02 b-w03 b-w04; do
        for f in slaves core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
            scp /usr/local/hadoop/etc/hadoop/$f hduser@$host:/usr/local/hadoop/etc/hadoop/$f
        done
    done
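
    Before the web UIs below will respond, format HDFS and start the daemons
    on the master as hduser (the same commands as in the single-node test
    section of the script):

    hdfs namenode -format
    start-dfs.sh
    start-yarn.sh
    jps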

    View the ResourceManager (job overview) web UI:

    http://master:8088

    View the NameNode web UI:

    http://master:50070/

    Test:

    cd /usr/local/hadoop

    hadoop fs -mkdir /output

    hadoop fs -mkdir /input

    hadoop fs -put ~/wordCounterTest.txt /input/wordCounterTest.txt

     hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input/wordCounterTest.txt /output/wordcountresult
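
    Once the job finishes, the word counts are written as part files under the
    output directory; to inspect them (a sketch; the part-r-00000 name assumes
    the default single-reducer output naming):

    hadoop fs -ls /output/wordcountresult

    hadoop fs -cat /output/wordcountresult/part-r-00000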
