CentOS 6.5 Hadoop 2.2.0 Fully Distributed Installation



Hadoop 2.2.0 cluster setup

Environment:

OS: CentOS 6.5
JDK: jdk1.7.0_51
Hadoop version: 2.2.0

hostname    ip

master      192.168.1.180
slave1      192.168.1.181
slave2      192.168.1.182
slave3      192.168.1.183

I. Preliminary system setup

1. Set the hostname
Temporary change (lost after a reboot):

[lxj@master ~]$ hostname master

Permanent change (takes effect after a reboot):

[lxj@master ~]$ vim /etc/sysconfig/network

     

    NETWORKING=yes

    HOSTNAME=master

2. Set the IP address

[lxj@master ~]$ vim /etc/sysconfig/network-scripts/ifcfg-eth0

     

    DEVICE=eth0

    TYPE=Ethernet

    UUID=390e4922-0d95-4c34-9dde-897fb8acef0f

    ONBOOT=yes

    NM_CONTROLLED=yes

    BOOTPROTO=none

    HWADDR=08:00:27:1C:57:24

    IPADDR=192.168.1.180

    PREFIX=24

    GATEWAY=192.168.1.1

    DNS1=192.168.1.1

DNS2=114.114.114.114

    DEFROUTE=yes

    IPV4_FAILURE_FATAL=yes

    IPV6INIT=no

NAME="System eth0"

    LAST_CONNECT=1393996666
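After editing the interface file, restart the networking service so the new address takes effect (the standard step on CentOS 6):

[lxj@master ~]$ sudo service network restart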

3. Host mapping

[lxj@master ~]$ vim /etc/hosts

     

    127.0.0.1   localhost

    192.168.1.180 master

    192.168.1.181 slave1

    192.168.1.182 slave2

    192.168.1.183 slave3

Make the same change to /etc/hosts on the other nodes.
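A quick sanity check that the names resolve (optional):

[lxj@master ~]$ ping -c 1 slave1
[lxj@master ~]$ ping -c 1 slave2
[lxj@master ~]$ ping -c 1 slave3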

4. Disable the firewall
Stop it now and disable it at boot:

[lxj@master ~]$ sudo service iptables stop

[lxj@master ~]$ sudo chkconfig iptables off
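To verify, on CentOS 6 this should report that the firewall is not running:

[lxj@master ~]$ sudo service iptables status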

5. Passwordless SSH login; master needs it to start the daemons on the slave nodes

[lxj@master ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

[lxj@master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Test that passwordless login works:

[lxj@master ~]$ ssh localhost

If a password is still required after the steps above, fix the permissions:
the /home/<user>/.ssh directory must be mode 700
the /home/<user>/.ssh/authorized_keys file must be mode 600

[lxj@master ~]$ chmod 700 /home/lxj/.ssh

[lxj@master ~]$ chmod 600 /home/lxj/.ssh/authorized_keys

6. Copy the public key to the slave nodes
Since master's public key is already in the authorized_keys file, it is enough to copy that file to each slave node.

Note: on each node, the .ssh directory and the authorized_keys file must again be mode 700 and 600 respectively (see the one-liner after the scp commands below).

[lxj@master ~]$ scp /home/lxj/.ssh/authorized_keys lxj@slave1:/home/lxj/.ssh/

[lxj@master ~]$ scp /home/lxj/.ssh/authorized_keys lxj@slave2:/home/lxj/.ssh/

[lxj@master ~]$ scp /home/lxj/.ssh/authorized_keys lxj@slave3:/home/lxj/.ssh/
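To fix the permissions on every slave in one pass, a minimal sketch (it will prompt for each host's password until key login works):

[lxj@master ~]$ for h in slave1 slave2 slave3; do ssh lxj@$h 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'; done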

Test passwordless login to the slave nodes:

[lxj@master ~]$ ssh slave1

[lxj@master ~]$ exit

[lxj@master ~]$ ssh slave2

[lxj@master ~]$ exit

[lxj@master ~]$ ssh slave3

If all of these log in without a password, passwordless login from master to the slave nodes is configured.

II. Hadoop configuration files

First, set the environment variables:

    JAVA_HOME=/home/lxj/jdk1.7.0_51

    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

     

    HADOOP_HOME=/home/lxj/hadoop-2.2.0

    HADOOP_MAPRED_HOME=$HADOOP_HOME

    HADOOP_COMMON_HOME=$HADOOP_HOME

    HADOOP_HDFS_HOME=$HADOOP_HOME

    YARN_HOME=$HADOOP_HOME

    HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

    YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

     

    PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

     

export JAVA_HOME CLASSPATH HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME HADOOP_CONF_DIR YARN_CONF_DIR PATH
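These lines can live in ~/.bashrc (or /etc/profile); reload and verify, assuming Hadoop is already unpacked at $HADOOP_HOME:

[lxj@master ~]$ vim ~/.bashrc        # append the variables above
[lxj@master ~]$ source ~/.bashrc
[lxj@master ~]$ hadoop version       # should report Hadoop 2.2.0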

The configuration files that need changes
Edit them on master first, then push them to every node; the Hadoop directory is ~/hadoop-2.2.0.

hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-env.sh, yarn-site.xml, masters, slaves

If any of these files is missing, create it yourself; see the note below for mapred-site.xml.
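In Hadoop 2.2.0, for instance, mapred-site.xml is not present by default and can be created from the shipped template:

[lxj@master ~]$ cd ~/hadoop-2.2.0/etc/hadoop
[lxj@master hadoop]$ cp mapred-site.xml.template mapred-site.xml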

Node roles (matching the configuration and the jps output below)
NameNode: master
SecondaryNameNode: master
DataNode: slave1, slave2, slave3
ResourceManager: master
NodeManager: slave1, slave2, slave3

III. Configuration on the master machine

1. hadoop-env.sh: set JAVA_HOME

    [lxj@master ~]$  cd ~/hadoop-2.2.0/etc/hadoop

[lxj@master hadoop]$ vim hadoop-env.sh

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.

export JAVA_HOME=/home/lxj/jdk1.7.0_51

2. core-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <configuration>

        <property>

          <name>fs.default.name</name>

           <value>hdfs://master:9000</value>

        </property>

        <property>

           <name>hadoop.tmp.dir</name>

           <value>/home/lxj/hadoop/tmp</value>

        </property>

    </configuration>
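hadoop.tmp.dir points at a local path. Hadoop will usually create it, but making it up front on every node avoids permission surprises (this assumes the same home-directory layout on each machine):

[lxj@master ~]$ mkdir -p /home/lxj/hadoop/tmp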

3. hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

     

    <configuration>

        <property>

            <name>dfs.namenode.secondary.http-address</name>

            <value>master:50090</value>

        </property>

        <property>

            <name>fs.checkpoint.period</name>

            <value>600</value>

        </property>

        <property>

            <name>fs.checkpoint.size</name>

        <value>67108864</value>

        </property>

        <property>

            <name>dfs.namenode.name.dir</name>

            <value>file:/home/lxj/hadoop/hdfs/namenode</value>

        </property>

        <property>

            <name>dfs.datanode.data.dir</name>

            <value>file:/home/lxj/hadoop/hdfs/datanode</value>

        </property>

        <property>

            <name>dfs.permissions</name>

            <value>false</value>

        </property>

        <property>

            <name>dfs.replication</name>

            <value>3</value>

        </property>

        <property>

            <name>dfs.webhdfs.enabled</name>

            <value>true</value>

        </property>

    </configuration>
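dfs.namenode.name.dir and dfs.datanode.data.dir are local paths too; creating them ahead of time on the relevant nodes is a reasonable precaution (namenode dir on master, datanode dir on each slave):

[lxj@master ~]$ mkdir -p /home/lxj/hadoop/hdfs/namenode
[lxj@master ~]$ for h in slave1 slave2 slave3; do ssh lxj@$h 'mkdir -p /home/lxj/hadoop/hdfs/datanode'; done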

4. mapred-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <configuration>

        <property>

            <name>mapreduce.framework.name</name>

            <value>yarn</value>

        </property>

        <property>

            <name>mapred.job.tracker</name>

            <value>master:9001</value>

        </property>

        <property>

            <name>mapreduce.jobhistory.address</name>

            <value>master:10020</value>

        </property>

        <property>

            <name>mapreduce.jobhistory.webapp.address</name>

            <value>master:19888</value>

        </property>

    </configuration>

5. yarn-env.sh: set JAVA_HOME

# some Java parameters

    export JAVA_HOME=/home/lxj/jdk1.7.0_51

if [ "$JAVA_HOME" != "" ]; then

  #echo "run java in $JAVA_HOME"

  JAVA_HOME=$JAVA_HOME

fi

6. yarn-site.xml

<?xml version="1.0"?>

     

    <configuration>

            <property>

                <name>yarn.nodemanager.aux-services</name>

                <value>mapreduce_shuffle</value>

            </property>

            <property>

            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

                <value>org.apache.hadoop.mapred.ShuffleHandler</value>

            </property>

            <property>

                <name>yarn.resourcemanager.address</name>

                <value>master:8032</value>

            </property>

            <property>

                <name>yarn.resourcemanager.scheduler.address</name>

                <value>master:8030</value>

            </property>

            <property>

                <name>yarn.resourcemanager.resource-tracker.address</name>

                <value>master:8031</value>

            </property>

            <property>

                <name>yarn.resourcemanager.admin.address</name>

                <value>master:8033</value>

            </property>

            <property>

                <name>yarn.resourcemanager.webapp.address</name>

                <value>master:8088</value>

            </property>

    </configuration>

7. masters: add the master host; for multiple hosts, write one per line

    master

8. slaves: add the slave nodes; if master is listed here too, it acts as both NameNode and DataNode

    slave1 

    slave2

    slave3

IV. Slave node configuration

Copy the configured Hadoop directory to each node:

[lxj@master ~]$ scp -r ~/hadoop-2.2.0 lxj@slave1:/home/lxj/

[lxj@master ~]$ scp -r ~/hadoop-2.2.0 lxj@slave2:/home/lxj/

[lxj@master ~]$ scp -r ~/hadoop-2.2.0 lxj@slave3:/home/lxj/

The Hadoop path on each slave node must match the path on master, and the system environment (JDK location, user account) should be consistent as well.
Copying the configured files from master to each node is all that is required; the environment variables can be pushed out the same way (sketch below).
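Assuming the variables from section II were added to ~/.bashrc, one way to distribute them (a sketch, same lxj account on every host):

[lxj@master ~]$ for h in slave1 slave2 slave3; do scp ~/.bashrc lxj@$h:~/; done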

V. Starting Hadoop

1. Format a new distributed filesystem
Run this on master only (the NameNode in this setup). If you re-format later, clear the DataNode data directories on the slaves first, or the DataNodes will fail to start with a clusterID mismatch.

[lxj@master ~]$ hdfs namenode -format

2. Start HDFS and YARN (on master only):

[lxj@master ~]$ start-dfs.sh

[lxj@master ~]$ start-yarn.sh

[lxj@master ~]$ mr-jobhistory-daemon.sh start historyserver

Or:

[lxj@master ~]$ start-all.sh

start-all.sh is deprecated and not recommended. The matching stop scripts are shown below.
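For completeness, the shutdown counterparts:

[lxj@master ~]$ mr-jobhistory-daemon.sh stop historyserver

[lxj@master ~]$ stop-yarn.sh

[lxj@master ~]$ stop-dfs.sh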
3. Verify
Processes on master:

    [lxj@master ~]$ jps

    11597 JobHistoryServer

    4529 Jps

    4460 ResourceManager

    4293 NameNode

    4377 SecondaryNameNode

Processes on the slave nodes:

    [lxj@master ~]$ jps

    1941 DataNode

    2176 Jps

    2046 NodeManager

Create an input directory and a test file:

[lxj@master ~]$ hdfs dfs -mkdir /input

[lxj@master ~]$ vim ./test

    hello hadoop test

Copy the file into HDFS and run the wordcount example to count the occurrences of each word in the input:

[lxj@master ~]$ hdfs dfs -copyFromLocal ./test /input

[lxj@master ~]$ hadoop jar /home/lxj/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input/ /out

[lxj@master ~]$ hdfs dfs -cat /out/*

    hadoop  1

    hello   1

    test    1
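The web UIs offer another quick check. With the configuration above these URLs should be reachable (50070 is the stock NameNode HTTP port in Hadoop 2.x; the other two were set in yarn-site.xml and mapred-site.xml):

http://master:50070   (HDFS NameNode)
http://master:8088    (YARN ResourceManager)
http://master:19888   (MapReduce JobHistory)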

I also tried an HA / federation / QJM setup but could not get it running, so I am starting from this basic configuration and will extend it later.

Related topic: installing the hadoop-2.2.0 Eclipse plugin

     
