  • Linux_hadoop_install

    1、 Build the Linux environment

      My environment is a VM running RedHat Linux 6.5, 64-bit.

        Set a fixed IP:
                  vim /etc/sysconfig/network-scripts/ifcfg-eth0

                  set the IP to: 192.168.38.128

      Set the hostname and map it to the IP: vim /etc/hosts
                  (on RHEL 6 the hostname itself lives in /etc/sysconfig/network;
                  /etc/hosts maps it to the fixed IP)

                  set the hostname to: itbuilder1
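      Concretely, the two edits above might look like the fragments below. Only the IP 192.168.38.128 and the hostname itbuilder1 come from this guide; the other ifcfg fields are the usual minimum for a static address and may differ on your box:

```
# /etc/sysconfig/network-scripts/ifcfg-eth0  (relevant lines only)
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.38.128

# /etc/hosts — map the hostname to the fixed IP
192.168.38.128   itbuilder1
```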

    2、Install the JDK

        Configure the JDK environment variables.

    3、Install the Hadoop environment

        Download the Apache Hadoop package.

        addr: http://archive.apache.org/dist/hadoop/core/stable2/hadoop-2.7.1.tar.gz

        3.1  Extract the package to the specified directory

            Create a directory: mkdir /usr/local/hadoop

            Extract the archive into /usr/local/hadoop: tar -zxvf hadoop-2.7.1.tar.gz -C /usr/local/hadoop
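            The mkdir/tar steps above can be rehearsed anywhere with a scratch tarball — a sketch in which the mktemp directory stands in for /usr/local/hadoop, so it runs without root or the real download:

```shell
# Rehearse the extract step on a throwaway archive; "$work" stands in
# for the real filesystem.
work=$(mktemp -d)
mkdir -p "$work/src/hadoop-2.7.1"
echo demo > "$work/src/hadoop-2.7.1/README"
tar -czf "$work/hadoop-2.7.1.tar.gz" -C "$work/src" hadoop-2.7.1

mkdir -p "$work/hadoop"                                    # mirrors: mkdir /usr/local/hadoop
tar -zxvf "$work/hadoop-2.7.1.tar.gz" -C "$work/hadoop"    # mirrors the -C extract
ls "$work/hadoop"                                          # → hadoop-2.7.1
```

            The -C flag makes tar unpack into the target directory instead of the current one, which is why no cd is needed.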

        3.2 Modify the configuration file

              For Hadoop 2.7.1 you need to modify 5 configuration files:

                1、hadoop-env.sh

                2、core-site.xml

                3、hdfs-site.xml

                4、mapred-site.xml (created by copying mapred-site.xml.template)

                5、yarn-site.xml

            These files are all under the etc directory of the Hadoop install; the full path is: /usr/local/hadoop/hadoop-2.7.1/etc/hadoop/

          3.2.1 Modify the environment variables (hadoop-env.sh)

                vim hadoop-env.sh

                Set JAVA_HOME to the JDK root directory, as shown below:

        export JAVA_HOME=/usr/java/jdk1.8.0_20

          3.2.2  core-site.xml: set the HDFS NameNode address and the temporary file directory.

          <configuration>
                <!-- set the HDFS address (NameNode) -->
                <property>
                        <name>fs.defaultFS</name>
                        <value>hdfs://itbuilder1:9000</value>
                </property>
                <!-- set the directory where Hadoop stores its runtime files -->
                <property>
                        <name>hadoop.tmp.dir</name>
                        <value>/usr/local/hadoop/hadoop-2.7.1/tmp</value>
                </property>
          </configuration>

          3.2.3 hdfs-site.xml (set the replication factor)

            <configuration>
                <property>
                        <name>dfs.replication</name>
                        <value>1</value>
                </property>
            </configuration>

            3.2.4 mapred-site.xml (tell Hadoop that MapReduce runs on YARN)

              <configuration>
                    <property>
                            <name>mapreduce.framework.name</name>
                            <value>yarn</value>
                    </property>
              </configuration>
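            Hadoop 2.7.1 ships only mapred-site.xml.template, so the file is created by copying the template first. A sketch, rehearsed in a scratch directory so it runs anywhere; on the real box you would cd to the etc/hadoop directory from section 3.2:

```shell
# Create mapred-site.xml from the shipped template. "$conf" is a scratch
# stand-in for /usr/local/hadoop/hadoop-2.7.1/etc/hadoop.
conf=$(mktemp -d)
touch "$conf/mapred-site.xml.template"
cp "$conf/mapred-site.xml.template" "$conf/mapred-site.xml"
ls "$conf"
```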

            3.2.5 yarn-site.xml

                <configuration>
                        <!-- tell the NodeManager that the auxiliary service for fetching data is shuffle -->
                        <property>
                                <name>yarn.nodemanager.aux-services</name>
                                <value>mapreduce_shuffle</value>
                        </property>

                        <!-- set the YARN (ResourceManager) address -->
                        <property>
                                <name>yarn.resourcemanager.hostname</name>
                                <value>itbuilder1</value>
                        </property>
                </configuration>

    4、Add Hadoop to the environment variables

    vim /etc/profile

    export JAVA_HOME=/usr/java/jdk1.8.0_20
    export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.1
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

    # reload /etc/profile
     source /etc/profile
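    After sourcing, you can sanity-check that the new entries actually landed on PATH — a minimal sketch using the exact paths above; it only inspects the variable, so it works even before Hadoop itself is installed:

```shell
# Re-create the three exports from /etc/profile, then verify PATH.
export JAVA_HOME=/usr/java/jdk1.8.0_20
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin on PATH" ;;
  *)                      echo "hadoop bin MISSING" ;;
esac
```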

    5、Initialize (format) the file system (HDFS)
        # the old, deprecated form was: hadoop namenode -format
        hdfs namenode -format

    6、Start Hadoop (HDFS and YARN)
    The start scripts are in $HADOOP_HOME/sbin:
    ./start-all.sh (deprecated; prompts for the Linux password unless passwordless SSH is configured)
    or start the two layers separately:
    ./start-dfs.sh
    ./start-yarn.sh

    Check which daemons are running with the jps command:

    [root@linuxidc ~]# jps
    3461 ResourceManager
    3142 DataNode
    3751 NodeManager
    3016 NameNode
    5034 Jps
    3307 SecondaryNameNode
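    A healthy single-node setup shows those five daemons (plus Jps itself). A small sketch that scans a captured jps listing for them — here fed the sample output above, since jps needs a live cluster; on a real box you would use jps_listing=$(jps):

```shell
# Verify the five expected Hadoop daemons appear in a jps listing.
# $jps_listing is the sample output from this guide.
jps_listing="3461 ResourceManager
3142 DataNode
3751 NodeManager
3016 NameNode
5034 Jps
3307 SecondaryNameNode"

for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  if echo "$jps_listing" | grep -qw "$d"; then
    echo "$d: up"
  else
    echo "$d: NOT RUNNING"
  fi
done
```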

    Access the web management interfaces:
    http://192.168.38.128:50070 (HDFS web UI)
    http://192.168.38.128:8088 (YARN ResourceManager web UI, where MapReduce jobs appear)

  • Original article: https://www.cnblogs.com/liupuLearning/p/6265430.html