  • Hadoop Setup and Deployment

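    HDFS (Hadoop Distributed File System) and MapReduce are the two core components of Hadoop:

    HDFS (the file system) provides the underlying distributed storage.

    MapReduce (the programming model) provides distributed, parallel task processing.

    In Hadoop 1.x the JobTracker typically runs on the same master node as the NameNode,

    and a TaskTracker runs alongside each DataNode.

    NameNode and DataNode are the roles for data storage.

    JobTracker and TaskTracker are the roles for MapReduce execution. In Hadoop 2.x, which this guide installs, YARN takes over these roles with a ResourceManager on the master and a NodeManager on each slave.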

    Download the release package from the official site:

    wget  http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
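The later steps assume Hadoop lives at /usr/local/hadoop. A minimal sketch of unpacking the downloaded tarball there, assuming the archive unpacks into a directory named after the tarball (as hadoop-2.7.1.tar.gz does). The install_hadoop helper and the symlink layout are illustrative choices, not part of the original steps:

```shell
# Unpack a Hadoop release tarball under a prefix and point a stable
# "hadoop" symlink at the versioned directory it creates.
#   $1: path to the tarball, e.g. hadoop-2.7.1.tar.gz
#   $2: install prefix, e.g. /usr/local
install_hadoop() (
    tarball=$1
    prefix=$2
    tar -xzf "$tarball" -C "$prefix" || exit 1
    # Assumption: the archive unpacks into a directory named after the tarball
    version_dir=$(basename "$tarball" .tar.gz)
    ln -sfn "$prefix/$version_dir" "$prefix/hadoop"
)
```

For example, `install_hadoop hadoop-2.7.1.tar.gz /usr/local` would leave /usr/local/hadoop pointing at /usr/local/hadoop-2.7.1, so a later upgrade only needs the symlink changed.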

    JDK installation and passwordless SSH setup are not covered here.

    Hadoop environment variables:

    vim /etc/profile.d/hadoop.sh 

    export HADOOP_HOME=/usr/local/hadoop
    export HADOOP_HEAPSIZE=2048
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export PATH=$HADOOP_HOME/bin:$PATH
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
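Variables set in a profile script reach the Hadoop launch scripts only if they are exported to child processes. A small sketch of a check for that follows; check_hadoop_profile is a hypothetical helper, not a Hadoop tool:

```shell
# Source a profile script in a subshell and confirm that each Hadoop
# variable is visible to a child process, i.e. that it was exported.
#   $1: path to the profile script
check_hadoop_profile() (
    profile=$1
    . "$profile" || exit 1
    for name in HADOOP_HOME HADOOP_HEAPSIZE HADOOP_COMMON_LIB_NATIVE_DIR HADOOP_OPTS; do
        # A child shell only sees the variable if it was exported
        if ! sh -c "[ -n \"\${$name:-}\" ]"; then
            echo "not exported: $name" >&2
            exit 1
        fi
    done
    echo "ok"
)
```

Running `check_hadoop_profile /etc/profile.d/hadoop.sh` prints ok when all four variables are visible to a child shell, and otherwise names the first one that is not.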

    Next, configure the following five files (under $HADOOP_HOME/etc/hadoop):

    core-site.xml

    hdfs-site.xml

    mapred-site.xml

    yarn-site.xml

    slaves

    Default values for the parameters in the XML files above:

    http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml

    http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

    http://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

    http://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

    vim core-site.xml and add the following properties inside <configuration>:

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://dataMaster30:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/usr/local/hadoop/tmp</value>
            <description>A base for other temporary directories.</description>
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131072</value>
            <description>Read/write buffer size in bytes (128 KB); a multiple of the 4096-byte page size.</description>
        </property>
    </configuration>
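A typo in a property name or value is easy to miss in these files. On a machine with Hadoop installed, `hdfs getconf -confKey fs.defaultFS` prints the value Hadoop actually resolves. Before Hadoop is on the PATH, a rough sketch like the following can read a value straight from the file. get_site_property is a hypothetical helper and assumes each <name> and <value> sits on its own line, as in the listings here; it is not a general XML parser:

```shell
# Print the <value> paired with a given <name> in a Hadoop *-site.xml file.
#   $1: path to the XML file, $2: property name
get_site_property() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && match($0, /<value>[^<]*<\/value>/) {
            # drop the 7-character "<value>" prefix and 8-character "</value>" suffix
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}
```

For example, `get_site_property core-site.xml fs.defaultFS` prints the NameNode URI configured above, and prints nothing if the property is absent.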

    vim hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>dataMaster30:9001</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <property>
            <name>dfs.blocksize</name>
            <value>512m</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/data/hadoop/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/data/hadoop/hdfs</value>
        </property>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>
        <property>
            <name>dfs.permissions</name>
            <value>false</value>
        </property>
    </configuration>
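The local paths named in core-site.xml and hdfs-site.xml must be writable by the user that runs Hadoop. Creating them up front avoids depending on the daemons having permission to do so. A minimal sketch follows; prepare_hadoop_dirs and its optional root prefix are hypothetical, the prefix being there only so the function can be tried in a scratch directory:

```shell
# Create the local directories referenced by hadoop.tmp.dir,
# dfs.namenode.name.dir and dfs.datanode.data.dir.
#   $1: optional root prefix for a dry run (empty means the real paths)
prepare_hadoop_dirs() (
    root=${1:-}
    for dir in /usr/local/hadoop/tmp /data/hadoop/name /data/hadoop/hdfs; do
        mkdir -p "$root$dir" || exit 1
    done
)
```

The name directory only matters on the master and the data directory only on the slaves, but creating all three everywhere is harmless.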

    vim mapred-site.xml (if the file does not exist yet, copy it from mapred-site.xml.template)

    <configuration>
            <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
            </property>
            <property>
                    <name>mapreduce.jobhistory.address</name>
                    <value>dataMaster30:10020</value>
            </property>
            <property>
                    <name>mapreduce.jobhistory.webapp.address</name>
                    <value>dataMaster30:19888</value>
            </property>
    
            <property>
                    <name>mapreduce.map.memory.mb</name>
                    <value>2048</value>
                    <description>Physical memory limit for each map task, in MB</description>
            </property>
    
            <property>
                    <name>mapreduce.reduce.memory.mb</name>
                    <value>2048</value>
                    <description>Physical memory limit for each reduce task, in MB</description>
            </property>
    </configuration>

    vim yarn-site.xml

    <configuration>
            <property>
                     <name>yarn.resourcemanager.hostname</name>
                     <value>dataMaster30</value>
            </property>
            <property>
                     <name>yarn.nodemanager.aux-services</name>
                     <value>mapreduce_shuffle</value>
            </property>
            <property>
                    <name>yarn.nodemanager.resource.memory-mb</name>
                    <value>65366</value>
                    <description>Memory available to containers on each node, in MB</description>
            </property>
      
            <property>
                    <name>yarn.scheduler.minimum-allocation-mb</name>
                    <value>2048</value>
                    <description>Minimum memory a single task can request; the default is 1024 MB</description>
            </property>
      
            <property>
                    <name>yarn.scheduler.maximum-allocation-mb</name>
                    <value>16384</value>
                    <description>Maximum memory a single task can request; the default is 8192 MB</description>
            </property>
            <property>
                    <name>yarn.nodemanager.resource.cpu-vcores</name>
                    <value>16</value>
                    <description>Number of CPU vcores on each node available to containers</description>
            </property>
    </configuration>
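The memory settings in mapred-site.xml and yarn-site.xml interact: the per-node budget divided by the container size gives the number of tasks a node can run at once. The arithmetic can be sketched as below, assuming the scheduler rounds each request up to a multiple of the minimum allocation (as the default Capacity Scheduler is generally described as doing) and rejects requests above the maximum. Both function names are hypothetical:

```shell
# How many containers of a given size fit in one NodeManager's memory budget.
#   $1: yarn.nodemanager.resource.memory-mb, $2: container size in MB
containers_per_node() {
    echo $(( $1 / $2 ))
}

# Size granted for a memory request: rounded up to a multiple of the
# scheduler minimum; a request above the maximum is treated as an error.
#   $1: requested MB, $2: minimum-allocation-mb, $3: maximum-allocation-mb
normalize_request_mb() (
    req=$1 min=$2 max=$3
    if [ "$req" -gt "$max" ]; then
        echo "request of ${req} MB exceeds maximum of ${max} MB" >&2
        exit 1
    fi
    echo $(( (req + min - 1) / min * min ))
)
```

With the values above, `containers_per_node 65366 2048` prints 31 (a budget of 65536 MB would give 32), and `normalize_request_mb 3000 2048 16384` prints 4096.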

    vim slaves

    #localhost
    dataSlave31
    dataSlave32
    dataSlave33
    dataSlave34
    dataSlave35

    When this is done, copy the configured Hadoop directory to the same location on every slave node.
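One way to do the copy is to loop over the slaves file with rsync, which relies on the passwordless SSH mentioned earlier. This is a sketch under that assumption; list_slaves, distribute_hadoop and the RSYNC override are hypothetical names introduced for illustration:

```shell
# Print the hostnames in a slaves file, one per line,
# ignoring comments and blank lines.
list_slaves() {
    awk '{ sub(/#.*/, ""); for (i = 1; i <= NF; i++) print $i }' "$1"
}

# Copy the configured Hadoop directory to the same path on every slave.
# Set RSYNC="echo rsync" to preview the commands without running them.
#   $1: slaves file, $2: Hadoop directory (default /usr/local/hadoop)
distribute_hadoop() (
    slaves_file=$1
    hadoop_home=${2:-/usr/local/hadoop}
    for host in $(list_slaves "$slaves_file"); do
        ${RSYNC:-rsync} -a "$hadoop_home/" "$host:$hadoop_home/" || exit 1
    done
)
```

A preview run such as `RSYNC="echo rsync" distribute_hadoop etc/hadoop/slaves` prints one rsync command per slave, which is a cheap way to confirm the host list before copying anything.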

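    Start the Hadoop cluster from the master node; the daemons on the slave nodes are started automatically. From the Hadoop directory:
    (1) Format HDFS (first start only, since formatting wipes any existing NameNode metadata): bin/hdfs namenode -format
    (2) Start everything with sbin/start-all.sh, or start the layers separately with sbin/start-dfs.sh and sbin/start-yarn.sh
    (3) To stop the cluster: sbin/stop-all.sh
    (4) Run jps to list the Java processes and verify that the daemons started.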

    If jps reports "process information unavailable", go to /tmp, delete the directory named hsperfdata_{username}, and restart Hadoop.
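The cleanup can be sketched as a one-line helper. clear_hsperfdata is a hypothetical name, and the second argument exists only so the function can be tried against a scratch directory instead of /tmp:

```shell
# Remove the JVM performance-data directory that jps reads for a user.
#   $1: username (default: current user), $2: temp directory (default /tmp)
clear_hsperfdata() (
    user=${1:-$(id -un)}
    tmpdir=${2:-/tmp}
    rm -rf "$tmpdir/hsperfdata_$user"
)
```

Run it as the user that owns the Hadoop processes, then restart Hadoop so the JVMs recreate the directory.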

    # jps (master node)

    1701 SecondaryNameNode
    1459 NameNode
    2242 Jps
    1907 ResourceManager

    # jps (slave node)

    4520 Jps
    9677 NodeManager
    9526 DataNode
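Comparing jps output by eye gets tedious across several nodes. A small sketch that reads jps output on standard input and reports any expected daemon that is missing; check_daemons is a hypothetical helper:

```shell
# Read `jps` output on stdin and report each expected daemon that is absent.
# Arguments: the daemon names expected on this node.
check_daemons() (
    output=$(cat)
    status=0
    for daemon in "$@"; do
        # the process name is the second column of each jps line
        if ! printf '%s\n' "$output" | awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "missing: $daemon"
            status=1
        fi
    done
    exit $status
)
```

On the master this would be run as `jps | check_daemons NameNode SecondaryNameNode ResourceManager`, and on a slave as `jps | check_daemons DataNode NodeManager`. The exit status is nonzero if anything is missing.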

    You can now open IP:8088 (YARN ResourceManager) and IP:50070 (NameNode) in a browser to view the cluster status and the NameNode information.

    Hadoop shell commands:

    http://blog.csdn.net/wuwenxiang91322/article/details/22166423

    http://hadoop.apache.org/docs/r1.0.4/cn/hdfs_shell.html
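As a quick end-to-end check of the running cluster, a few of these shell commands can be chained into a round trip. This is a sketch only; hdfs_smoke_test, the HDFS_CMD override and the /tmp/smoke-test path are assumptions made for illustration:

```shell
# Round-trip a small file through HDFS as a smoke test.
# Set HDFS_CMD="echo hdfs" to preview the commands without a cluster.
#   $1: HDFS directory to use (default /tmp/smoke-test)
hdfs_smoke_test() (
    hdfs=${HDFS_CMD:-hdfs}
    dir=${1:-/tmp/smoke-test}
    $hdfs dfs -mkdir -p "$dir" || exit 1               # create the directory
    $hdfs dfs -put -f /etc/hosts "$dir/hosts" || exit 1 # upload a local file
    $hdfs dfs -ls "$dir" || exit 1                     # list it
    $hdfs dfs -cat "$dir/hosts" >/dev/null || exit 1   # read it back
    $hdfs dfs -rm -r "$dir"                            # clean up
)
```

If every step succeeds, the NameNode and at least one DataNode are accepting reads and writes.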

  • Original article: https://www.cnblogs.com/wjoyxt/p/5509624.html