  • hadoop 2.6 installation and configuration

    1. Environment

    Machines

    18 physical machines

    OS:

    CentOS release 6.6 (Final)

    Java version:

    14:27 [root@hostname]$ java -version
    java version "1.7.0_65"
    OpenJDK Runtime Environment (rhel-2.5.1.2.el6_5-x86_64 u65-b17)
    OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
    
    

    Hadoop version:

    hadoop-2.6.0.tar.gz

    Download: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

    Spark version:

    spark-1.4.1-bin-hadoop2.6.tgz

    Download: http://www.apache.org/dyn/closer.cgi/spark/spark-1.4.1/spark-1.4.1-bin-hadoop2.6.tgz

    2. SSH configuration

    Configure hosts

    Edit /etc/hosts

    Do this on all 18 machines.
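    A sketch of generating the 18 hosts entries; the 10.0.0.x addresses here are hypothetical placeholders, not the cluster's real IPs:

```shell
# Generate /etc/hosts entries for a01..a18.
# The 10.0.0.x addresses are placeholders -- substitute the real IPs.
for i in $(seq 1 18); do
  printf '10.0.0.%d\ta%02d\n' "$i" "$i"
done
```

    Append the resulting lines to /etc/hosts on every machine.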

    Create the hadoop user and group

    We will later run Hadoop as hadoop:hadoop, so first create the hadoop user and group:

    14:56 [root@a03]$ groupadd hadoop;adduser -g hadoop hadoop;passwd hadoop
    Changing password for user hadoop.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.

    This needs to be done on all 18 machines.
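    Repeating this on every node can be scripted; a sketch, assuming root ssh access to a02..a18 (the commands are only echoed here, replace echo with an actual ssh invocation to run them):

```shell
# Print the user-creation command for each of the 17 other nodes.
# Replace 'echo' with direct remote execution once verified.
for i in $(seq -w 2 18); do
  echo "ssh root@a$i 'groupadd hadoop; useradd -g hadoop hadoop'"
done
```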

    Generate the master's public key

    Note: switch to the hadoop user first.

    
    

    15:51 [hadoop@a01]$ cd .ssh/   # if this directory does not exist, run ssh localhost once first
    tty:[1] jobs:[0] cwd:[~/.ssh]
    15:52 [hadoop@a01]$ ssh-keygen -t rsa   # just press Enter at every prompt; the key is saved as .ssh/id_rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/hadoop/.ssh/id_rsa.
    Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
    The key fingerprint is:
    1b:d8:7b:66:a6:8f:0d:04:82:26:ac:aa:c7:35:84:c6 hadoop@a01
    The key's randomart image is:
    +--[ RSA 2048]----+
    | |
    |. . |
    |.oo.. . |
    |.oE .. + |
    |.. . . S |
    |. o . + |
    |.. . . + = |
    |. o X |
    |.. o.o |
    +-----------------+

    Copy the master's public key to the datanodes

    16:54 [hadoop@a01]$ pwd
    /home/hadoop/.ssh
    16:51 [hadoop@a01]$ ssh-copy-id hadoop@a02
    The authenticity of host '[a02]:22022 ([10.xxx.x.xx]:22022)' can't be established.
    RSA key fingerprint is ad:xx:xx:xx:xx:xx:2b:xx:xx:4a:46:xx:xx:xx:d3:3b.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '[a02]:22022,[10.xxx.x.xx]:22022' (RSA) to the list of known hosts.
    hadoop@a02's password: 
    Now try logging into the machine, with "ssh 'hadoop@a02'", and check in:
    
      .ssh/authorized_keys
    
    to make sure we haven't added extra keys that you weren't expecting.

    Copy it to all 17 datanodes, then verify:

    17:00 [hadoop@a01]$ ssh a02
    Last login: Wed Sep  9 16:49:27 2015 from 10.107.7.49
    tty:[1] jobs:[0] cwd:[~]
    17:00 [hadoop@a02]$ 

    Passwordless login now works.
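    Pushing the key to 17 hosts one by one is tedious; a loop sketch over the a02..a18 hostnames used in this cluster (the ssh-copy-id call is left commented so the loop is harmless to run as-is):

```shell
# Enumerate the 17 datanode hostnames a02..a18.
for i in $(seq -w 2 18); do
  echo "a$i"
  # ssh-copy-id "hadoop@a$i"   # uncomment to push the key to each host
done
```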

    3. Hadoop installation

    Unpack

    17:20 [root@a01]$ tar -zxvf hadoop-2.6.0.tar.gz -C /usr/local/

    Assign the Hadoop directory to the hadoop user and group:

    17:21 [root@a01]$ cd /usr/local/ 
    tty:[1] jobs:[0] cwd:[/usr/local]
    17:22 [root@a01]$ chown -R hadoop:hadoop hadoop-2.6.0/

    Configure the cluster

    Cluster/distributed mode requires editing 5 configuration files under etc/hadoop (the official default values for the last four can be looked up online); only the settings required for a normal start are made here: slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml.

    slaves configuration

    17:25 [hadoop@a01]$ vim slaves
    a02
    a03
    a04
    a05
    a06
    a07
    a08
    a09
    a10
    a11
    a12
    a13
    a14
    a15
    a16
    a17
    a18

    core-site.xml configuration

    <configuration>
     <property>
        <name>fs.defaultFS</name>
        <value>hdfs://a01:9000</value>
     </property>
    

     <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/local/</value>
    <description>Abase for other temporary directories.</description>
    </property>

      <!-- file system properties -->
    
    
     <property>
       <name>fs.file.impl</name>
       <value>org.apache.hadoop.fs.LocalFileSystem</value>
       <description>The FileSystem for file: uris.</description>
     </property>
    
     <property>
       <name>fs.hdfs.impl</name>
       <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
       <description>The FileSystem for hdfs: uris.</description>
     </property>
    
    
     <property>
      <name>fs.trash.interval</name>
      <value>4320</value>
     </property>
     <property>
       <name>fs.trash.checkpoint.interval</name>
       <value>432</value>
     </property>
    </configuration>

    hdfs-site.xml configuration

    <configuration>
    <!-- base config  -->
     <property>
       <name>dfs.nameservices</name>
       <value>a01</value>
     </property>
     <property>
       <name>dfs.block.size</name>
       <value>268435456</value> 
       <description>The default block size for new files.</description>
     </property>
     
     <property>
       <name>dfs.replication</name>
       <value>3</value>
       <description>Default block replication.
                  The actual number of replications can be specified when the file is created.
                    The default is used if replication is not specified in create time.
       </description>
     </property>
     
     
     <property>
       <name>dfs.datanode.data.dir.perm</name>
       <value>770</value>
       <description>Permissions for the directories on on the local filesystem where
                     the DFS data node store its blocks. The permissions can either be octal or
                      symbolic.</description>
     </property>
    </configuration>
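    The dfs.block.size value above is simply 256 MB expressed in bytes:

```shell
# 256 MB in bytes, matching the dfs.block.size value in hdfs-site.xml.
echo $((256 * 1024 * 1024))   # 268435456
```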

    mapred-site.xml configuration

    mapred-site.xml does not exist by default, so copy it from the template first:

    $ cp mapred-site.xml.template mapred-site.xml 

    Then edit it as follows:

     <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
     </property>
    
    
    <property>
      <name>mapred.local.dir</name>
      <value>/opt/mapred/local</value>
    </property>
    
    
    <property>
      <name>mapred.map.tasks</name>
      <value>10</value>
    </property>
    
    <property>
      <name>mapred.reduce.tasks</name>
      <value>20</value>
    </property>
    
    <!-- i/o properties  config-->
    
    <property>
      <name>mapreduce.output.fileoutputformat.compress.type</name>
      <value>BLOCK</value>
    </property>
    
    <property>
      <name>mapreduce.output.fileoutputformat.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>
    
    <property>
      <name>mapreduce.output.fileoutputformat.compress</name>
      <value>false</value>
    </property>
    
    <property>
      <name>mapred.output.compression.type</name>
      <value>BLOCK</value>
    </property>
    
    <property>
      <name>mapred.compress.map.output</name>
      <value>true</value>
    </property>
    
    
    <property>
      <name>mapred.map.output.compression.codec</name>
      <value>com.hadoop.compression.lzo.LzopCodec</value>
    </property>
    
    
    <property>
      <name>mapreduce.task.userlog.limit.kb</name>
      <value>1024</value>
    </property>
    
    
    <property>
      <name>io.sort.factor</name>
      <value>100</value>
    </property>
    
    <property>
      <name>io.sort.mb</name>
      <value>100</value>
    </property>
    
    <property>
      <name>io.sort.record.percent</name>
      <value>0.05</value>
      <description>The percentage of io.sort.mb dedicated to tracking record
      boundaries. Let this value be r, io.sort.mb be x. The maximum number
      of records collected before the collection thread must block is equal
      to (r * x) / 4</description>
    </property>
    
    <property>
      <name>io.sort.spill.percent</name>
      <value>0.80</value>
    </property>
    
    
    <property>
      <name>mapred.job.shuffle.input.buffer.percent</name>
      <value>0.5</value>
    </property>                                                                                                                                                                                                  
    
    
    <property>
      <name>mapred.map.tasks.speculative.execution</name>
      <value>false</value>
    </property>
    
    <property>
      <name>mapred.reduce.tasks.speculative.execution</name>
      <value>false</value>
    </property>
    
    <property>
      <name>mapreduce.task.timeout</name>
      <value>900000</value>
    </property>
    
    <property>
      <name>mapred.child.java.opts</name>
        <value>-Xmx1024m</value>
     </property>

    Note the mapred.local.dir setting: the configured path, /opt/mapred/local/ here, must be created on every machine in advance and handed to the hadoop user:

    $  mkdir -p /opt/mapred/local;chown -R hadoop:hadoop /opt/mapred/

    This directory is where MapReduce stores its local intermediate data; multiple disks can be listed, comma-separated.
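    For example, with two data disks the value might look like this (a hypothetical layout, not part of the original setup):

```xml
<!-- Hypothetical: mapred.local.dir spread across two disks -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```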

    yarn-site.xml configuration

    <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>a01</value>
    </property>
    <!-- Site specific YARN configuration properties -->
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    
    <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    
    <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/opt/yarn/local</value>
    <description>the local directories used by the nodemanager</description>
    </property>

    Note the yarn.nodemanager.local-dirs setting: the configured path, /opt/yarn/local/ here, must be created on every machine in advance and handed to the hadoop user:

    mkdir -p /opt/yarn/local;chown -R hadoop:hadoop /opt/yarn

    hadoop-env.sh configuration

    Set JAVA_HOME in hadoop-env.sh to the Java installation directory:

    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.65.x86_64/

    With all configuration done, pack the Hadoop directory and copy it to every node:

    $ tar -zcf ./hadoop.tgz ./hadoop-2.6.0/
    # for i in {2..9}; do scp hadoop.tgz a0$i:~/;done
    # for i in {10..18}; do scp hadoop.tgz a$i:~/;done

    Then unpack it on each node:

    $ cd  /usr/local/
    $ tar -xvf /home/hadoop/hadoop.tgz 
    $ chown -R hadoop:hadoop hadoop-2.6.0/

    Set Hadoop environment variables on all nodes

    In /etc/profile:

    export HADOOP_HOME=/usr/local/hadoop-2.6.0
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop/
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

    After editing, source it:

    source /etc/profile

    Start Hadoop

    $ cd /usr/local/hadoop-2.6.0/
    $ bin/hdfs namenode -format       # initialization is only needed on the first run
    $ start-dfs.sh
    $ start-yarn.sh

    jps on the namenode shows the NameNode and ResourceManager processes:

    20:04 [hadoop@a01]$ jps
    6298 NameNode
    6719 ResourceManager
    7040 Jps

    On the datanodes, the NodeManager and DataNode processes are running:

    19:55 [root@a02]$ jps
    4296 NodeManager
    4177 DataNode
    4466 Jps

    The status can be seen at a01:50070.

    All 17 datanodes show as running normally.

    Troubleshooting

    1. Total cluster capacity is wrong

    But the capacity figures were wrong: the 17 datanodes totalled only 243 GB, although every machine has a 1 TB disk. Checking the partition layout:

    10:12 [hadoop@a01]$ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2        15G  3.2G   11G  24% /
    tmpfs            32G     0   32G   0% /dev/shm
    /dev/sda1       190M   27M  153M  16% /boot
    /dev/sda5       898G   75M  853G   1% /opt

    The disk is split into 4 partitions, with a 15 GB root partition; 15 × 17 = 255, so Hadoop was only using the root partition.

    Configure hadoop.tmp.dir in core-site.xml to point to /opt/hadoop/local. Its default is /tmp/hadoop-${user}/.

    <property>
     <name>hadoop.tmp.dir</name>
     <value>/opt/hadoop/local/</value>
     <description>Abase for other temporary directories.</description>
     </property>

    First create /opt/hadoop/local/ and assign it to the hadoop user and group,

    then copy the updated configuration to all nodes,

    reformat the namenode,

    and restart Hadoop:

    $ mkdir -p /opt/hadoop/local;chown -R hadoop:hadoop /opt/hadoop
    $ stop-all.sh
    $ for i in {50..66};do scp core-site.xml xx.xxx.x.$i:/usr/local/hadoop-2.6.0/etc/hadoop/;done
    $ cd $HADOOP_HOME
    $ bin/hdfs namenode -format
    $ start-all.sh

    After a successful start, check namenode:50070 again.

    The total cluster storage is now 14.9 TB, and the namenode storage has moved to /opt/hadoop/local/dfs/namenode. Problem solved.

    2. Total cluster memory and CPU are wrong

     

    The total memory showed as 136 GB with 136 cores, i.e. 8 GB and 8 cores per machine, but the machines actually have 64 GB and 24 cores.

    It turns out YARN defaults both memory (GB) and vcores to 8 per node.

    These values should be tuned to your own cluster; a tuning guide is at

    http://blog.javachen.com/2015/06/05/yarn-memory-and-cpu-configuration.html

    The adjusted yarn-site.xml:

    <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>a01</value>
    </property>
    <!-- Site specific YARN configuration properties -->
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    
    <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    
    <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/opt/yarn/local</value>
    <description>the local directories used by the nodemanager</description>
    </property>
    
    <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>56832</value>
    </property>
    <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
    </property>
    <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>56832</value>
    </property>
    <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>4096</value>
    </property>
    <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx3276m</value>
    </property>

    Looking at the ResourceManager web UI again:

    Each machine now shows 55 GB of memory. Problem solved.
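    The 56832 MB figure used above is consistent with reserving roughly 8.5 GB of each 64 GB node for the OS and Hadoop daemons; a quick sanity check of the arithmetic (the 8.5 GB reservation is an assumption inferred from the numbers, not stated in the original):

```shell
# 64 GB node minus an assumed 8.5 GB OS/daemon reservation gives the
# yarn.nodemanager.resource.memory-mb value configured above.
total_mb=$((64 * 1024))            # 65536
reserved_mb=$((8 * 1024 + 512))    # 8704
echo $((total_mb - reserved_mb))   # 56832
```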

  • Original: https://www.cnblogs.com/pingjie/p/4797604.html