  • CDH5 + Hive + ZooKeeper cluster environment setup

    Environment

    1. CentOS 6.5 (64-bit)

    Machine planning and node distribution

    Host              Alias    Roles
    192.168.115.132   master   namenode, journalnode, zk, hive
    192.168.115.133   slave1   namenode, datanode, journalnode, zk, hive
    192.168.115.134   slave2   datanode, journalnode, zk

    Directory layout

    dfs.namenode.name.dir = file:/home/hadoop/data/name

    dfs.datanode.data.dir = file:/home/hadoop/data/datanode

    dfs.namenode.edits.dir = file:/home/hadoop/data/hdfs/edits

    dfs.journalnode.edits.dir = /home/hadoop/data/journaldata/jn

    dfs.hosts.exclude = /home/hadoop/app/hadoop-2.6.0-cdh5.8.0/etc/hadoop/excludes (exclude file)

    PID directory: /home/hadoop/data/pid

    Temp directory: /home/hadoop/data/tmp

    Installation

    1. Create the hadoop user on each of the three cluster nodes

    2. Install the JDK

    Download jdk-8u65-linux-x64.tar.gz, then edit .bash_profile in the hadoop user's home directory to set the JDK environment variables.
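    A minimal sketch of those .bash_profile entries, assuming the JDK is unpacked to /usr/local/java/jdk1.8.0_65 (the same path used in hadoop-env.sh later in this post):

      # ~/.bash_profile of the hadoop user (JDK path assumed, see hadoop-env.sh below)
      export JAVA_HOME=/usr/local/java/jdk1.8.0_65
      export PATH=$JAVA_HOME/bin:$PATH

      # reload the profile and verify
      source ~/.bash_profile
      java -version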

    3. Configure hostname aliases

    • 1. Edit /etc/hosts with vi (note: the entries must be added on all three machines; this is a pitfall I ran into)

      192.168.115.132 master

      192.168.115.133 slave1

      192.168.115.134 slave2

      Then run the matching command on each of the three nodes:

      hostname master

      hostname slave1

      hostname slave2

      To keep the hostname from being lost on reboot, also edit the HOSTNAME entry in /etc/sysconfig/network with vi (a consolidated sketch follows).
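    The hostname setup condensed into a sketch for master (apply the same pattern on slave1 and slave2 with their own names); the HOSTNAME line format is an assumption based on a stock CentOS 6 /etc/sysconfig/network:

      # append the cluster entries to /etc/hosts (run on every node)
      cat >> /etc/hosts <<'EOF'
      192.168.115.132 master
      192.168.115.133 slave1
      192.168.115.134 slave2
      EOF

      # set the hostname for the current session
      hostname master

      # make it survive a reboot (CentOS 6)
      sed -i 's/^HOSTNAME=.*/HOSTNAME=master/' /etc/sysconfig/network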

    4. Disable the firewall

    1. Check the firewall status: service iptables status

    2. Stop the firewall: service iptables stop

    3. Keep it disabled at boot:

    • chkconfig --list | grep iptables
    • chkconfig iptables off

    5. Disable SELinux

    1. Edit /etc/sysconfig/selinux with vi and set SELINUX=disabled
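    A sketch that applies the change both immediately and persistently (the setenforce step is an addition, not something the original spells out):

      # turn SELinux off for the running system
      setenforce 0
      # persist the setting across reboots
      sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux
      getenforce   # reports Permissive now, Disabled after a reboot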

    6. Grant sudo privileges to the regular (hadoop) user
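    One way to do this, as a sketch run as root; the drop-in file and the NOPASSWD choice are my assumptions, not something the original specifies:

      # grant the hadoop user sudo rights via a sudoers drop-in file
      echo 'hadoop ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/hadoop
      chmod 440 /etc/sudoers.d/hadoop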

    7. Configure passwordless SSH login

    • 1. In the user's home directory run ssh-keygen -t rsa and press Enter through all prompts
    • 2. Run: cp id_rsa.pub authorized_keys
    • 3. Run ssh-keygen on the other machines as well to generate their own key pairs
    • 4. Append the public keys generated on the other machines to the authorized_keys file on the master node
    • 5. Distribute the authorized_keys file that now holds all three machines' public keys to the other machines: scp authorized_keys hadoop@slave1:~/.ssh/
    • 6. Fix permissions as the hadoop user: chmod 700 ~/.ssh and chmod 600 ~/.ssh/authorized_keys (a condensed sketch of the whole sequence follows this list)
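    The steps above condensed into a sketch, run as the hadoop user; it assumes the keys are generated with an empty passphrase and that password authentication still works for the initial copies:

      # on every node: generate a key pair (accept the defaults)
      ssh-keygen -t rsa

      # on master: start authorized_keys from the local public key
      cd ~/.ssh
      cp id_rsa.pub authorized_keys

      # on master: append the public keys from the other nodes
      ssh hadoop@slave1 'cat ~/.ssh/id_rsa.pub' >> authorized_keys
      ssh hadoop@slave2 'cat ~/.ssh/id_rsa.pub' >> authorized_keys

      # distribute the combined file and fix permissions on every node
      scp authorized_keys hadoop@slave1:~/.ssh/
      scp authorized_keys hadoop@slave2:~/.ssh/
      chmod 700 ~/.ssh
      chmod 600 ~/.ssh/authorized_keys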

    ZooKeeper cluster installation

    • 1. Unpack the ZooKeeper tarball and edit the conf/zoo.cfg file
    • Create the ZooKeeper data directory under the hadoop home: mkdir /home/hadoop/data/zookeeper (the same configuration on every machine)
    • In /home/hadoop/data/zookeeper, create a myid file containing that node's server id, then start ZooKeeper from the bin directory: ./zkServer.sh start (a sketch of the config follows this list)
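    A sketch of conf/zoo.cfg and the per-node myid; the quorum ports 2888/3888 and the tick/limit values are assumptions (only the data directory and client port 2181 appear elsewhere in this post):

      # from the ZooKeeper install directory, on every node
      cat > conf/zoo.cfg <<'EOF'
      tickTime=2000
      initLimit=10
      syncLimit=5
      dataDir=/home/hadoop/data/zookeeper
      clientPort=2181
      server.1=master:2888:3888
      server.2=slave1:2888:3888
      server.3=slave2:2888:3888
      EOF

      # on master (write 2 on slave1 and 3 on slave2)
      echo 1 > /home/hadoop/data/zookeeper/myid

      ./bin/zkServer.sh start
      ./bin/zkServer.sh status   # one node should report leader, the others follower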

    hadoop-2.6.0-cdh5.8.0.tar cluster installation

    • Create the corresponding data directories under the hadoop home (a sketch that runs this on all three nodes follows the commands)

    mkdir -p /home/hadoop/data/name

    mkdir -p /home/hadoop/data/datanode

    mkdir -p /home/hadoop/data/hdfs/edits

    mkdir -p /home/hadoop/data/journaldata/jn

    mkdir -p /home/hadoop/data/pid

    mkdir -p /home/hadoop/data/tmp
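    Since every node needs the same layout, a sketch that creates all the directories on the three hosts from master, assuming the passwordless SSH from step 7 and bash as the remote shell:

      for h in master slave1 slave2; do
        ssh hadoop@$h 'mkdir -p /home/hadoop/data/{name,datanode,hdfs/edits,journaldata/jn,pid,tmp}'
      done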

    • Edit the configuration files

    hadoop-env.sh

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/local/java/jdk1.8.0_65
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.8.0
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native:${HADOOP_HOME}/lib/native/Linux-amd64-64

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol.  Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=512
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Xmx512m -Xms512m -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Xmx256m -Xms256m -Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol.  This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored.  $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=/home/hadoop/data/pid
export HADOOP_SECURE_DN_PID_DIR=/home/hadoop/data/pid

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

    core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
    <description>Default filesystem: the mycluster HA nameservice</description>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/tmp</value>
    <description>Base directory for Hadoop temporary files</description>
  </property>

  <property>
    <name>io.native.lib.available</name>
    <value>true</value>
    <description>Should native hadoop libraries, if present, be used</description>
  </property>

  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>

  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
    <description>Compression codecs available to Hadoop</description>
  </property>

</configuration>

    hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
    <description>Comma-separated list of nameservices</description>
  </property>

  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50011</value>
    <description>The datanode server address and port for data transfer</description>
  </property>

  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:50076</value>
    <description>The datanode http server address and port.</description>
  </property>

  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50021</value>
    <description>The datanode ipc server address and port.</description>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/data/name</value>
    <description>
      Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
    </description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.namenode.edits.dir</name>
    <value>file:/home/hadoop/data/hdfs/edits</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/data/datanode</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

  <property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
    <description>
      Boolean which enables backend datanode-side support for the experimental DistributedFileSystem#getFileVBlockStorageLocations API.
    </description>
  </property>

  <property>
    <name>dfs.permissions.enabled</name>
    <value>true</value>
    <description>
      If true, enable permission checking in HDFS.
    </description>
  </property>

  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>master:8030</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>slave1:8030</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>master:50082</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>slave1:50082</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8488;slave1:8488;slave2:8488/test</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/data/journaldata/jn</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.journalnode.rpc-address</name>
    <value>0.0.0.0:8488</value>
    <description> </description>
  </property>

  <property>
    <name>dfs.journalnode.http-address</name>
    <value>0.0.0.0:8483</value>
    <description></description>
  </property>

  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <!-- Fencing methods are tried in order: sshfence first, shell(/bin/true) as the
       fallback. sshfence normally also needs dfs.ha.fencing.ssh.private-key-files. -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>

  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>10000</value>
  </property>

  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>

  <!-- dfs.datanode.max.xcievers is the deprecated name for dfs.datanode.max.transfer.threads -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>8192</value>
  </property>

  <property>
    <name>dfs.datanode.max.transfer.threads</name>
    <value>4096</value>
  </property>

  <property>
    <name>dfs.blocksize</name>
    <value>64m</value>
  </property>

  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
  </property>

  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>5368709120</value>
  </property>

  <property>
    <name>dfs.namenode.fs-limits.min-block-size</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
    <value>1048576</value>
  </property>

  <property>
    <name>dfs.datanode.balance.bandwidthPerSec</name>
    <value>3145728</value>
  </property>

  <property>
    <name>dfs.hosts.exclude</name>
    <value>/home/hadoop/app/hadoop-2.6.0-cdh5.8.0/etc/hadoop/excludes</value>
  </property>

  <property>
    <name>dfs.image.compress</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.image.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>

  <property>
    <name>dfs.image.transfer.timeout</name>
    <value>60000</value>
  </property>

  <property>
    <name>dfs.image.transfer.bandwidthPerSec</name>
    <value>4194304</value>
  </property>

  <property>
    <name>dfs.image.transfer.chunksize</name>
    <value>65536</value>
  </property>

  <property>
    <name>dfs.namenode.edits.noeditlogchannelflush</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>0</value>
  </property>
</configuration>
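    Two follow-up steps implied by the configuration above, as a sketch: the file referenced by dfs.hosts.exclude has to exist, and the finished configuration has to reach the other nodes (rsync is my choice here, not something the original specifies):

      # create the (initially empty) decommission-exclude file
      touch /home/hadoop/app/hadoop-2.6.0-cdh5.8.0/etc/hadoop/excludes

      # push the whole Hadoop directory to the other nodes
      for h in slave1 slave2; do
        rsync -a /home/hadoop/app/hadoop-2.6.0-cdh5.8.0/ hadoop@$h:/home/hadoop/app/hadoop-2.6.0-cdh5.8.0/
      done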
    • HDFS filesystem initialization

    Start ZooKeeper: zkServer.sh start

    Start the JournalNodes (on every JournalNode host)

      ./sbin/hadoop-daemon.sh start journalnode    (this and all following commands are run from the Hadoop install directory)
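    Since the JournalNode has to run on all three hosts, a sketch that starts them from master over SSH (an assumption for convenience; logging in to each node and running the command locally works just as well):

      for h in master slave1 slave2; do
        ssh hadoop@$h '/home/hadoop/app/hadoop-2.6.0-cdh5.8.0/sbin/hadoop-daemon.sh start journalnode'
      done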

    Run the initialization on the primary node (the active NameNode, master)

      ./bin/hdfs namenode -format
      ./bin/hdfs zkfc -formatZK
      ./bin/hdfs namenode        (this last command runs the NameNode in the foreground)

    Sync the metadata to the standby node

      On the standby NameNode (slave1), from the Hadoop install directory, run ./bin/hdfs namenode -bootstrapStandby to copy the primary NameNode's metadata to the standby.

    Stop Hadoop

      On master, press Ctrl+C to stop the foreground NameNode, then stop the JournalNode on every node (from the Hadoop install directory): ./sbin/hadoop-daemon.sh stop journalnode

    Start all of HDFS with a single command: start-dfs.sh (a quick verification sketch follows)
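    A quick verification sketch after start-dfs.sh, run from the Hadoop install directory on master; nn1/nn2 are the NameNode ids from hdfs-site.xml above:

      # each node should show its expected daemons (NameNode/DataNode/JournalNode/DFSZKFailoverController)
      for h in master slave1 slave2; do ssh hadoop@$h jps; done

      # check which NameNode is active and which is standby
      ./bin/hdfs haadmin -getServiceState nn1
      ./bin/hdfs haadmin -getServiceState nn2

      # basic smoke test
      ./bin/hdfs dfs -mkdir -p /tmp/smoke
      ./bin/hdfs dfs -ls /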

    For the hive-1.1.0-cdh5.8.0.tar.gz cluster installation, see this blog post: http://yanliu.org/2015/08/13/Hadoop%E9%9B%86%E7%BE%A4%E4%B9%8BHive%E5%AE%89%E8%A3%85%E9%85%8D%E7%BD%AE/
