  • Hadoop 2.2 multi-HDFS cluster installation

    http://www.superwu.cn/2014/02/12/1094/

    The basic installation follows the tutorial above, but a few problems came up along the way; they are summarized here:

    1. Installing as root works and sidesteps many permission issues, but it is best avoided, because running as root makes problems harder to notice. Add a dedicated user instead:

    If the user is new, add it directly, e.g. useradd hadoop. If the user existed before and was deleted but its group remains, use useradd -g hadoop hadoop.

    2. Set the hadoop user's password: as root, run passwd hadoop and type the new password when prompted.

    3. After that, grant the hadoop user ownership of the directories it will operate on (including the installation directory): chown -R hadoop:hadoop <directory>
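
    The three steps above, consolidated into a minimal sketch to be run as root; the directory paths are examples and should match your own layout:

    # create the hadoop user (use -g hadoop instead if the group already exists)
    useradd hadoop
    # set its password interactively
    passwd hadoop
    # hand the installation and data directories over to the hadoop user
    chown -R hadoop:hadoop /usr/local/hadoop-2.2.0
    chown -R hadoop:hadoop /usr/local/hadoop/tmp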

    4. Set up passwordless SSH:

    On the master node (as the hadoop user), run ssh-keygen -t rsa and press Enter through every prompt
    On the master node, in ~/.ssh, run cp id_rsa.pub authorized_keys
    On the master node, run ssh-copy-id -i <hostname of each other node>
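
    The same setup as a minimal end-to-end sketch, assuming the hadoop user exists on every node and the hostnames are the ones listed in step 5:

    # on the master node, as the hadoop user
    ssh-keygen -t rsa                  # accept the defaults at every prompt
    cd ~/.ssh
    cp id_rsa.pub authorized_keys
    # push the key to each of the other nodes (repeat per host)
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-kf101.jd.com
    # verify: this should log in without prompting for a password
    ssh hadoop@hadoop-kf101.jd.com hostname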

    5. The test cluster consists of the following nodes:

    cluster1:

    192.168.157.100 hadoop-kf100.jd.com
    192.168.157.101 hadoop-kf101.jd.com
    192.168.157.102 hadoop-kf102.jd.com

    cluster2:

    192.168.157.103 hadoop-kf103.jd.com
    192.168.157.104 hadoop-kf104.jd.com
    192.168.157.105 hadoop-kf105.jd.com

    The role of each machine, as implied by the configuration below:

      Node                 NameNode        DataNode  JournalNode  ZooKeeper  ZKFC
      hadoop-kf100.jd.com  yes (cluster1)  yes       yes          yes        yes
      hadoop-kf101.jd.com  yes (cluster1)  yes       yes          yes        yes
      hadoop-kf102.jd.com  no              yes       no           no         no
      hadoop-kf103.jd.com  yes (cluster2)  yes       yes          yes        yes
      hadoop-kf104.jd.com  yes (cluster2)  yes       no           no         yes
      hadoop-kf105.jd.com  no              yes       no           no         no

    6. Set the environment variables as follows:

    export JAVA_HOME=/export/servers/jdk1.6.0_25
    export JAVA_BIN=/export/servers/jdk1.6.0_25/bin
    export HADOOP_HOME=/usr/local/hadoop-2.2.0

    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
    export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native

    export ZOOKEEPER_HOME=/export/servers/zookeeper-3.4.6

    export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
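
    A quick check that the variables took effect, assuming they were appended to /etc/profile (or the hadoop user's ~/.bash_profile) and the shell was re-sourced:

    source /etc/profile
    echo $HADOOP_HOME    # should print /usr/local/hadoop-2.2.0
    java -version        # should report 1.6.0_25
    hadoop version       # should report Hadoop 2.2.0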

    7. After unpacking Hadoop into the installation directory, the main files to modify are the following:

    (1) core-site.xml:

    <configuration>
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value> <!-- name of cluster1: the default hdfs path; the nameservice is defined in hdfs-site.xml -->
    </property>
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value> <!-- common base directory where the NameNode, DataNode, JournalNode, etc. store their data -->
    </property>
    <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-kf100.jd.com:2181,hadoop-kf101.jd.com:2181,hadoop-kf103.jd.com:2181</value>
    <description>ZooKeeper quorum</description>
    </property>
    <property>
    <name>hadoop.proxyuser.hadoop.hosts</name> <!-- proxy-user permissions for Oozie clients -->
    <value>*</value>
    </property>
    <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
    </property>
    </configuration>
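
    Before any daemon starts, hadoop.tmp.dir must exist on every node and be owned by the hadoop user (see step 3); a minimal sketch:

    # run as root on every node
    mkdir -p /usr/local/hadoop/tmp
    chown -R hadoop:hadoop /usr/local/hadoop/tmp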

    (2) hadoop-env.sh:

    export JAVA_HOME=/export/servers/jdk1.6.0_25

    (3) hdfs-site.xml (cluster2):

    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>2</value> <!-- replication factor -->
    </property>
    <property>
    <name>dfs.nameservices</name>
    <value>cluster1,cluster2</value> <!-- the two HDFS clusters (nameservices) -->
    </property>
    <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>hadoop100,hadoop101</value> <!-- the NameNodes of cluster1 -->
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster1.hadoop100</name>
    <value>hadoop-kf100.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster1.hadoop100</name>
    <value>hadoop-kf100.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
    <value>hadoop-kf101.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster1.hadoop101</name>
    <value>hadoop-kf101.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-kf100.jd.com:8485;hadoop-kf101.jd.com:8485;hadoop-kf103.jd.com:8485/cluster2</value>
    <description>Shared edits directory for cluster2's two NameNodes, maintained by the three-node JournalNode quorum</description>
    </property>
    <property>
    <name>dfs.ha.automatic-failover.enabled.cluster1</name>
    <value>true</value>
    </property>
    <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
    <name>dfs.ha.namenodes.cluster2</name>
    <value>hadoop103,hadoop104</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster2.hadoop103</name>
    <value>hadoop-kf103.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster2.hadoop103</name>
    <value>hadoop-kf103.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster2.hadoop104</name>
    <value>hadoop-kf104.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster2.hadoop104</name>
    <value>hadoop-kf104.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.ha.automatic-failover.enabled.cluster2</name>
    <value>true</value>
    </property>
    <property>
    <name>dfs.client.failover.proxy.provider.cluster2</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/hadoop/tmp/journal</value>
    </property>
    <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    </property>
    <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
    </property>

    <property>
    <name>dfs.permissions</name>
    <value>false</value>
    </property>
    <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
    </property>


    </configuration>

    (4) hdfs-site.xml (cluster1). The only difference between the two clusters' files is the value of the dfs.namenode.shared.edits.dir property:

    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>2</value> <!-- replication factor -->
    </property>
    <property>
    <name>dfs.nameservices</name>
    <value>cluster1,cluster2</value> <!-- the two HDFS clusters (nameservices) -->
    </property>
    <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>hadoop100,hadoop101</value> <!-- the NameNodes of cluster1 -->
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster1.hadoop100</name>
    <value>hadoop-kf100.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster1.hadoop100</name>
    <value>hadoop-kf100.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
    <value>hadoop-kf101.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster1.hadoop101</name>
    <value>hadoop-kf101.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-kf100.jd.com:8485;hadoop-kf101.jd.com:8485;hadoop-kf103.jd.com:8485/cluster1</value>
    <description>Shared edits directory for cluster1's two NameNodes, maintained by the three-node JournalNode quorum</description>
    </property>
    <property>
    <name>dfs.ha.automatic-failover.enabled.cluster1</name>
    <value>true</value>
    </property>
    <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
    <name>dfs.ha.namenodes.cluster2</name>
    <value>hadoop103,hadoop104</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster2.hadoop103</name>
    <value>hadoop-kf103.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster2.hadoop103</name>
    <value>hadoop-kf103.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.namenode.rpc-address.cluster2.hadoop104</name>
    <value>hadoop-kf104.jd.com:9000</value>
    </property>
    <property>
    <name>dfs.namenode.http-address.cluster2.hadoop104</name>
    <value>hadoop-kf104.jd.com:50070</value>
    </property>
    <property>
    <name>dfs.ha.automatic-failover.enabled.cluster2</name>
    <value>true</value>
    </property>
    <property>
    <name>dfs.client.failover.proxy.provider.cluster2</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/hadoop/tmp/journal</value>
    </property>
    <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    </property>
    <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
    </property>

    <property>
    <name>dfs.permissions</name>
    <value>false</value>
    </property>
    <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
    </property>


    </configuration>
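
    With both nameservices declared in hdfs-site.xml on every node, a client can address either cluster by nameservice once the daemons are up; a quick check:

    # the default filesystem resolves to the local cluster (cluster1 here)
    hadoop fs -ls /
    # either cluster can also be addressed explicitly
    hadoop fs -ls hdfs://cluster1/
    hadoop fs -ls hdfs://cluster2/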

    (5) mapred-site.xml:

    <configuration>
    <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    </property>
    </configuration>

    (6) slaves:

    hadoop-kf100.jd.com
    hadoop-kf101.jd.com
    hadoop-kf102.jd.com
    hadoop-kf103.jd.com
    hadoop-kf104.jd.com
    hadoop-kf105.jd.com

    (7) yarn-site.xml:

    <configuration>
    <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop-kf100.jd.com</value>
    </property>
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>

    <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>/usr/local/hadoop-2.2.0/etc/hadoop/fair-scheduler.xml</value>
    </property>


    </configuration>
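
    The FairScheduler configured above expects the allocation file named in yarn.scheduler.fair.allocation.file to exist; a minimal sketch of such a file, with the queue definition chosen purely for illustration:

    <?xml version="1.0"?>
    <allocations>
    <!-- one example queue; the resource limits and weight are illustrative -->
    <queue name="default">
    <minResources>1024 mb,1 vcores</minResources>
    <weight>1.0</weight>
    </queue>
    </allocations>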

    At this point the configuration is complete; apart from the small difference in hdfs-site.xml, the files are identical on every node.
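
    Because of that, the configuration directory can be pushed to the remaining nodes with scp; a minimal sketch from the cluster1 master (adjust hdfs-site.xml separately for the cluster2 nodes):

    # from hadoop-kf100, as the hadoop user; repeat for each target host
    scp /usr/local/hadoop-2.2.0/etc/hadoop/*.xml \
        hadoop@hadoop-kf101.jd.com:/usr/local/hadoop-2.2.0/etc/hadoop/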

    8. Startup:

    Follow the startup steps in the link above (a typical order is sketched below), but type each command by hand rather than copy-pasting; pasted commands easily pick up stray characters and cause problems.
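
    A sketch of the usual first-boot order for a QJM-based HA setup like this one, run as the hadoop user on the hosts indicated; the exact sequence in the linked tutorial takes precedence:

    # 1. start ZooKeeper on kf100, kf101 and kf103
    zkServer.sh start
    # 2. start the JournalNodes on kf100, kf101 and kf103
    hadoop-daemon.sh start journalnode
    # 3. on one NameNode per cluster: format HDFS and the failover state in ZooKeeper, then start
    hdfs namenode -format
    hdfs zkfc -formatZK
    hadoop-daemon.sh start namenode
    # 4. on the other NameNode of the same cluster: copy the metadata over, then start
    hdfs namenode -bootstrapStandby
    hadoop-daemon.sh start namenode
    # 5. from the master: bring up the remaining HDFS daemons and YARN
    start-dfs.sh
    start-yarn.sh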

    9. A few problems that came up during installation:

    (1) Make sure each HDFS cluster has JournalNodes serving it; otherwise the NameNode fails at startup complaining it has not been formatted. The same applies to ZooKeeper.

    (2) Watch user permissions carefully throughout; running commands as the wrong user is an easy way to break things.

    (3) Each HDFS cluster supports at most two NameNodes; configuring three or more raises an error.

    (4) After installation, run the classic wordcount example and make sure it finishes without errors (see the sketch after this list).

    (5) After installation, open http://hadoop-kf100.jd.com:50070/dfshealth.jsp and http://hadoop-kf100.jd.com:8088 in a browser and confirm both pages load.

    (6) If connections to port 10020 fail, the JobHistory server is not running; start it with: sh mr-jobhistory-daemon.sh start historyserver
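
    A sketch covering points (4) and (6): run wordcount on a small sample file (words.txt is a placeholder name) and confirm the JobHistory server is listening:

    # point (4): the classic wordcount smoke test
    hadoop fs -mkdir -p /tmp/wc-in
    hadoop fs -put words.txt /tmp/wc-in      # words.txt: any small local text file
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
        wordcount /tmp/wc-in /tmp/wc-out
    hadoop fs -cat /tmp/wc-out/part-r-00000
    # point (6): confirm the JobHistory server is up on port 10020
    jps | grep JobHistoryServer
    netstat -tlnp | grep 10020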

  • Original article: https://www.cnblogs.com/zhli/p/4989750.html