  • 8. Setting up a Hadoop cluster on Linux

    References:
    https://www.cnblogs.com/yanshw/p/11535633.html
    https://www.cnblogs.com/frankdeng/p/9047698.html

    Reference for cluster time synchronization:
    https://www.cnblogs.com/frankdeng/p/9005691.html

    Extract the archive, set the environment variables, and reload the profile so that they take effect:
    export HADOOP_HOME=/home/hadoop/hadoop-2.7.7
    export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
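
    The step above (unpack, set the variables, reload) could be scripted roughly as follows. This is only a sketch: append_once and add_hadoop_env are hypothetical helper names, and the profile file is passed in by the caller because the original does not say which file it edits.

    ```shell
    # Hypothetical helper: append a line to a file unless that exact line is
    # already present, so re-running the setup does not duplicate the exports.
    append_once() {
        # $1 = line to add, $2 = target file
        grep -qxF -e "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
    }

    # Add the two exports from this article to the given profile file.
    add_hadoop_env() {
        # $1 = profile file (which file to use is an assumption left to the caller)
        append_once 'export HADOOP_HOME=/home/hadoop/hadoop-2.7.7' "$1"
        append_once 'export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin' "$1"
    }
    ```

    Assuming the archive is named hadoop-2.7.7.tar.gz and the profile is ~/.bashrc (both are assumptions), the sequence would be: tar -zxf hadoop-2.7.7.tar.gz -C /home/hadoop, then add_hadoop_env ~/.bashrc, then . ~/.bashrc. Running hadoop version afterwards is a quick way to confirm that PATH is set.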

    Edit the configuration files
    Directory: /home/hadoop/hadoop-2.7.7/etc/hadoop

    1. vi hadoop-env.sh
    export JAVA_HOME=/home/hadoop/jdk1.8.0_131
    2. vi mapred-env.sh
    export JAVA_HOME=/home/hadoop/jdk1.8.0_131
    3. vi yarn-env.sh
    export JAVA_HOME=/home/hadoop/jdk1.8.0_131

    4. vi slaves
    storm-01
    storm-02
    storm-03
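
    Steps 1 to 3 make the same edit in three files, so they could be done in one loop instead of three vi sessions. A minimal sketch follows; set_java_home is a hypothetical helper. It appends the export rather than editing in place, which should work because these scripts are sourced and the last assignment wins.

    ```shell
    # Hypothetical helper: append the same JAVA_HOME export to the three env scripts.
    set_java_home() {
        # $1 = Hadoop configuration directory, $2 = JDK installation path
        for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
            printf 'export JAVA_HOME=%s\n' "$2" >> "$1/$f"
        done
    }
    ```

    With the paths used in this article the call would be: set_java_home /home/hadoop/hadoop-2.7.7/etc/hadoop /home/hadoop/jdk1.8.0_131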

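    The four XML files: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml
    5. vi core-site.xml
    `
    <configuration>
    <!-- Address of the NameNode in HDFS -->
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://storm-01:9000</value>
    </property>
    <!-- Directory where Hadoop stores the files it generates at runtime -->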
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/dataDir/hadoop</value>
    </property>
    </configuration>
    `

    6. vi hdfs-site.xml
    `
    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>2</value>
    </property>
    <!-- The secondary namenode http server address and port. -->
    <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>storm-03:50090</value>
    </property>
    </configuration>
    `

    7. mv mapred-site.xml.template mapred-site.xml
    vi mapred-site.xml
    `
    <configuration>
    <!-- Run MapReduce on YARN -->
    <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    </property>
    </configuration>
    `

    8. vi yarn-site.xml
    `
    <configuration>
    <!-- How reducers fetch their data -->
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    <!-- Hostname of the YARN ResourceManager -->
    <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>storm-02</value>
    </property>
    </configuration>
    `
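
    After editing the four XML files it can help to read the values back before copying them to the other machines. The following is a minimal sketch, not a real XML parser: get_prop is a hypothetical helper and assumes that each <name> and <value> element sits on a single line, as in the files above.

    ```shell
    # Hypothetical helper: print the value of one property from a *-site.xml file.
    get_prop() {
        # $1 = XML file, $2 = property name
        awk -v want="$2" '
            /<name>/ {
                # Remember the most recent property name.
                name = $0
                sub(/.*<name>[ \t]*/, "", name)
                sub(/[ \t]*<\/name>.*/, "", name)
            }
            /<value>/ && name == want {
                # Print the value that belongs to the wanted name, then stop.
                val = $0
                sub(/.*<value>[ \t]*/, "", val)
                sub(/[ \t]*<\/value>.*/, "", val)
                print val
                exit
            }
        ' "$1"
    }
    ```

    For the core-site.xml shown above, get_prop core-site.xml fs.defaultFS prints hdfs://storm-01:9000.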

    Distribute the configured installation to the other machines
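
    The original does not say how the files are copied. One possible approach, assuming passwordless SSH between the nodes and scp as the copy tool (both assumptions), is to loop over the slaves file. The hypothetical helper below only prints the commands unless EXECUTE=1 is set.

    ```shell
    # Hypothetical helper: print (or, with EXECUTE=1, run) one scp command per host
    # listed in the slaves file, skipping the machine the copy starts from.
    distribute() (
        slaves=$1   # path to the slaves file
        dir=$2      # directory to copy, e.g. the Hadoop installation
        self=$3     # name of the local host, which already has the files
        while IFS= read -r host || [ -n "$host" ]; do
            [ -z "$host" ] && continue
            [ "$host" = "$self" ] && continue
            cmd="scp -r $dir $host:$(dirname "$dir")/"
            if [ "${EXECUTE:-0}" = 1 ]; then
                $cmd < /dev/null
            else
                echo "$cmd"
            fi
        done < "$slaves"
    )
    ```

    For example, distribute /home/hadoop/hadoop-2.7.7/etc/hadoop/slaves /home/hadoop/hadoop-2.7.7 storm-01 prints the scp commands for storm-02 and storm-03. The JDK and the environment variables are expected at the same paths on those machines as well.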

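    If the cluster is being started for the first time, format the NameNode first:
    hdfs namenode -format
    Start HDFS:
    start-dfs.sh
    Start YARN. Note: if the NameNode and the ResourceManager are not on the same machine, do not start YARN on the NameNode; start it on the machine that hosts the ResourceManager.
    start-yarn.sh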
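
    Because the two start scripts have to run on different machines in this layout, the rule above can be written down once. A sketch, assuming passwordless SSH from the machine where it runs; start_cluster is a hypothetical helper and prints the commands unless EXECUTE=1 is set.

    ```shell
    HADOOP_SBIN=/home/hadoop/hadoop-2.7.7/sbin   # sbin directory under HADOOP_HOME

    # Hypothetical helper: start HDFS on the NameNode host and YARN on the
    # ResourceManager host, each over ssh.
    start_cluster() (
        # $1 = NameNode host, $2 = ResourceManager host
        for step in "$1 start-dfs.sh" "$2 start-yarn.sh"; do
            host=${step% *}      # text before the space
            script=${step#* }    # text after the space
            cmd="ssh $host $HADOOP_SBIN/$script"
            if [ "${EXECUTE:-0}" = 1 ]; then
                $cmd < /dev/null
            else
                echo "$cmd"
            fi
        done
    )
    ```

    With the configuration in this article the call is start_cluster storm-01 storm-02, since fs.defaultFS points at storm-01 and yarn.resourcemanager.hostname is storm-02.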

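    1) Start each service component one by one
    HDFS components: hadoop-daemon.sh start|stop namenode|datanode|secondarynamenode
    YARN components: yarn-daemon.sh start|stop resourcemanager|nodemanager
    2) Start each module separately (requires SSH to be configured); this is the usual way
    start-dfs.sh / stop-dfs.sh, start-yarn.sh / stop-yarn.sh
    3) Start everything at once (not recommended)
    start-all.sh / stop-all.sh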
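
    Whichever start method is used, the JDK tool jps shows which daemons are actually running on a node. The sketch below encodes what this article's configuration implies for each host and reports anything missing; both function names are hypothetical.

    ```shell
    # Daemons expected per host with the configuration in this article:
    # NameNode on storm-01 (fs.defaultFS), ResourceManager on storm-02,
    # SecondaryNameNode on storm-03, and a DataNode plus a NodeManager on
    # every host listed in the slaves file.
    expected_daemons() {
        case $1 in
            storm-01) echo "NameNode DataNode NodeManager" ;;
            storm-02) echo "ResourceManager DataNode NodeManager" ;;
            storm-03) echo "SecondaryNameNode DataNode NodeManager" ;;
            *) return 1 ;;
        esac
    }

    # Hypothetical helper: read jps output on stdin and print every expected
    # daemon that does not appear in it.
    missing_daemons() (
        listing=$(cat)
        for d in $(expected_daemons "$1"); do
            printf '%s\n' "$listing" | grep -qw "$d" || echo "$d"
        done
    )
    ```

    For example, ssh storm-03 /home/hadoop/jdk1.8.0_131/bin/jps | missing_daemons storm-03 prints nothing when all three daemons are up on storm-03.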

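    Accessing the Hadoop cluster remotely
    NameNode host: storm-01
    Port 50070 serves the HDFS (NameNode) web UI: http://storm-01:50070
    Port 8088 serves the YARN ResourceManager web UI, where MapReduce jobs are listed. It runs on the ResourceManager host, which is storm-02 in the yarn-site.xml above: http://storm-02:8088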
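
    A quick way to check both web UIs from another machine is to request them with curl. This is a sketch under the assumption that curl is installed and the host names resolve from that machine; note that the ResourceManager UI is served by the host named in yarn.resourcemanager.hostname, which is storm-02 in the yarn-site.xml above. Both function names are hypothetical.

    ```shell
    # Print the two web UI addresses for a NameNode host and a ResourceManager host.
    web_uis() {
        echo "http://$1:50070"   # NameNode web UI (HDFS)
        echo "http://$2:8088"    # ResourceManager web UI (YARN)
    }

    # Hypothetical helper: print the HTTP status code a URL answers with.
    # curl prints 000 when no connection could be made.
    http_status() {
        curl -s -o /dev/null -m 5 -w '%{http_code}' "$1"
    }
    ```

    For example: for u in $(web_uis storm-01 storm-02); do echo "$u $(http_status "$u")"; done. Any status other than 000 means the port is answering.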

    Test 1:
    hadoop fs -mkdir -p /test/hankang/20200609
    hadoop fs -put test.txt /test/hankang/20200609
    hadoop jar /home/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /test/hankang/20200609 /test/hankang/output/20200609

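    Test 2 (wordcount fails if the output directory already exists, so remove /test/hankang/output/20200609 first if Test 1 has been run):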
    hadoop fs -mkdir -p /test/hankang/20200609
    hadoop fs -put /home/hadoop/hadoop-2.7.7/README.txt /test/hankang/20200609
    hadoop jar /home/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /test/hankang/20200609/README.txt /test/hankang/output/20200609
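
    Both tests write to the same output path, and a MapReduce job refuses to start when its output directory already exists. A small wrapper can remove the old output, run the job, and print the result. This is a sketch: rerun_wordcount is a hypothetical helper, and HADOOP_CMD is an illustrative variable that makes a dry run possible.

    ```shell
    HADOOP_CMD=${HADOOP_CMD:-hadoop}   # set HADOOP_CMD=echo for a dry run
    EXAMPLES_JAR=/home/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar

    # Hypothetical helper: delete old output, run wordcount, show the result.
    rerun_wordcount() {
        # $1 = HDFS input path, $2 = HDFS output path
        "$HADOOP_CMD" fs -rm -r -f "$2" &&
        "$HADOOP_CMD" jar "$EXAMPLES_JAR" wordcount "$1" "$2" &&
        "$HADOOP_CMD" fs -cat "$2/part-r-*"
    }
    ```

    For example: rerun_wordcount /test/hankang/20200609/README.txt /test/hankang/output/20200609. The glob is quoted so that HDFS, not the local shell, expands it.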

    ============================================ Divider ===========================================================================
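    Reformatting HDFS
    References:
    https://www.jianshu.com/p/a4f4f57ad3d8
    https://www.cnblogs.com/neo98/articles/6305999.html

    Every time the NameNode on the master node is formatted, the VERSION file under dfs/name/current receives a new clusterID and namespaceID. If dfs/name/current still exists on a worker node, formatting does not recreate that directory, so the worker's clusterID and namespaceID no longer match those of the master (the NameNode), and Hadoop eventually fails to start.
    The same applies to the data directory (dfs/data).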
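
    The mismatch described above can be checked directly, because the VERSION file is a plain key=value file with a clusterID line. A minimal sketch with two hypothetical helpers:

    ```shell
    # Hypothetical helper: print the clusterID recorded in a VERSION file.
    cluster_id() {
        sed -n 's/^clusterID=//p' "$1"
    }

    # Succeeds only when two VERSION files carry the same clusterID.
    same_cluster() {
        [ "$(cluster_id "$1")" = "$(cluster_id "$2")" ]
    }
    ```

    With hadoop.tmp.dir set to /home/hadoop/dataDir/hadoop as in this article, the files to compare would be /home/hadoop/dataDir/hadoop/dfs/name/current/VERSION on the NameNode and /home/hadoop/dataDir/hadoop/dfs/data/current/VERSION on each DataNode.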

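    Default values of the relevant properties (this setup overrides hadoop.tmp.dir with /home/hadoop/dataDir/hadoop in core-site.xml):
    hadoop.tmp.dir /tmp/hadoop-${user.name}
    dfs.namenode.name.dir file://${hadoop.tmp.dir}/dfs/name
    dfs.datanode.data.dir file://${hadoop.tmp.dir}/dfs/data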


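    Clear the data directory on every node, so that no stale VERSION file survives:
    rm -rf /home/hadoop/dataDir/hadoop/*
    Adjust the configuration if needed
    Then run on the NameNode node:
    hdfs namenode -format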
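
    Since the procedure above deletes data, it may be safer to print the full plan for review before running anything. The sketch below only prints commands and never executes them. Stopping YARN and HDFS first is an added precaution that the original does not mention, and reformat_plan is a hypothetical helper.

    ```shell
    HADOOP_HOME_DIR=/home/hadoop/hadoop-2.7.7
    DATA_DIR=/home/hadoop/dataDir/hadoop   # hadoop.tmp.dir from core-site.xml

    # Hypothetical helper: print the reformat steps as ssh commands for review.
    reformat_plan() (
        # $1 = slaves file, $2 = NameNode host, $3 = ResourceManager host
        echo "ssh $3 $HADOOP_HOME_DIR/sbin/stop-yarn.sh"
        echo "ssh $2 $HADOOP_HOME_DIR/sbin/stop-dfs.sh"
        while IFS= read -r host || [ -n "$host" ]; do
            [ -z "$host" ] && continue
            # Single quotes keep the * from being expanded by the local shell.
            echo "ssh $host 'rm -rf $DATA_DIR/*'"
        done < "$1"
        echo "ssh $2 $HADOOP_HOME_DIR/bin/hdfs namenode -format"
    )
    ```

    For this article's cluster the call is reformat_plan /home/hadoop/hadoop-2.7.7/etc/hadoop/slaves storm-01 storm-02. It covers every node because all three hosts appear in the slaves file.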

    ============================================ Divider ===========================================================================
    Looking up Hadoop's default configuration
    References:
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html#Configuring_Environment_of_Hadoop_Daemons
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/core-default.xml

    hadoop.tmp.dir /tmp/hadoop-${user.name}
    dfs.namenode.name.dir file://${hadoop.tmp.dir}/dfs/name
    dfs.datanode.data.dir file://${hadoop.tmp.dir}/dfs/data
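
    The last two defaults are defined relative to hadoop.tmp.dir, so overriding that one property moves both directories. As a worked example, the hypothetical helper below expands the two defaults for a given hadoop.tmp.dir value.

    ```shell
    # Print the effective NameNode and DataNode directories that result from
    # the default values when hadoop.tmp.dir is set to $1.
    effective_dirs() {
        echo "dfs.namenode.name.dir file://$1/dfs/name"
        echo "dfs.datanode.data.dir file://$1/dfs/data"
    }
    ```

    With the value from this article, effective_dirs /home/hadoop/dataDir/hadoop gives file:///home/hadoop/dataDir/hadoop/dfs/name and file:///home/hadoop/dataDir/hadoop/dfs/data. On a configured node, hdfs getconf -confKey dfs.namenode.name.dir reports the value Hadoop actually uses.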

  • Original article: https://www.cnblogs.com/tianxiu/p/13139560.html