  • Hadoop Installation and Environment Setup (Apache version)

      This morning I helped a newcomer remotely set up a Hadoop cluster (version 1.x, or 0.22 and below), and it left a deep impression on me, so here I am writing down the simplest way to set up Apache Hadoop as a help to newcomers; I will try to be as detailed as possible. Click to view the Avatorhadoop setup steps.

    1. Environment preparation:

      1) Machine preparation: the target machines must be able to ping one another, so virtual machines on different hosts should use "bridged" networking (with host-only or NAT modes, turn off the host machine's firewall first; for the concrete network setup, search for VMware network configuration or KVM bridged networking; Xen lets you set a LAN IP manually during installation; if you still get stuck, leave a comment). Turn off each machine's firewall: /etc/init.d/iptables stop; chkconfig iptables off. Change the hostnames; hadoopservern is recommended, where n is the number you assign to the machine, because a hostname containing special characters such as '_' or '.' will cause startup problems. Edit each machine's /etc/hosts and add the IP-to-hostname mappings, as sketched below.
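
        A minimal /etc/hosts sketch for a three-node cluster (the IP addresses here are hypothetical; substitute your own):

    # /etc/hosts -- identical on every node; hypothetical IPs
    192.168.1.101   hadoopserver1
    192.168.1.102   hadoopserver2
    192.168.1.103   hadoopserver3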

      2) Download a stable Hadoop release and unpack it, then configure the Java environment (the Java variables are usually set in ~/.bash_profile rather than system-wide, with the machine's security in mind); a sketch follows.
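
        A minimal ~/.bash_profile sketch, assuming the JDK sits under /usr/java/jdk1.6.0 and Hadoop under /xxx/hadoop-version (both are hypothetical placeholders; adjust to your layout):

    # ~/.bash_profile -- hypothetical paths, adjust to your installation
    export JAVA_HOME=/usr/java/jdk1.6.0
    export HADOOP_HOME=/xxx/hadoop-version
    export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH

    # reload without logging out
    source ~/.bash_profile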

      3) Passwordless SSH. There is a small trick here: on hadoopserver1

        ssh-keygen -t rsa -P ''    (press Enter through every prompt)

        ssh-copy-id user@host;

        Then copy id_rsa and id_rsa.pub from the ~/.ssh/ directory to the other machines;

        ssh hadoopserver2 and run scp -r ~/.ssh/authorized_keys hadoopserver1:~/.ssh/; with that, passwordless login works in every direction and the machines can ssh to one another. Practice and keep learning; online Hadoop guides rarely mention that ssh-copy-id can simplify this step. The whole sequence is sketched below.
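
        Putting the trick together, a sketch of the whole sequence run on hadoopserver1, assuming the same account (written here as user) exists on hadoopserver2 and hadoopserver3:

    # generate a key pair on hadoopserver1; press Enter through every prompt
    ssh-keygen -t rsa -P ''

    # append the public key to authorized_keys on every node (including this one)
    ssh-copy-id user@hadoopserver1
    ssh-copy-id user@hadoopserver2
    ssh-copy-id user@hadoopserver3

    # copy the key pair itself so the other nodes can also initiate ssh
    scp ~/.ssh/id_rsa ~/.ssh/id_rsa.pub hadoopserver2:~/.ssh/
    scp ~/.ssh/id_rsa ~/.ssh/id_rsa.pub hadoopserver3:~/.ssh/

    # verify: this should no longer ask for a password
    ssh hadoopserver2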

    2. Steps:

      1) On hadoopserver1 (the namenode), edit the following files under conf in the unpacked Hadoop directory:

        core-site.xml:

          

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoopserver1:9000</value>
      </property>

      <property>
        <name>hadoop.tmp.dir</name>
        <value>/xxx/hadoop-version/tmp</value>
      </property>
    </configuration>

        hdfs-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>

      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>

      <property>
        <name>dfs.name.dir</name>
        <value>/xxx/hadoop-version/name</value>
      </property>

      <property>
        <name>dfs.data.dir</name>
        <value>/xxx/hadoop-version/data</value>
      </property>

      <property>
        <name>dfs.block.size</name>
        <value>670720</value>
      </property>
    <!--
    <property>
      <name>dfs.secondary.http.address</name>
      <value>0.0.0.0:60090</value>
      <description>
        The secondary namenode http server address and port.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    
    <property>
      <name>dfs.datanode.address</name>
      <value>0.0.0.0:60010</value>
      <description>
        The address where the datanode server will listen to.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    
    <property>
      <name>dfs.datanode.http.address</name>
      <value>0.0.0.0:60075</value>
      <description>
        The datanode http server address and port.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    
    <property>
      <name>dfs.datanode.ipc.address</name>
      <value>0.0.0.0:60020</value>
      <description>
        The datanode ipc server address and port.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    
    
    
    <property>
      <name>dfs.http.address</name>
      <value>0.0.0.0:60070</value>
      <description>
        The address and the base port where the dfs namenode web ui will listen on.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    -->
    
    <property>
      <name>dfs.support.append</name>
      <value>true</value>
      <description>Does HDFS allow appends to files?
                   This is currently set to false because there are bugs in the
                   "append code" and is not supported in any prodction cluster.
      </description>
    </property>
    
    </configuration>

        mapred-site.xml

          

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>hadoopserver1:9001</value>
      </property>

      <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>2</value>
      </property>

      <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>2</value>
      </property>
    <!--
    <property>    
      <name>mapred.job.tracker.http.address</name>
      <value>0.0.0.0:50030</value>
      <description>
        The job tracker http server address and port the server will listen on.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    
    <property>
      <name>mapred.task.tracker.http.address</name>
      <value>0.0.0.0:60060</value>
      <description>
        The task tracker http server address and port.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>
    -->
    
    
    </configuration>

        The masters file holds the hostname of the secondary namenode; it tells Hadoop to start the secondarynamenode on that machine;

        The slaves file lists the datanode nodes, one hostname per line; an example of both files is sketched below.
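
        A sketch of the two files, assuming (hypothetically) that the secondary namenode runs on hadoopserver2 and that hadoopserver2 and hadoopserver3 are the datanodes:

    # conf/masters -- where the secondarynamenode starts
    hadoopserver2

    # conf/slaves -- datanodes (and tasktrackers), one hostname per line
    hadoopserver2
    hadoopserver3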

      2) Edit hadoop-env.sh:

        Set JAVA_HOME to your Java installation directory.

        Add a startup option: export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true". This makes the daemons bind to IPv4 addresses; see the sketch below.
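
        In conf/hadoop-env.sh the two changes would look roughly like this (the JAVA_HOME path is a hypothetical placeholder):

    # conf/hadoop-env.sh
    export JAVA_HOME=/usr/java/jdk1.6.0                     # your Java installation directory
    export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"    # bind to IPv4 addresses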

      3) Distribute manually: scp -r <hadoop directory> hadoopserver2...n:/<same parent directory>/ (see the sketch below).
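
        A sketch of the manual distribution, run from hadoopserver1 and assuming the Hadoop directory is /xxx/hadoop-version on every node:

    # repeat (or loop) for every other node
    scp -r /xxx/hadoop-version hadoopserver2:/xxx/
    scp -r /xxx/hadoop-version hadoopserver3:/xxx/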

      4) Start:

        bin/hadoop namenode -format

        bin/start-all.sh

      5) Enter http://<hadoopserver1's IP>:50070 in a browser to view the status of the cluster.
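
        Besides the web UI, a quick sanity check is to run jps on every node and confirm the expected daemons are up (the exact set depends on the node's role):

    # on hadoopserver1 expect NameNode and JobTracker;
    # on the node listed in masters expect SecondaryNameNode;
    # on the nodes listed in slaves expect DataNode and TaskTracker
    jps
    ssh hadoopserver2 jps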
