    Configuring Hadoop Locally on a Mac

    I'm taking a big data course this semester, so I set up a Hadoop environment on my own Mac. Since macOS (OS X) is Unix-based, the setup is much the same as on Linux.

    I. Download Hadoop

    Download the release tarball from the official site.

    $ ll
    total 598424
    -rwxrwxrwx@ 1 fanghao  staff   292M  3  4 23:16 hadoop-3.0.0.tar.gz
    

    Extract it:

    tar -xzvf hadoop-3.0.0.tar.gz
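
    Optionally, a quick look at the extracted directory confirms the unpack worked; the top level should contain the usual Hadoop layout:

    ls hadoop-3.0.0    # expect bin, etc, lib, libexec, sbin, share, ...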
    

    II. Set Environment Variables

    vim ~/.bash_profile
    
    export HADOOP_HOME=/Users/fanghao/someSoftware/hadoop-3.0.0
    export HADOOP_HOME_WARN_SUPPRESS=1
    export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    

    Reload the profile so the changes take effect:

    source ~/.bash_profile
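
    To confirm the variables took effect, a quick check (assuming the Hadoop bin directory is now on PATH):

    echo $HADOOP_HOME    # should print the path set above
    hadoop version       # should report Hadoop 3.0.0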
    

    III. Configure Hadoop's Parameters

    Go to hadoop-3.0.0/etc/hadoop.

    1. Configure hadoop-env.sh

    # The java implementation to use. By default, this environment
    # variable is REQUIRED on ALL platforms except OS X!
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home
    

    As the comment says, this line is not strictly required on OS X, but adding it does no harm.
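
    If the exact JDK path is unknown, macOS ships a helper that prints it; for example:

    /usr/libexec/java_home -v 1.8    # prints the home directory of the installed JDK 1.8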

    2. Configure core-site.xml

    Specify the temporary data directory, plus the NameNode's hostname and port:

    <!-- Put site-specific property overrides in this file. -->
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/Users/fanghao/someSoftware/hadoop-3.0.0/data/</value>
        </property>
    </configuration>
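
    Optionally, create the hadoop.tmp.dir directory up front so the current user owns it (the path matches the value above):

    mkdir -p /Users/fanghao/someSoftware/hadoop-3.0.0/data/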
    

    3. Configure hdfs-site.xml

    Set the HDFS replication factor; since this is a single-machine setup, it is 1:

    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
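
    As a sanity check that the file is being read, hdfs getconf can echo the value back; it only parses the config files, so nothing needs to be running yet:

    hdfs getconf -confKey dfs.replication    # should print 1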
    

    4. Configure mapred-site.xml

    Tell MapReduce to run on the YARN framework:

    <!-- Put site-specific property overrides in this file. -->
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    

    5. Configure yarn-site.xml

    <configuration>
      <!-- Site specific YARN configuration properties -->
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
      </property>
    
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>
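
    There is nothing to verify until the daemons are up; once YARN has been started (section IV below), this should report a single local NodeManager:

    yarn node -list    # after start-yarn.sh: expect one RUNNING node on localhost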
    

    IV. Start Hadoop

    First, format the NameNode:

    hdfs namenode -format    # 'hadoop namenode -format' still works but is deprecated in 3.x
    

    Then run the startup scripts in hadoop-3.0.0/sbin:

    start-dfs.sh  # starts NameNode, DataNode and SecondaryNameNode
    start-yarn.sh # starts ResourceManager and NodeManager
    

    The jps command lists the resulting JVM processes:

    6178 NodeManager
    6083 ResourceManager
    6292 Jps
    5685 DataNode
    5582 NameNode
    5822 SecondaryNameNode
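
    When you are done, the matching stop scripts (also in sbin) shut everything down:

    stop-yarn.sh    # stops ResourceManager and NodeManager
    stop-dfs.sh     # stops NameNode, DataNode and SecondaryNameNode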
    

    View HDFS

    Open http://localhost:9870/ in a browser.

    Note that in Hadoop 3.0.0 this port changed to 9870; it is no longer 50070 as in 2.x. There is an upstream issue documenting the change:

    The patch updates the HDFS default HTTP/RPC ports to non-ephemeral ports. The changes are listed below:
    Namenode ports: 50470 --> 9871, 50070 --> 9870, 8020 --> 9820
    Secondary NN ports: 50091 --> 9869, 50090 --> 9868
    Datanode ports: 50020 --> 9867, 50010 --> 9866, 50475 --> 9865, 50075 --> 9864 
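
    Besides the web UI, a quick command-line smoke test confirms that HDFS accepts writes and reads (the file and paths below are just examples):

    hdfs dfs -mkdir -p /user/$(whoami)          # create a home directory in HDFS
    echo "hello hadoop" > /tmp/hello.txt
    hdfs dfs -put /tmp/hello.txt /user/$(whoami)/
    hdfs dfs -cat /user/$(whoami)/hello.txt     # should print: hello hadoop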
    

    Possible Issues

    Hadoop controls its daemons over SSH, so enable Remote Login under System Preferences -> Sharing.
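
    The start scripts also expect passwordless SSH to localhost. If `ssh localhost` still asks for a password, a key pair set up roughly as follows usually fixes it (standard OpenSSH steps, nothing Hadoop-specific):

    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa        # generate a passphrase-less key (skip if one already exists)
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    ssh localhost                                   # should log in without prompting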
