  • Hadoop: Deploying a Hadoop Single Node

    I. Environment Preparation

    1. Operating system

    CentOS 7

    2. Software

    • OpenJDK
    # List the installable OpenJDK packages
    [root@server1]# yum search java | grep jdk
    ...
    # Install version 1.8.0, both the runtime (openjdk) and the development kit (openjdk-devel)
    [root@server1]# yum install -y java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64
    • SSH
    [root@server1]# yum install -y openssh-server openssh-clients
    • Hadoop

    Download a suitable Hadoop release from mirror.bit.edu.cn/apache/hadoop/common/; hadoop-2.7.3.tar.gz is used here.

    II. Configuring Hadoop

    1. Unpack hadoop-2.7.3.tar.gz
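
The original gives no command for this step. The following is a minimal sketch, assuming GNU tar and an install directory of /usr/local/hadoop (the value HADOOP_PREFIX is set to later in this article). The function name unpack_hadoop is hypothetical.

```shell
# Hypothetical helper: unpack a Hadoop release tarball into a target directory.
# $1 = path to the tarball, $2 = install directory (created if missing).
unpack_hadoop() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # --strip-components=1 drops the top-level hadoop-x.y.z/ directory,
    # so bin/, etc/ and share/ end up directly under $dest.
    tar -xzf "$tarball" -C "$dest" --strip-components=1
}

# Usage (both paths are assumptions):
#   unpack_hadoop hadoop-2.7.3.tar.gz /usr/local/hadoop
```

Stripping the top-level directory keeps the install path free of the version number, so the later environment settings do not need to change when the release does.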

    2. Set JAVA_HOME

    [root@server1 hadoop]# vim etc/hadoop/hadoop-env.sh
    # set to the root of your Java installation
      export JAVA_HOME=/usr # Note: this is the path of the java binary with the trailing /bin/java removed
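
To illustrate the rule above (JAVA_HOME is the java path minus /bin/java), here is a small sketch. The function name java_home_from_binary is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of a java
# binary by stripping the trailing /bin/java.
java_home_from_binary() {
    java_bin=$1
    case "$java_bin" in
        */bin/java) printf '%s\n' "${java_bin%/bin/java}" ;;
        *) return 1 ;;   # not a path ending in /bin/java
    esac
}

# Usage:
#   java_home_from_binary "$(command -v java)"
#   java_home_from_binary "$(readlink -f "$(command -v java)")"
```

With /usr/bin/java as input the function prints /usr, the value used in this article. The second usage line resolves symlinks first, which would instead yield the actual JDK or JRE directory behind /usr/bin/java.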

    3. Set system environment variables

    [root@server1 hadoop]# vim /etc/profile
    ...
    export HADOOP_PREFIX=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_PREFIX/bin
    ...
    [root@server1 hadoop]# source /etc/profile
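
A quick way to confirm that the PATH change took effect after sourcing /etc/profile is to check that the Hadoop bin directory really appears in PATH. This is a sketch only; path_has_dir is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if directory $1 is one of the entries in PATH.
path_has_dir() {
    # Wrap PATH in colons so the first and last entries match like any other.
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   path_has_dir "$HADOOP_PREFIX/bin" && hadoop version
```

If the check fails, the variable referenced in the PATH line of /etc/profile is probably empty or misspelled.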

    III. Testing Hadoop

    [root@server1 hadoop]# ./bin/hadoop
    Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
      CLASSNAME            run the class named CLASSNAME
     or
      where COMMAND is one of:
      fs                   run a generic filesystem user client
      version              print the version
      jar <jar>            run a jar file
                           note: please use "yarn jar" to launch
                                 YARN applications, not this command.
      checknative [-a|-h]  check native hadoop and compression libraries availability
      distcp <srcurl> <desturl> copy file or directories recursively
      archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
      classpath            prints the class path needed to get the
      credential           interact with credential providers
                           Hadoop jar and the required libraries
      daemonlog            get/set the log level for each daemon
      trace                view and modify Hadoop tracing settings

    Most commands print help when invoked w/o parameters.

    IV. Running Hadoop

    With only one server available, Hadoop runs in Standalone mode. Run a sample job:

    [root@server1 hadoop]# mkdir input
    [root@server1 hadoop]# cp etc/hadoop/*.xml input
    [root@server1 hadoop]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
    ...
    16/09/01 16:05:25 INFO mapreduce.Job: Counters: 30
            File System Counters
                    FILE: Number of bytes read=1248142
                    FILE: Number of bytes written=2318080
                    FILE: Number of read operations=0
                    FILE: Number of large read operations=0
                    FILE: Number of write operations=0
            Map-Reduce Framework
                    Map input records=1
                    Map output records=1
                    Map output bytes=17
                    Map output materialized bytes=25
                    Input split bytes=121
                    Combine input records=0
                    Combine output records=0
                    Reduce input groups=1
                    Reduce shuffle bytes=25
                    Reduce input records=1
                    Reduce output records=1
                    Spilled Records=2
                    Shuffled Maps =1
                    Failed Shuffles=0
                    Merged Map outputs=1
                    GC time elapsed (ms)=24
                    Total committed heap usage (bytes)=262553600
            Shuffle Errors
                    BAD_ID=0
                    CONNECTION=0
                    IO_ERROR=0
                    WRONG_LENGTH=0
                    WRONG_MAP=0
                    WRONG_REDUCE=0
            File Input Format Counters
                    Bytes Read=123
            File Output Format Counters
                    Bytes Written=23
    ...
    [root@server1 hadoop]# cat output/*
    1       dfsadmin
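
The example job counts every match of the regular expression in the input files and lists the matches by count. For a small input the result can be cross-checked with ordinary shell tools. The sketch below is not part of Hadoop, the name local_grep_count is hypothetical, and it assumes that matches contain no whitespace.

```shell
# Hypothetical cross-check: count regex matches across files, most frequent first.
# $1 = extended regular expression, remaining arguments = input files.
local_grep_count() {
    pattern=$1
    shift
    # -o prints each match on its own line, -h suppresses file names,
    # -E selects extended regular expressions.
    grep -ohE -- "$pattern" "$@" \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{ print $1 "\t" $2 }'   # "count<TAB>match", like the job output
}

# Usage:
#   local_grep_count 'dfs[a-z.]+' input/*.xml
```

Note that Hadoop refuses to start a job whose output directory already exists, so remove the output directory before running the example again.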

    V. Problems Encountered

    1. The java command cannot be found

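    Set export JAVA_HOME=/usr in etc/hadoop/hadoop-env.sh. Hadoop runs $JAVA_HOME/bin/java, so this variable must point to the parent directory of bin/java, not to the java binary itself.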

    2. metrics.MetricsUtil: Unable to obtain hostName

    [root@server1 hadoop]# vim /etc/hosts
    127.0.0.1    server1
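
The same fix can be applied without opening an editor. This is a sketch under the assumption that mapping the hostname to 127.0.0.1 is acceptable on the machine; ensure_hosts_entry is a hypothetical helper name.

```shell
# Hypothetical helper: append "IP<TAB>NAME" to a hosts file unless NAME is
# already mapped there. $1 = hosts file, $2 = IP address, $3 = hostname.
ensure_hosts_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field on a non-comment line, ignoring
    # anything after an inline "#".
    if awk -v n="$name" '
        !/^[[:space:]]*#/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break
                if ($i == n) found = 1
            }
        }
        END { exit !found }
    ' "$hosts_file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage (run as root):
#   ensure_hosts_entry /etc/hosts 127.0.0.1 "$(hostname)"
```

Running the function twice leaves a single entry, so it is safe to keep in a provisioning script.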
  • Original article: https://www.cnblogs.com/seastar1989/p/5830327.html