  • [Hadoop] Installing and testing Hadoop 3.2.0

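    Preface: A while ago I crashed the hadoop01 virtual machine and had no backup, so I re-cloned it from the hadoop02 virtual machine. After that, the hadoop-eclipse plugin, compiled exactly as before, stopped working. I spent three days looking for the cause and never solved it, so for now I can only test with hadoop shell commands. It makes little practical difference; it is just less convenient.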

    Feeling worn out...

    Main text:

    Extract and install Hadoop

    [hadoop@hadoop01 ~]$ cp /home/hadoop/Resources/hadoop-3.2.0.tar.gz ~/
    [hadoop@hadoop01 ~]$ tar -zxvf ~/hadoop-3.2.0.tar.gz
    [hadoop@hadoop01 ~]$ cd hadoop-3.2.0
    [hadoop@hadoop01 hadoop-3.2.0]$ ls -l
    total 184
    drwxr-xr-x. 2 hadoop hadoop    203 Jan  8  2019 bin
    drwxr-xr-x. 3 hadoop hadoop     20 Jan  8  2019 etc
    drwxr-xr-x. 2 hadoop hadoop    106 Jan  8  2019 include
    drwxr-xr-x. 3 hadoop hadoop     20 Jan  8  2019 lib
    drwxr-xr-x. 4 hadoop hadoop   4096 Jan  8  2019 libexec
    -rw-rw-r--. 1 hadoop hadoop 150569 Oct 19  2018 LICENSE.txt
    -rw-rw-r--. 1 hadoop hadoop  22125 Oct 19  2018 NOTICE.txt
    -rw-rw-r--. 1 hadoop hadoop   1361 Oct 19  2018 README.txt
    drwxr-xr-x. 3 hadoop hadoop   4096 Jan  8  2019 sbin
    drwxr-xr-x. 4 hadoop hadoop     31 Jan  8  2019 share
    


    Configure the Hadoop environment variables

    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh
    Edit the file and save:
    # The java implementation to use. By default, this environment
    # variable is REQUIRED on ALL platforms except OS X!
    # export JAVA_HOME=
    export JAVA_HOME=/usr/java/jdk1.8.0_11/
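
    A wrong JAVA_HOME is an easy mistake to make at this step, so it can help to confirm the path before saving it. The sketch below is only an illustration: check_java_home is a hypothetical helper name, not part of Hadoop, and it only tests that an executable bin/java exists under the given directory.

```shell
# Hypothetical helper (not part of Hadoop): succeed only if DIR/bin/java
# exists and is executable.
check_java_home() {
    dir=${1%/}                      # drop a trailing slash, if any
    if [ -x "$dir/bin/java" ]; then
        echo "ok: $dir"
    else
        echo "missing: $dir/bin/java" >&2
        return 1
    fi
}

# Path taken from the hadoop-env.sh line above; adjust to the local JDK.
check_java_home /usr/java/jdk1.8.0_11/ || echo "fix JAVA_HOME before starting Hadoop"
```

    The same check applies to the JAVA_HOME lines in yarn-env.sh and ~/.bash_profile, since all three point at the same JDK directory.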
    


    Configure the YARN environment variables

    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/yarn-env.sh
    Edit the file and save:
    export JAVA_HOME=/usr/java/jdk1.8.0_11/
    

     

    Configure the core component file (core-site.xml)

    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/core-site.xml
    Edit the file and save:
    <configuration>
    <!-- Default HDFS address and port (the access URI) -->
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop01:9802</value>
    </property>
    <!-- Temporary directory used by Hadoop/HDFS -->
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoopdata</value>
    </property>
    </configuration>
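
    The fs.defaultFS value is a URI: the part after the scheme names the NameNode host and its RPC port, and every node has to be able to resolve that host name. As a small sketch (the variable names are illustrative), the value can be split with plain shell parameter expansion:

```shell
# Value copied from core-site.xml above.
default_fs="hdfs://hadoop01:9802"

hostport=${default_fs#*://}     # strip the scheme    -> hadoop01:9802
nn_host=${hostport%%:*}         # text before the ':' -> hadoop01
nn_port=${hostport##*:}         # text after the ':'  -> 9802

echo "NameNode host: $nn_host"
echo "NameNode RPC port: $nn_port"
```

    On a node where Hadoop is already configured, hdfs getconf -confKey fs.defaultFS prints the value that is actually in effect.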
    


    Configure the file system (hdfs-site.xml)

    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/hdfs-site.xml
    Edit the file and save:
    <configuration>
    <!-- Address of the HDFS web UI -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop01:50070</value>
    </property>
    <!-- Number of replicas -->
    <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
    <!-- Whether HDFS permission checking is enabled; false turns it off -->
           <property>
            <name>dfs.permissions.enabled</name>
            <value>false</value>
        </property>
    <!-- Block size; the unit defaults to bytes, and the suffixes k, m, g, t, p, e may be used -->
           <property>
            <name>dfs.blocksize</name>
        <!--128m-->
            <value>134217728</value>
        </property>
        <!-- Paths of the NameNode (name) and DataNode (data) directories -->
           <property>
             <name>dfs.namenode.name.dir</name>
             <value>file:/home/hadoop/hdfs/name</value>
       </property>
       <property>
             <name>dfs.datanode.data.dir</name>
             <value>file:/home/hadoop/hdfs/data</value>
       </property>
    </configuration>
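
    The dfs.blocksize value 134217728 is simply 128 MiB written out in bytes. The sketch below checks that arithmetic and, for a hypothetical 1 GiB file (an example size chosen for illustration, not taken from this setup), works out how many blocks it would occupy and how much raw storage three replicas would use.

```shell
block_size=$((128 * 1024 * 1024))   # 128 MiB in bytes
replication=3                       # dfs.replication from hdfs-site.xml
echo "dfs.blocksize = $block_size"

# Hypothetical example: a 1 GiB file.
file_size=$((1024 * 1024 * 1024))
# Ceiling division: a partial last block still counts as one block.
blocks=$(( (file_size + block_size - 1) / block_size ))
raw_bytes=$((file_size * replication))
echo "blocks: $blocks"
echo "raw bytes stored across the cluster: $raw_bytes"
```

    A replication factor of 3 only makes sense with at least three DataNodes, which matches the three hosts listed in the workers file in this setup.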
    


    Configure the yarn-site.xml file

    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/yarn-site.xml
    Edit the file and save:
    <configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- Cluster master (the ResourceManager host) -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    
    <!-- Auxiliary service that runs on the NodeManager -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Environment variables that containers may override instead of using the NodeManager defaults -->
    <property>
            <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ</value>
    </property>
    <!-- Disable the virtual memory check; needed on virtual machines, errors occur without it -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    </configuration>
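
    Why the virtual memory check matters can be seen from the numbers in the pi job at the end of this article. When the check is enabled, the NodeManager compares a container's virtual memory against its memory allocation multiplied by yarn.nodemanager.vmem-pmem-ratio. The sketch below assumes the default ratio of 2.1 and the default 1024 MB map container, since neither is set in this article; the job counters are consistent with 1024 MB, because megabyte-milliseconds divided by vcore-milliseconds is 276683776 / 270199 = 1024.

```shell
container_mb=1024            # assumed default mapreduce.map.memory.mb
ratio_x10=21                 # assumed default vmem-pmem-ratio of 2.1, scaled by 10
peak_map_vmem=2810384384     # "Peak Map Virtual memory (bytes)" from the pi job in this article

# Integer arithmetic only: multiply by 21, then divide by 10.
vmem_limit=$((container_mb * 1024 * 1024 * ratio_x10 / 10))
echo "virtual memory limit: $vmem_limit bytes"

if [ "$peak_map_vmem" -gt "$vmem_limit" ]; then
    echo "over the limit: the container would be killed if the check were enabled"
else
    echo "within the limit"
fi
```

    Under those assumptions the map tasks exceed the limit, which would explain why the job fails on this virtual machine unless the check is turned off.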
    

     

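    Configure the MapReduce framework file (mapred-site.xml)
    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/mapred-site.xml
    Edit the file and save:

    <configuration>
    <!-- local runs jobs locally, classic is the classic MapReduce framework, yarn is the new framework -->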
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
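    <!-- If map and reduce tasks access native libraries (compression, etc.), the original value must be kept.
    When this value is empty, the command that sets up the execution environment depends on the operating system:
    Linux: LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native.
    Windows: PATH=%PATH%;%HADOOP_COMMON_HOME%\bin.
    -->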
    <property>
            <name>mapreduce.admin.user.env</name>
            <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <!--
    Environment variables for the AM (ApplicationMaster) can be set here.
    If the setting above is missing, MapReduce jobs may fail.
    -->
    <property>
            <name>yarn.app.mapreduce.am.env</name>
            <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    </configuration>
    



    [Optional] Configure the slaves file (Hadoop 2.x uses slaves)
    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/slaves
    Edit the file and save:

    hadoop02
    hadoop03
    


    [Optional] Configure the workers file (Hadoop 3.x uses workers)
    [hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/workers
    Edit the file and save:

    hadoop01
    hadoop02
    hadoop03
    


    Copy Hadoop from hadoop01 to the hadoop02 and hadoop03 nodes
    scp -r /home/hadoop/hadoop-3.2.0 hadoop@hadoop02:~/
    scp -r /home/hadoop/hadoop-3.2.0 hadoop@hadoop03:~/
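
    With more than a couple of nodes, the copy commands can be generated from the workers list instead of typed by hand. The sketch below is a dry run that only prints the commands; print_copy_commands is a hypothetical helper name, and the temporary file stands in for etc/hadoop/workers.

```shell
# Hypothetical helper: print one scp command per worker, skipping the local host.
# Usage: print_copy_commands WORKERS_FILE LOCAL_HOSTNAME
print_copy_commands() {
    while read -r host; do
        # Skip blank lines and the node we are copying from.
        [ -z "$host" ] && continue
        [ "$host" = "$2" ] && continue
        echo "scp -r /home/hadoop/hadoop-3.2.0 hadoop@$host:~/"
    done < "$1"
}

# Dry run against a throwaway copy of the workers list from this article.
workers_file=$(mktemp)
printf 'hadoop01\nhadoop02\nhadoop03\n' > "$workers_file"
print_copy_commands "$workers_file" hadoop01
rm -f "$workers_file"
```

    Piping the printed lines to sh would run them; without passwordless SSH between the nodes, each scp will prompt for a password.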

    Configure the operating system environment variables (on every node, as a regular user)
    gedit ~/.bash_profile
    Edit the file and save:

    # Newly added lines
    export JAVA_HOME=/usr/java/jdk1.8.0_11/
    export PATH=$JAVA_HOME/bin:$PATH
    # hadoop
    export HADOOP_HOME=/home/hadoop/hadoop-3.2.0
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

    Then reload the profile so the changes take effect:
    source ~/.bash_profile
    


    Create the Hadoop data directory (on all nodes)
    mkdir -p /home/hadoop/hadoopdata

    Format the file system (on the master node only)
    hdfs namenode -format

    Start and stop Hadoop
    cd ~/hadoop-3.2.0
    sbin/start-all.sh
    To shut the cluster down later:
    sbin/stop-all.sh

    Output after a successful start:

    [hadoop@hadoop01 hadoop-3.2.0]$ jps
    20848 DataNode
    21808 Jps
    21076 SecondaryNameNode
    21322 ResourceManager
    20668 NameNode
    21468 NodeManager
    [hadoop@hadoop01 hadoop-3.2.0]$ 
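
    Reading the jps list by eye gets tedious when restarting often. As a sketch, the check can be scripted; check_daemons is a hypothetical helper name, and the list of expected daemons is the one shown for hadoop01 above (worker-only nodes would normally show fewer).

```shell
# Hypothetical helper: read jps output on stdin and report missing daemons.
# It compares the second column exactly, so SecondaryNameNode does not count as NameNode.
check_daemons() {
    awk '
        { seen[$2] = 1 }
        END {
            n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
            missing = 0
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) { print "missing: " want[i]; missing = 1 }
            if (!missing) print "all daemons running"
            exit missing
        }'
}

# Sample input copied from the jps output above; on a live node use: jps | check_daemons
printf '%s\n' '20848 DataNode' '21808 Jps' '21076 SecondaryNameNode' \
    '21322 ResourceManager' '20668 NameNode' '21468 NodeManager' | check_daemons
```

    The function exits with a non-zero status when something is missing, so it can also be used in a startup script.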
    



    [Final test] Run a program on the Hadoop cluster
    Run the example Java package that estimates pi

    [hadoop@hadoop01 hadoop-3.2.0]$ cd ~/hadoop-3.2.0/share/hadoop/mapreduce
    [hadoop@hadoop01 mapreduce]$ ls
    hadoop-mapreduce-client-app-3.2.0.jar     hadoop-mapreduce-client-hs-plugins-3.2.0.jar       hadoop-mapreduce-client-shuffle-3.2.0.jar   lib
    hadoop-mapreduce-client-common-3.2.0.jar  hadoop-mapreduce-client-jobclient-3.2.0.jar        hadoop-mapreduce-client-uploader-3.2.0.jar  lib-examples
    hadoop-mapreduce-client-core-3.2.0.jar    hadoop-mapreduce-client-jobclient-3.2.0-tests.jar  hadoop-mapreduce-examples-3.2.0.jar         sources
    hadoop-mapreduce-client-hs-3.2.0.jar      hadoop-mapreduce-client-nativetask-3.2.0.jar       jdiff
    [hadoop@hadoop01 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar
    An example program must be given as the first argument.
    Valid program names are:
      aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
      aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
      bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
      dbcount: An example job that count the pageview counts from a database.
      distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
      grep: A map/reduce program that counts the matches of a regex in the input.
      join: A job that effects a join over sorted, equally partitioned datasets
      multifilewc: A job that counts words from several files.
      pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
      pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
      randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
      randomwriter: A map/reduce program that writes 10GB of random data per node.
      secondarysort: An example defining a secondary sort to the reduce.
      sort: A map/reduce program that sorts the data written by the random writer.
      sudoku: A sudoku solver.
      teragen: Generate data for the terasort
      terasort: Run the terasort
      teravalidate: Checking results of terasort
      wordcount: A map/reduce program that counts the words in the input files.
      wordmean: A map/reduce program that counts the average length of the words in the input files.
      wordmedian: A map/reduce program that counts the median length of the words in the input files.
      wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
    [hadoop@hadoop01 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar pi 10 10
    Number of Maps  = 10
    Samples per Map = 10
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Wrote input for Map #5
    Wrote input for Map #6
    Wrote input for Map #7
    Wrote input for Map #8
    Wrote input for Map #9
    Starting Job
    2019-08-27 13:47:11,866 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/192.168.1.100:8032
    2019-08-27 13:47:12,179 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1566884685380_0001
    2019-08-27 13:47:12,285 INFO input.FileInputFormat: Total input files to process : 10
    2019-08-27 13:47:12,341 INFO mapreduce.JobSubmitter: number of splits:10
    2019-08-27 13:47:12,372 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
    2019-08-27 13:47:12,479 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1566884685380_0001
    2019-08-27 13:47:12,480 INFO mapreduce.JobSubmitter: Executing with tokens: []
    2019-08-27 13:47:12,645 INFO conf.Configuration: resource-types.xml not found
    2019-08-27 13:47:12,645 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
    2019-08-27 13:47:13,018 INFO impl.YarnClientImpl: Submitted application application_1566884685380_0001
    2019-08-27 13:47:13,099 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1566884685380_0001/
    2019-08-27 13:47:13,099 INFO mapreduce.Job: Running job: job_1566884685380_0001
    2019-08-27 13:47:20,205 INFO mapreduce.Job: Job job_1566884685380_0001 running in uber mode : false
    2019-08-27 13:47:20,209 INFO mapreduce.Job:  map 0% reduce 0%
    2019-08-27 13:47:27,371 INFO mapreduce.Job:  map 20% reduce 0%
    2019-08-27 13:47:46,535 INFO mapreduce.Job:  map 20% reduce 7%
    2019-08-27 13:47:50,559 INFO mapreduce.Job:  map 40% reduce 7%
    2019-08-27 13:47:51,570 INFO mapreduce.Job:  map 50% reduce 7%
    2019-08-27 13:47:53,586 INFO mapreduce.Job:  map 60% reduce 7%
    2019-08-27 13:47:58,631 INFO mapreduce.Job:  map 60% reduce 20%
    2019-08-27 13:47:59,641 INFO mapreduce.Job:  map 80% reduce 20%
    2019-08-27 13:48:00,665 INFO mapreduce.Job:  map 100% reduce 20%
    2019-08-27 13:48:01,682 INFO mapreduce.Job:  map 100% reduce 100%
    2019-08-27 13:48:01,708 INFO mapreduce.Job: Job job_1566884685380_0001 completed successfully
    2019-08-27 13:48:01,780 INFO mapreduce.Job: Counters: 54
        File System Counters
            FILE: Number of bytes read=226
            FILE: Number of bytes written=2443397
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=2640
            HDFS: Number of bytes written=215
            HDFS: Number of read operations=45
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=3
            HDFS: Number of bytes read erasure-coded=0
        Job Counters
            Launched map tasks=10
            Launched reduce tasks=1
            Data-local map tasks=10
            Total time spent by all maps in occupied slots (ms)=270199
            Total time spent by all reduces in occupied slots (ms)=31653
            Total time spent by all map tasks (ms)=270199
            Total time spent by all reduce tasks (ms)=31653
            Total vcore-milliseconds taken by all map tasks=270199
            Total vcore-milliseconds taken by all reduce tasks=31653
            Total megabyte-milliseconds taken by all map tasks=276683776
            Total megabyte-milliseconds taken by all reduce tasks=32412672
        Map-Reduce Framework
            Map input records=10
            Map output records=20
            Map output bytes=180
            Map output materialized bytes=280
            Input split bytes=1460
            Combine input records=0
            Combine output records=0
            Reduce input groups=2
            Reduce shuffle bytes=280
            Reduce input records=20
            Reduce output records=0
            Spilled Records=40
            Shuffled Maps =10
            Failed Shuffles=0
            Merged Map outputs=10
            GC time elapsed (ms)=67681
            CPU time spent (ms)=63700
            Physical memory (bytes) snapshot=2417147904
            Virtual memory (bytes) snapshot=30882955264
            Total committed heap usage (bytes)=2966421504
            Peak Map Physical memory (bytes)=382750720
            Peak Map Virtual memory (bytes)=2810384384
            Peak Reduce Physical memory (bytes)=181923840
            Peak Reduce Virtual memory (bytes)=2815541248
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters
            Bytes Read=1180
        File Output Format Counters
            Bytes Written=97
    Job Finished in 49.977 seconds
    Estimated value of Pi is 3.20000000000000000000
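
    The result 3.2 looks far off, but it follows from the tiny sample: "pi 10 10" means 10 maps with 10 samples each, so only 100 points in total. The estimate is 4 times the fraction of points that fall inside the inscribed circle, so 3.2 corresponds to 80 of 100 points. The sketch below re-creates the idea in awk with a Halton sequence (bases 2 and 3); it is an independent illustration, not the code from the examples jar, and estimate_pi is a hypothetical name.

```shell
# Hypothetical helper: quasi-Monte Carlo estimate of pi from N Halton points.
estimate_pi() {
    awk -v n="$1" '
    # Radical inverse of index i in the given base (one Halton coordinate).
    function halton(i, base,    f, r) {
        f = 1; r = 0
        while (i > 0) {
            f = f / base
            r = r + f * (i % base)
            i = int(i / base)
        }
        return r
    }
    BEGIN {
        inside = 0
        for (i = 1; i <= n; i++) {
            # Shift the unit square so the circle of radius 0.5 is centred at the origin.
            x = halton(i, 2) - 0.5
            y = halton(i, 3) - 0.5
            if (x * x + y * y <= 0.25) inside++
        }
        printf "%.4f\n", 4 * inside / n
    }'
}

estimate_pi 100        # same sample count as "pi 10 10": a coarse estimate
estimate_pi 100000     # many more points: much closer to 3.1416
```

    On the cluster, raising the second argument (samples per map) in the hadoop jar command has the same effect of tightening the estimate.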
  • Original article: https://www.cnblogs.com/CQ-LQJ/p/11602927.html