  • 1. Deploying a single-node Hadoop test environment

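    I had read a lot of theory and still felt lost in the fog, so I set up a single-node Hadoop to get something running: the first step in teaching myself big data~~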

      1. In the open-source world I'm a tycoon who can have anything for free, so first you need a JDK. Being "rich", I use the latest Java 8; the Hadoop version is 2.6.0.

      2. Once Java is installed, set the environment variables in /etc/profile so they are handy later, then unpack hadoop-2.6.0.tar.gz.
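    As a rough sketch, the lines added to /etc/profile might look like the following. The JDK path is a hypothetical example; the Hadoop path assumes the archive was unpacked in the home directory, which matches the log paths shown later in this post.

```shell
# Hypothetical JDK location; replace with wherever your JDK actually lives.
export JAVA_HOME=/usr/java/jdk1.8.0
# Assumes hadoop-2.6.0.tar.gz was unpacked in the home directory.
export HADOOP_HOME="$HOME/hadoop-2.6.0"
# Put the java, hadoop and daemon scripts on the PATH.
export PATH="$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

    After editing, run `source /etc/profile` (or log in again) so the current shell picks up the variables.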

      3. Next, configure Hadoop. All configuration files live in etc/hadoop under the Hadoop directory:

     (1) hadoop-env.sh: in this script only the JAVA_HOME line near the top needs changing; point it at your own Java installation.

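     (2) core-site.xml, mapred-site.xml, hdfs-site.xml: I'll fill in these settings later~~~ There are plenty of examples online, but find ones that match your Hadoop version, otherwise you will run into all sorts of strange problems.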
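    The post does not show the actual values used, so the following is only a minimal sketch of a pseudo-distributed setup for Hadoop 2.x, using the property values from the official single-node guide. `CONF_DIR`, `JAVA_DIR` and the `write_site` helper are illustrative names, and the JDK path is hypothetical. It also writes a yarn-site.xml, which the post does not mention but which is needed for MapReduce jobs to run on YARN.

```shell
#!/usr/bin/env bash
# Sketch: write minimal pseudo-distributed settings for Hadoop 2.x.
# Run from the Hadoop directory. CONF_DIR and JAVA_DIR are assumptions.
set -eu
CONF_DIR="${CONF_DIR:-etc/hadoop}"
JAVA_DIR="${JAVA_DIR:-/usr/java/jdk1.8.0}"   # hypothetical JDK path
mkdir -p "$CONF_DIR"

# hadoop-env.sh: pin JAVA_HOME explicitly. Appending (instead of editing the
# line near the top) works because the later assignment wins when sourced.
echo "export JAVA_HOME=$JAVA_DIR" >> "$CONF_DIR/hadoop-env.sh"

# Helper: write a *-site.xml file from "name=value" pairs.
# Note: this overwrites any existing file of the same name.
write_site() {
  local file="$1"; shift
  {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    for kv in "$@"; do
      echo "  <property><name>${kv%%=*}</name><value>${kv#*=}</value></property>"
    done
    echo '</configuration>'
  } > "$CONF_DIR/$file"
}

write_site core-site.xml   "fs.defaultFS=hdfs://localhost:9000"
write_site hdfs-site.xml   "dfs.replication=1"
write_site mapred-site.xml "mapreduce.framework.name=yarn"
write_site yarn-site.xml   "yarn.nodemanager.aux-services=mapreduce_shuffle"
```

    With a single DataNode, `dfs.replication=1` avoids under-replication warnings; port 9000 is just the conventional choice from the guide.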

      4. With the configuration done, it's time to start things up.

      (1) Before the first start, the NameNode has to be formatted. This is only needed the first time Hadoop is started; it wipes everything in HDFS, so use it with care~~

    [qiang@localhost hadoop-2.6.0]$  bin/hadoop namenode -format
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.
    
    15/08/11 08:25:43 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = localhost/127.0.0.1
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 2.6.0
    .....
    .....
    .....
    15/08/11 08:25:46 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
    ************************************************************/

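      Formatting prints a big pile of messages. If there are no errors, the configuration so far should be fine~~~ (The DEPRECATED notice just means bin/hdfs namenode -format is the preferred form of the same command.)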
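    Because formatting wipes HDFS, a small guard could be run before formatting on later occasions. This is only a sketch: `needs_format` is a hypothetical helper, and `NAME_DIR` assumes the default storage location (`hadoop.tmp.dir` under /tmp/hadoop-<user>), which will differ if `dfs.namenode.name.dir` or `hadoop.tmp.dir` has been changed.

```shell
# Sketch: only format when the NameNode storage directory was never initialised.
# NAME_DIR assumes the default dfs.namenode.name.dir location.
NAME_DIR="${NAME_DIR:-/tmp/hadoop-$(id -un)/dfs/name}"

needs_format() {
  # A formatted NameNode directory contains current/VERSION.
  [ ! -f "$1/current/VERSION" ]
}

if needs_format "$NAME_DIR"; then
  echo "No existing metadata in $NAME_DIR; formatting is safe."
  # bin/hdfs namenode -format    # uncomment to actually format
else
  echo "Found existing metadata in $NAME_DIR; skipping format." >&2
fi
```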

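      (2) You could start everything with sbin/start-all.sh, but that is too crude: if something fails during startup, you can't tell which component is at fault, which makes troubleshooting harder. So let's start the daemons one at a time.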

    Start the NameNode:

    [qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
    starting namenode, logging to /home/qiang/hadoop-2.6.0/logs/hadoop-qiang-namenode-localhost.localdomain.out

    Start the DataNode:

    [qiang@localhost hadoop-2.6.0]$ sbin/hadoop-daemon.sh start datanode
    starting datanode, logging to /home/qiang/hadoop-2.6.0/logs/hadoop-qiang-datanode-localhost.localdomain.out

    Use the jps command to check that they are running:

    [qiang@localhost ~]$ jps
    17254 Jps
    16473 NameNode
    16698 DataNode
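    To avoid reading the jps listing by eye, the check can be scripted. The `check_daemons` function below is a hypothetical helper, not part of Hadoop; it reads jps-style output on standard input and reports which of the named daemons are present.

```shell
# Sketch: report which expected daemons appear in a jps-style listing on stdin.
# Returns non-zero if any of them is missing.
check_daemons() {
  local listing missing=0
  listing="$(cat)"
  for name in "$@"; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode".
    if printf '%s\n' "$listing" | grep -qw "$name"; then
      echo "OK       $name"
    else
      echo "MISSING  $name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (needs the JDK's jps on the PATH):
#   jps | check_daemons NameNode DataNode
```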

    You can also check in a web browser through the open port (the HDFS NameNode web UI listens on port 50070):
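    On a machine without a browser, the same port can be probed from the command line. This is a small sketch that assumes curl is installed; `probe` is an illustrative helper name.

```shell
# Sketch: report whether an HTTP endpoint answers at all.
probe() {
  local url="$1" code
  # curl prints 000 as the status code when no HTTP response was received.
  code="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")" || code="000"
  echo "$url -> HTTP $code"
  [ "$code" != "000" ]
}

# Usage, once the NameNode is running:
#   probe http://localhost:50070/
```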

    Now that it's up, let's actually use it and see whether it's for real, by uploading something to HDFS:

    [qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -mkdir /home
    [qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -mkdir /home/qiangweikang
    [qiang@localhost hadoop-2.6.0]$ bin/hadoop fs -put README.txt /home/qiangweikang
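    The upload can also be verified from the command line. The sketch below reuses the paths from the commands above; `verify_upload` and the `HADOOP` variable are illustrative names, while `fs -ls` and `fs -test -e` are standard file system shell commands.

```shell
# Sketch: check that an uploaded file really is in HDFS.
# HADOOP defaults to the launcher used in this post; run from the Hadoop directory.
HADOOP="${HADOOP:-bin/hadoop}"
TARGET=/home/qiangweikang

verify_upload() {
  local name="$1"
  # -ls lists the directory; -test -e exits 0 only if the path exists in HDFS.
  "$HADOOP" fs -ls "$TARGET" &&
    "$HADOOP" fs -test -e "$TARGET/$name" &&
    echo "$name is present in $TARGET"
}

# Usage:
#   verify_upload README.txt
```

    As a side note, `bin/hadoop fs -mkdir -p /home/qiangweikang` creates the parent directory in the same step, so the two mkdir calls can be merged.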

    Click "Browse the file system" under Utilities and you will see the file we just uploaded:

     So happy~~

    With that done, let's go on and start YARN:

    [qiang@localhost hadoop-2.6.0]$ sbin/start-yarn.sh

    The web UI now shows the legendary elephant...., and we can see one active node (the YARN ResourceManager web UI listens on port 8088 by default).
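    The active node count is also available from the ResourceManager REST API at /ws/v1/cluster/metrics. The helper below is a sketch: `active_nodes` is an illustrative name, and the simple text match assumes the response is compact JSON with no spaces around the colon.

```shell
# Sketch: extract the activeNodes value from cluster metrics JSON on stdin.
active_nodes() {
  grep -o '"activeNodes":[0-9]*' | cut -d: -f2
}

# Usage, once YARN is running (assumes curl is installed):
#   curl -s http://localhost:8088/ws/v1/cluster/metrics | active_nodes
```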

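    Next let's run a demo to see how Hadoop executes a job (bundled examples for testing ship under share). The pi calculation is a fun one: it throws darts at a circle. The first argument is the number of map tasks, and the second is the number of darts each map throws. Fancy, so pi can be computed this way too~~~ Could this be the legendary probability and statistics?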

    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 2 100
    Number of Maps  = 2
    Samples per Map = 100
    Wrote input for Map #0
    Wrote input for Map #1
    Starting Job
    15/08/11 08:54:24 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    15/08/11 08:54:25 INFO input.FileInputFormat: Total input paths to process : 2
    15/08/11 08:54:25 INFO mapreduce.JobSubmitter: number of splits:2
    15/08/11 08:54:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1439308289430_0001
    15/08/11 08:54:26 INFO impl.YarnClientImpl: Submitted application application_1439308289430_0001
    15/08/11 08:54:26 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1439308289430_0001/
    15/08/11 08:54:26 INFO mapreduce.Job: Running job: job_1439308289430_0001
    15/08/11 08:54:41 INFO mapreduce.Job: Job job_1439308289430_0001 running in uber mode : false
    15/08/11 08:54:41 INFO mapreduce.Job:  map 0% reduce 0%
    15/08/11 08:54:51 INFO mapreduce.Job:  map 50% reduce 0%
    15/08/11 08:54:52 INFO mapreduce.Job:  map 100% reduce 0%
    15/08/11 08:55:04 INFO mapreduce.Job:  map 100% reduce 100%
    15/08/11 08:55:05 INFO mapreduce.Job: Job job_1439308289430_0001 completed successfully
    15/08/11 08:55:06 INFO mapreduce.Job: Counters: 49
        File System Counters
            FILE: Number of bytes read=50
            FILE: Number of bytes written=317688
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=526
            HDFS: Number of bytes written=215
            HDFS: Number of read operations=11
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=3
        Job Counters 
            Launched map tasks=2
            Launched reduce tasks=1
            Data-local map tasks=2
            Total time spent by all maps in occupied slots (ms)=14463
            Total time spent by all reduces in occupied slots (ms)=10093
            Total time spent by all map tasks (ms)=14463
            Total time spent by all reduce tasks (ms)=10093
            Total vcore-seconds taken by all map tasks=14463
            Total vcore-seconds taken by all reduce tasks=10093
            Total megabyte-seconds taken by all map tasks=14810112
            Total megabyte-seconds taken by all reduce tasks=10335232
        Map-Reduce Framework
            Map input records=2
            Map output records=4
            Map output bytes=36
            Map output materialized bytes=56
            Input split bytes=290
            Combine input records=0
            Combine output records=0
            Reduce input groups=2
            Reduce shuffle bytes=56
            Reduce input records=4
            Reduce output records=0
            Spilled Records=8
            Shuffled Maps =2
            Failed Shuffles=0
            Merged Map outputs=2
            GC time elapsed (ms)=412
            CPU time spent (ms)=4770
            Physical memory (bytes) snapshot=680353792
            Virtual memory (bytes) snapshot=6324887552
            Total committed heap usage (bytes)=501743616
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters 
            Bytes Read=236
        File Output Format Counters 
            Bytes Written=97
    Job Finished in 42.318 seconds
    Estimated value of Pi is 3.12000000000000000000
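    The dart-throwing idea can be tried locally without Hadoop. The sketch below uses awk's pseudo-random generator; Hadoop's bundled example places its samples with a quasi-Monte Carlo sequence instead, so this illustrates the principle rather than reproducing the same algorithm. `estimate_pi` and its optional seed argument are illustrative.

```shell
# Sketch: estimate pi by throwing random darts at the unit square and counting
# how many land inside the quarter circle of radius 1.
estimate_pi() {
  awk -v n="$1" -v seed="${2:-42}" 'BEGIN {
    srand(seed)
    inside = 0
    for (i = 0; i < n; i++) {
      x = rand(); y = rand()
      if (x * x + y * y <= 1) inside++
    }
    # Area ratio: quarter circle / square = pi / 4.
    printf "%.5f\n", 4 * inside / n
  }'
}

estimate_pi 200        # same sample count as the run above: 2 maps x 100 samples
estimate_pi 1000000    # many more darts give a much tighter estimate
```

    With only 200 samples the estimate is coarse, which is why the Hadoop run above reports 3.12 rather than something closer to 3.14159.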

    Finally, remember to shut YARN down~~

    [qiang@localhost hadoop-2.6.0]$ sbin/stop-yarn.sh 
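    The HDFS daemons started earlier with hadoop-daemon.sh keep running after YARN stops. A sketch for shutting them down too, in the reverse of the start order; `HADOOP_DIR` and `stop_hdfs` are illustrative names, with the default path taken from the log lines in this post.

```shell
# Sketch: stop the HDFS daemons in the reverse of the order they were started.
HADOOP_DIR="${HADOOP_DIR:-$HOME/hadoop-2.6.0}"

stop_hdfs() {
  for daemon in datanode namenode; do
    "$HADOOP_DIR/sbin/hadoop-daemon.sh" stop "$daemon"
  done
}

# Usage:
#   stop_hdfs
```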
  • Original article: https://www.cnblogs.com/qiangweikang/p/4723196.html