  • 4 weekend110: source-code analysis of TextInputFormat split planning + MapReduce implementation of an inverted index + submitting multiple jobs from one main method

      

    OK, now let's walk through weekend110's source-code analysis of how TextInputFormat plans input splits.

    The default InputFormat is TextInputFormat; once you understand this one, the others follow the same pattern.

     

     

     

     

    This is today's entry point into the source-code analysis of TextInputFormat split planning (weekend110).

    [LocatedFileStatus{path=hdfs://weekend110:9000/wc/srcdata/words.log; isDirectory=false; length=90; replication=1; blocksize=134217728; modification_time=1469247371536; access_time=1469501356933; owner=hadoop; group=supergroup; permission=rw-r--r--; isSymlink=false}]

     

     

     

    [hdfs://weekend110:9000/wc/srcdata/words.log:0+90]
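The two dumps above (a 90-byte file with a 134217728-byte block size, and a single split `0+90`) can be reproduced with a small, Hadoop-free sketch of the split-planning arithmetic. This is a simplified illustration, not the actual Hadoop source: the class and method names are hypothetical, the minimum and maximum split sizes (1 and `Long.MAX_VALUE`) are assumed defaults, and non-splittable and empty files are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, Hadoop-free sketch of how a file is cut into splits (hypothetical names). */
final class SplitPlanSketch {
    // Slack factor: the final split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    /** splitSize = max(minSize, min(maxSize, blockSize)). */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Returns the planned splits, each formatted as "path:start+length". */
    static List<String> planSplits(String path, long length, long blockSize,
                                   long minSize, long maxSize) {
        List<String> splits = new ArrayList<>();
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long remaining = length;
        // Cut full-size splits while more than 1.1 x splitSize is left.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(path + ":" + (length - remaining) + "+" + splitSize);
            remaining -= splitSize;
        }
        // Whatever is left (if anything) becomes the last split.
        if (remaining != 0) {
            splits.add(path + ":" + (length - remaining) + "+" + remaining);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Values taken from the LocatedFileStatus dump: length=90, blocksize=134217728.
        System.out.println(planSplits("hdfs://weekend110:9000/wc/srcdata/words.log",
                90L, 134217728L, 1L, Long.MAX_VALUE));
    }
}
```

With these inputs the file is far smaller than one block, so the loop never runs and the whole file ends up in one split starting at offset 0 with length 90.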

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp

    Found 1 items

    drwx------   - hadoop supergroup          0 2016-07-23 12:25 /tmp/hadoop-yarn

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn

    Found 1 items

    drwx------   - hadoop supergroup          0 2016-07-23 12:26 /tmp/hadoop-yarn/staging

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging

    Found 2 items

    drwx------   - hadoop supergroup          0 2016-07-23 12:25 /tmp/hadoop-yarn/staging/hadoop

    drwxr-xr-x   - hadoop supergroup          0 2016-07-23 12:26 /tmp/hadoop-yarn/staging/history

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history

    Found 1 items

    drwxrwxrwt   - hadoop supergroup          0 2016-07-23 12:26 /tmp/hadoop-yarn/staging/history/done_intermediate

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history/done_intermediate

    Found 1 items

    drwxrwx---   - hadoop supergroup          0 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop

    Found 48 items

    -rwxrwx---   1 hadoop supergroup      32973 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001-1469247921943-hadoop-wc.jar-1469248148068-1-1-SUCCEEDED-default-1469248027901.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001.summary

    -rwxrwx---   1 hadoop supergroup      91579 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32957 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001-1469447061251-hadoop-wc.jar-1469447138744-1-1-SUCCEEDED-default-1469447093632.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001.summary

    -rwxrwx---   1 hadoop supergroup      91579 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      33003 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001-1469536528574-hadoop-flow.jar-1469536711053-1-1-SUCCEEDED-default-1469536621793.jhist

    -rwxrwx---   1 hadoop supergroup        349 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32975 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001-1469581609069-hadoop-flow.jar-1469581669098-1-1-SUCCEEDED-default-1469581639942.jhist

    -rwxrwx---   1 hadoop supergroup        349 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32966 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002-1469581980369-hadoop-flow.jar-1469582016624-1-1-SUCCEEDED-default-1469581991321.jhist

    -rwxrwx---   1 hadoop supergroup        348 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      32947 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003-1469583259497-hadoop-flow.jar-1469583283697-1-1-SUCCEEDED-default-1469583266059.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003_conf.xml

    -rwxrwx---   1 hadoop supergroup      32973 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004-1469584535785-hadoop-flow.jar-1469584574236-1-1-SUCCEEDED-default-1469584549659.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004_conf.xml

    -rwxrwx---   1 hadoop supergroup      32994 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001-1469609254627-hadoop-flowSort.jar-1469609480611-1-1-SUCCEEDED-default-1469609373636.jhist

    -rwxrwx---   1 hadoop supergroup        353 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001.summary

    -rwxrwx---   1 hadoop supergroup      91630 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32989 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002-1469609990434-hadoop-flowSort.jar-1469610090600-1-1-SUCCEEDED-default-1469610004692.jhist

    -rwxrwx---   1 hadoop supergroup        353 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002.summary

    -rwxrwx---   1 hadoop supergroup      91622 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      52581 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001-1469629441509-hadoop-flowArea.jar-1469629695512-1-0-FAILED-default-1469629461365.jhist

    -rwxrwx---   1 hadoop supergroup        352 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      30548 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002-1469629856935-hadoop-flowArea.jar-1469630543551-1-0-FAILED-default-1469630477324.jhist

    -rwxrwx---   1 hadoop supergroup        350 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      30560 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003-1469630391568-hadoop-flowArea.jar-1469631307275-1-0-FAILED-default-1469631249046.jhist

    -rwxrwx---   1 hadoop supergroup        350 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003_conf.xml

    -rwxrwx---   1 hadoop supergroup      54558 2016-07-28 09:12 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001-1469668063936-hadoop-flowArea.jar-1469668319036-1-0-FAILED-default-1469668087466.jhist

    -rwxrwx---   1 hadoop supergroup        352 2016-07-28 09:11 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-28 09:12 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      30329 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002-1469669047716-hadoop-flow.jar-1469669116225-1-0-FAILED-default-1469669070963.jhist

    -rwxrwx---   1 hadoop supergroup        346 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002.summary

    -rwxrwx---   1 hadoop supergroup      91595 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      30331 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003-1469669444122-hadoop-flow.jar-1469669914163-1-0-FAILED-default-1469669867080.jhist

    -rwxrwx---   1 hadoop supergroup        346 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003.summary

    -rwxrwx---   1 hadoop supergroup      91595 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003_conf.xml

    -rwxrwx---   1 hadoop supergroup      32950 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004-1469670210160-hadoop-flow.jar-1469670688549-1-1-SUCCEEDED-default-1469670670491.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004.summary

    -rwxrwx---   1 hadoop supergroup      91619 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004_conf.xml

    [hadoop@weekend110 ~]$

     

     file:/tmp/hadoop-Administrator/mapred/staging/Administrator1242101173/.staging/job_local1242101173_0001/job.xml

    job-id : job_local1242101173_0001uber-mode : falsemap-progress : 1.0reduce-progress : 1.0cleanup-progress : 1.0setup-progress : 1.0runstate : SUCCEEDEDstart-time : 0user-name : Administratorpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0

    Subject:

        Principal: NTUserPrincipal: Administrator

        Principal: NTSidUserPrincipal: S-1-5-21-2155837731-1039603112-1552600933-500

        Principal: NTDomainPrincipal: WIN-BQOBV63OBNM

        Principal: NTSidDomainPrincipal: S-1-5-21-2155837731-1039603112-1552600933

        Principal: NTSidPrimaryGroupPrincipal: S-1-5-21-2155837731-1039603112-1552600933-513

        Principal: NTSidGroupPrincipal: S-1-1-0

        Principal: NTSidGroupPrincipal: S-1-5-114

        Principal: NTSidGroupPrincipal: S-1-5-32-544

        Principal: NTSidGroupPrincipal: S-1-5-32-545

        Principal: NTSidGroupPrincipal: S-1-5-4

        Principal: NTSidGroupPrincipal: S-1-2-1

        Principal: NTSidGroupPrincipal: S-1-5-11

        Principal: NTSidGroupPrincipal: S-1-5-15

        Principal: NTSidGroupPrincipal: S-1-5-113

        Principal: NTSidGroupPrincipal: S-1-5-5-0-112222

        Principal: NTSidGroupPrincipal: S-1-2-0

        Principal: NTSidGroupPrincipal: S-1-5-64-10

        Principal: NTSidGroupPrincipal: S-1-16-12288

        Principal: Administrator

        Public Credential: NTNumericCredential: 2088

        Private Credential: org.apache.hadoop.security.Credentials@77084cb5

    That concludes weekend110's source-code analysis of TextInputFormat split planning.

     

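    Building the index

    Let's look at a MapReduce program that implements an inverted index.

    I see:

    Earlier, while analyzing the split-planning source code, we saw that an InputSplit carries block information, the file path, and so on.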
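Because the split knows which file it came from, the first step of the inverted index can tag every word with its file name and count the pairs. Below is a minimal, Hadoop-free sketch of that idea. In a real mapper the file name would typically be read from the input split (for example through `context.getInputSplit()` cast to `FileSplit`); here it is passed in as a plain parameter. The class name, the `-->` separator, and the sample file names are assumptions for illustration, not taken from the original program.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hadoop-free sketch of inverted-index step one (hypothetical names). */
final class InverseIndexStepOneSketch {
    // Hypothetical separator between the word and the file name in the key.
    static final String SEP = "-->";

    /**
     * Plays the role of map + reduce for one line: every word becomes the key
     * "word-->fileName" with value 1, and equal keys are summed in counts.
     */
    static void mapLine(String fileName, String line, Map<String, Long> counts) {
        for (String word : line.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank line
            }
            counts.merge(word + SEP + fileName, 1L, Long::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new TreeMap<>();
        mapLine("a.txt", "hello tom hello", counts);
        mapLine("b.txt", "hello jerry", counts);
        // One "key<TAB>count" line per (word, file) pair.
        counts.forEach((key, count) -> System.out.println(key + "\t" + count));
    }
}
```

Putting the file name into the key means the ordinary word-count reducer logic (sum the values per key) is enough for this step.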

     

     

     

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop jar ii.jar cn.itcast.hadoop.mr.ii.InverseIndexStepOne /ii/data /ii/stepone

     

    Why does this work? Because:

     

     

    Take this result and use it as the input to the next step.
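A second step can then regroup the first step's output by word. The sketch below is Hadoop-free and rests on assumptions: it presumes the first step wrote lines shaped like `word-->file<TAB>count` (tab being the default key/value separator of text output), and the class name, separator, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hadoop-free sketch of inverted-index step two (hypothetical names). */
final class InverseIndexStepTwoSketch {
    // Hypothetical separator, assumed to match the one used in step one.
    static final String SEP = "-->";

    /**
     * Map: "word-->file<TAB>count" becomes (word, "file-->count").
     * Reduce: all values for the same word are collected into one list.
     */
    static Map<String, List<String>> buildIndex(List<String> stepOneLines) {
        Map<String, List<String>> index = new TreeMap<>();
        for (String line : stepOneLines) {
            String[] keyAndCount = line.split("\t");
            String[] wordAndFile = keyAndCount[0].split(SEP);
            index.computeIfAbsent(wordAndFile[0], k -> new ArrayList<>())
                 .add(wordAndFile[1] + SEP + keyAndCount[1]);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> stepOne = Arrays.asList(
                "hello-->a.txt\t2", "hello-->b.txt\t1", "tom-->a.txt\t1");
        buildIndex(stepOne).forEach((word, postings) ->
                System.out.println(word + "\t" + String.join(" ", postings)));
    }
}
```

Each output line lists one word followed by every file it appears in and the count for that file, which is the inverted index.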

     

    That concludes weekend110's MapReduce implementation of the inverted index.

     

    Next: submitting multiple jobs from the same main method.

    In summary, this approach is not recommended. Here it is only an experiment.
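The control flow of such a main method can be sketched without Hadoop. Each stage below is a stand-in for a blocking call such as `job.waitForCompletion(true)`, which returns whether the job succeeded; the class and method names are hypothetical, and the stages only print a message.

```java
import java.util.function.BooleanSupplier;

/** Sketch of running dependent stages from one main method (hypothetical names). */
final class ChainedJobsSketch {
    /**
     * Runs the stages in order and stops at the first failure, so a later
     * stage never reads the output of a failed earlier one.
     * Returns 0 if every stage succeeded, 1 otherwise.
     */
    static int runInOrder(BooleanSupplier... stages) {
        for (BooleanSupplier stage : stages) {
            if (!stage.getAsBoolean()) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        int exitCode = runInOrder(
                () -> { System.out.println("step one finished"); return true; },
                () -> { System.out.println("step two finished"); return true; });
        System.out.println("exit code: " + exitCode);
    }
}
```

Hard-wiring the stages into one main method couples them tightly, which is one reason this style is better suited to experiments than to production workflows.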

  • Original article: https://www.cnblogs.com/zlslch/p/5901865.html