
An Introduction to hadoop-mapreduce-examples, a Subproject of Hadoop 2.6.0

Introduction

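Anyone learning Hadoop surely knows how to run the examples that ship with it. Taking the well-known wordcount as an example, you would enter the following command: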

hadoop org.apache.hadoop.examples.WordCount -D mapreduce.input.fileinputformat.split.maxsize=1 /wordcount/input /wordcount/output/result1

Of course, some people use the following alternative instead:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /wordcount/input /wordcount/output/result1

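Compared with the original invocation, the jar command spares us from typing the full, cumbersome package path. We know, for instance, that the hadoop-mapreduce-examples project provides other examples as well, such as one that estimates pi. We only need to remember that application's short name, pi, to run it: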

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 5 10

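Although we are only using these ready-made examples and need not look too closely, this concise style of invocation is certainly worth borrowing. This article analyzes how it is implemented; interested readers may want to read on.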

Source code analysis

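In this section we analyze the key source code of the hadoop-mapreduce-examples project to understand how this concise invocation works. The project's pom.xml configures org.apache.hadoop.examples.ExampleDriver as the entry point for the jar command, as follows: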

   <plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-jar-plugin</artifactId>
     <configuration>
       <archive>
         <manifest>
           <mainClass>org.apache.hadoop.examples.ExampleDriver</mainClass>
         </manifest>
       </archive>
     </configuration>
   </plugin>

This means that when hadoop-mapreduce-examples-2.6.0.jar is run with the jar command, it is the main method of ExampleDriver that actually executes. ExampleDriver is implemented as follows:

public class ExampleDriver {
  
  public static void main(String argv[]){
    int exitCode = -1;
    ProgramDriver pgd = new ProgramDriver();
    try {
      pgd.addClass("wordcount", WordCount.class, 
                   "A map/reduce program that counts the words in the input files.");
      // registration code for other examples omitted
      pgd.addClass("pi", QuasiMonteCarlo.class, QuasiMonteCarlo.DESCRIPTION);
      // registration code for other examples omitted
      exitCode = pgd.run(argv);
    }
    catch(Throwable e){
      e.printStackTrace();
    }
    
    System.exit(exitCode);
  }
}

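The code above constructs a ProgramDriver instance and calls its addClass method. The three arguments are the example's name (such as wordcount or pi), the Class that implements the example, and a description of the example. ProgramDriver's addClass method is implemented as follows: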

  public void addClass(String name, Class<?> mainClass, String description)
      throws Throwable {
    programs.put(name , new ProgramDescription(mainClass, description));
  }

First, a ProgramDescription object is constructed. Its constructor is as follows:

    public ProgramDescription(Class<?> mainClass, 
                              String description)
      throws SecurityException, NoSuchMethodException {
      this.main = mainClass.getMethod("main", paramTypes);
      this.description = description;
    }

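Here main is of type java.lang.reflect.Method and holds the main method of the example's Class.
Then the example's name (such as wordcount or pi) and the ProgramDescription instance are registered in programs. programs is defined as follows: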

  /**
   * A description of a program based on its class and a 
   * human-readable description.
   */
  Map<String, ProgramDescription> programs;
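
As a rough illustration of the registration step, the sketch below keeps a map from a short name to the reflected main method. The names MiniRegistry and Hello are hypothetical and made up for this illustration. This is not Hadoop's ProgramDriver, only a simplified stand-in for the same idea, and it assumes the registered class declares a public static main(String[]).

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an example program; not part of Hadoop.
class Hello {
  public static void main(String[] args) {
    System.out.println("hello, " + String.join(" ", args));
  }
}

// Simplified sketch of name-to-main registration (an assumption, not Hadoop's class).
class MiniRegistry {
  // Maps a short program name to the reflected main method of its class.
  private final Map<String, Method> programs = new TreeMap<>();

  // Resolves "public static void main(String[])" once, at registration time,
  // so a class without a suitable main fails early with NoSuchMethodException.
  void addClass(String name, Class<?> mainClass) throws NoSuchMethodException {
    programs.put(name, mainClass.getMethod("main", String[].class));
  }

  // Returns the registered main method, or null if the name is unknown.
  Method lookup(String name) {
    return programs.get(name);
  }
}
```

Resolving the method at registration time rather than at call time means a misconfigured entry is reported when the driver starts, not when a user first asks for that example.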

At the end, ExampleDriver's main method calls ProgramDriver's run method, which is implemented as follows:

  public int run(String[] args)
    throws Throwable 
  {
    // Make sure they gave us a program name.
    if (args.length == 0) {
      System.out.println("An example program must be given as the" + 
                         " first argument.");
      printUsage(programs);
      return -1;
    }
	
    // And that it is good.
    ProgramDescription pgm = programs.get(args[0]);
    if (pgm == null) {
      System.out.println("Unknown program '" + args[0] + "' chosen.");
      printUsage(programs);
      return -1;
    }
	
    // Remove the leading argument and call main
    String[] new_args = new String[args.length - 1];
    for(int i=1; i < args.length; ++i) {
      new_args[i-1] = args[i];
    }
    pgm.invoke(new_args);
    return 0;
  }

ProgramDriver's run method proceeds as follows:

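Validate the number of arguments;
Use the first argument to look up the corresponding ProgramDescription instance in programs;
Pass the remaining arguments to ProgramDescription's invoke method, which in turn runs the corresponding example.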
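
The three steps above can be sketched in isolation as follows. MiniDispatch and its nested Program interface are hypothetical names introduced for this illustration and stand in for ProgramDriver and ProgramDescription. The sketch uses Arrays.copyOfRange where the original code copies the arguments with a manual loop.

```java
import java.util.Arrays;
import java.util.Map;

// Simplified sketch of the three dispatch steps (hypothetical class, not Hadoop's).
class MiniDispatch {
  // Hypothetical handler type standing in for an example's main method.
  interface Program {
    void run(String[] args) throws Exception;
  }

  static int run(Map<String, Program> programs, String[] args) throws Exception {
    // Step 1: a program name must be given as the first argument.
    if (args.length == 0) {
      System.out.println("An example program must be given as the first argument.");
      return -1;
    }
    // Step 2: the name must have been registered.
    Program pgm = programs.get(args[0]);
    if (pgm == null) {
      System.out.println("Unknown program '" + args[0] + "' chosen.");
      return -1;
    }
    // Step 3: drop the leading name and hand the remaining arguments to the program.
    pgm.run(Arrays.copyOfRange(args, 1, args.length));
    return 0;
  }
}
```

With a program registered under pi, calling run with the arguments pi 5 10 would hand only 5 and 10 to that program, which mirrors how the pi example receives its two numbers on the command line.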

ProgramDescription's invoke method is implemented as follows:

    public void invoke(String[] args)
      throws Throwable {
      try {
        main.invoke(null, new Object[]{args});
      } catch (InvocationTargetException except) {
        throw except.getCause();
      }
    }

From this we can see that a specific example is ultimately run by reflectively invoking the main method of that example's Class.
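
To make the reflective call and the exception unwrapping concrete, here is a minimal self-contained sketch. ReflectiveCall and Failing are hypothetical names used only for this illustration; the pattern follows the invoke method shown above but is not Hadoop's code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical example program whose main always fails; not part of Hadoop.
class Failing {
  public static void main(String[] args) {
    throw new IllegalStateException("boom: " + args.length);
  }
}

// Sketch of calling a static main reflectively (hypothetical helper class).
class ReflectiveCall {
  // Calls main(String[]) on the given class and rethrows the target's own exception.
  static void invokeMain(Class<?> cls, String[] args) throws Throwable {
    Method main = cls.getMethod("main", String[].class);
    try {
      // The receiver is null because main is static. Wrapping args in an Object[]
      // passes the whole String[] as one parameter instead of spreading it as varargs.
      main.invoke(null, new Object[] {args});
    } catch (InvocationTargetException e) {
      // Reflection wraps anything thrown by the target; unwrap it for the caller.
      throw e.getCause();
    }
  }
}
```

Because of the unwrapping, a caller of invokeMain sees the IllegalStateException thrown by Failing directly rather than an InvocationTargetException, so the example's own error reaches the driver's catch block unchanged.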

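Postscript: my book 《深入理解Spark：核心思想与源代码分析》 (Understanding Spark in Depth: Core Ideas and Source Code Analysis) has now been officially published. It is currently on sale at JD.com, Dangdang, Tmall, and other sites. Interested readers are welcome to buy it.

JD.com (currently running a "50 off orders over 150" promotion): http://item.jd.com/11846120.html

Dangdang: http://product.dangdang.com/23838168.html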


Original article: https://www.cnblogs.com/slgkaifa/p/7242868.html