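  • Hadoop in Action: Chaining Multiple MapReduce Jobs

    Chaining MapReduce jobs sequentially

    Form: mapreduce-1 | mapreduce-2 | mapreduce-3

    • In the run function, configure each job in turn and launch it with JobClient.runJob(), which blocks until that job has finished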
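    @Override
    public int run(String[] args) throws Exception {
    	// First job in the chain; runJob() blocks until it completes.
    	JobConf job1 = new JobConf(getConf(), getClass());
    	JobClient.runJob(job1);
    	
    	// Second job; it is submitted only after job1 has finished.
    	JobConf job2 = new JobConf(getConf(), getClass());
    	JobClient.runJob(job2);
    	return 0;
    }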
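
For the chain to behave like a pipe, the output directory of one job has to be the input directory of the next. A minimal sketch of that wiring, assuming the old `org.apache.hadoop.mapred` API used in these notes. The class name `SequentialChain`, the helper `wire`, and the `-tmp` suffix for the intermediate directory are hypothetical, and the mapper/reducer settings are left at their defaults:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SequentialChain {
    // Hypothetical helper: job1 writes to tmp, job2 reads from tmp.
    static void wire(JobConf job1, JobConf job2, Path in, Path tmp, Path out) {
        FileInputFormat.setInputPaths(job1, in);
        FileOutputFormat.setOutputPath(job1, tmp);  // job1 writes here ...
        FileInputFormat.setInputPaths(job2, tmp);   // ... and job2 reads it
        FileOutputFormat.setOutputPath(job2, out);
    }

    public static void main(String[] args) throws Exception {
        JobConf job1 = new JobConf(SequentialChain.class);
        JobConf job2 = new JobConf(SequentialChain.class);
        // Assumption: args[0] is the input path, args[1] the final output path.
        wire(job1, job2, new Path(args[0]), new Path(args[0] + "-tmp"), new Path(args[1]));
        JobClient.runJob(job1);  // blocks; throws IOException if the job fails
        JobClient.runJob(job2);
    }
}
```

Because runJob() throws when a job fails, job2 is never submitted against a half-written intermediate directory.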
    

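    Chaining MapReduce jobs with complex dependencies

    • Managed through the Job and JobControl classes
    // For Job objects x and y
    x.addDependingJob(y);	// add a dependency: x will not start until y has finished
    
    jobControl.addJob(x);	// the Job objects x and y are managed by the JobControl object
    jobControl.addJob(y);
    
    
    jobControl.allFinished();	// monitoring methods of the JobControl object
    jobControl.getFailedJobs();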
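
Put together, the calls above might be used roughly as follows. This is only a sketch, assuming the old-API classes in `org.apache.hadoop.mapred.jobcontrol`. The class `DependentJobs`, its two helper methods, the group name "chain", and the 500 ms polling interval are all made up for illustration:

```java
import java.io.IOException;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

public class DependentJobs {
    // Hypothetical helper: builds a controller in which x waits for y.
    static JobControl build(JobConf confX, JobConf confY) throws IOException {
        Job x = new Job(confX);
        Job y = new Job(confY);
        x.addDependingJob(y);  // x is held back until y has finished

        JobControl control = new JobControl("chain");  // group name is arbitrary
        control.addJob(x);
        control.addJob(y);
        return control;
    }

    // Hypothetical helper: runs the controller and reports 0 on success, 1 otherwise.
    static int runAll(JobControl control) throws InterruptedException {
        Thread runner = new Thread(control);  // JobControl implements Runnable
        runner.setDaemon(true);
        runner.start();
        while (!control.allFinished()) {
            Thread.sleep(500);  // poll until no job is waiting or running
        }
        control.stop();
        return control.getFailedJobs().isEmpty() ? 0 : 1;
    }
}
```

JobControl works out the submission order from the declared dependencies, so the order of the addJob() calls does not matter.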
    

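    Chaining preprocessing and postprocessing steps

    Form: MAP+ | REDUCE | MAP*

    • ChainMapper/ChainReducer: run the whole chain inside a single job, which cuts down on the intermediate output written between steps

    • The addMapper/setReducer interface

      • job, mapperConf: the global and the local JobConf object
      • kclass: the Mapper class
      • the input and output key/value class types
      • byValue: whether the output key and value are passed to the next step by value
        • true: pass by value
        • false: pass by reference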
    public static <K1, V1, K2, V2> void addMapper(
            JobConf job,
            Class<? extends Mapper<K1, V1, K2, V2>> kclass,
            Class<? extends K1> inputKeyClass,
            Class<? extends V1> inputValueClass,
            Class<? extends K2> outputKeyClass,
            Class<? extends V2> outputValueClass,
            boolean byValue,
            JobConf mapperConf)
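
Why byValue matters is easiest to see when an upstream step reuses one mutable object for every record, as Hadoop code commonly does with Writable objects. The following is a plain-Java simulation, not Hadoop's actual implementation: `ByValueDemo` and `run` are hypothetical names, and a StringBuilder stands in for a mutable key or value:

```java
import java.util.ArrayList;
import java.util.List;

public class ByValueDemo {
    // Simulates an upstream step that reuses one mutable object per record
    // and a downstream step that keeps whatever object it is handed.
    static List<String> run(boolean byValue) {
        StringBuilder reused = new StringBuilder();
        List<StringBuilder> received = new ArrayList<>();
        for (String record : new String[] {"a", "b", "c"}) {
            reused.setLength(0);
            reused.append(record);
            // byValue=true hands over a copy; false hands over the shared object.
            received.add(byValue ? new StringBuilder(reused) : reused);
        }
        List<String> seen = new ArrayList<>();
        for (StringBuilder sb : received) {
            seen.add(sb.toString());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run(true));   // prints [a, b, c]
        System.out.println(run(false));  // prints [c, c, c]
    }
}
```

Passing by reference avoids the copy, so it is only safe when the upstream step does not reuse an object after emitting it and the downstream step does not modify its input. When in doubt, keeping byValue set to true is the safer choice.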
    
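    Example: a MapReduce driver with preprocessing and postprocessing
    • Map1 | Map2 | Reduce | Map3 | Map4
      • ChainMapper.addMapper: adds every step before the Reduce
      • ChainReducer.addMapper: adds the steps after the Reduce
      • Settings in a local JobConf object take precedence over the global one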
        @Override
        public int run(String[] args) throws Exception {
            JobConf job = new JobConf(getConf(), getClass());
    
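            job.setJobName("ChainJob");
            job.setInputFormat(TextInputFormat.class);
            job.setOutputFormat(TextOutputFormat.class);
            // Assumes args[0] is the input path and args[1] the output path.
            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
    
            JobConf map1Conf = new JobConf(false);  // loadDefaults=false: creates an empty local configuration object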
            ChainMapper.addMapper(job, Map1.class, LongWritable.class, Text.class,
                    Text.class, Text.class, true, map1Conf);
            JobConf map2Conf = new JobConf(false);
            ChainMapper.addMapper(job, Map2.class, Text.class, Text.class,
                    LongWritable.class, Text.class, true, map2Conf);
    
            JobConf reduceConf = new JobConf(false);    
            ChainReducer.setReducer(job, ReducerClass.class, LongWritable.class, Text.class,
                    Text.class, Text.class, true, reduceConf);
    
            JobConf map3Conf = new JobConf(false);
            ChainReducer.addMapper(job, Map3.class, Text.class, Text.class,
                    LongWritable.class, Text.class, true, map3Conf);
            JobConf map4Conf = new JobConf(false);
            ChainReducer.addMapper(job, Map4.class, LongWritable.class, Text.class,
                    LongWritable.class, Text.class, true, map4Conf);
            
            JobClient.runJob(job);
            return 0;
        }
    
  • Original article: https://www.cnblogs.com/vvlj/p/14101858.html