  • Thoughts on problems encountered while programming MapReduce (MR) parallel algorithms

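    1. Variables declared in the Reducer class but outside the reduce function are instance fields of the Reducer object. On the machine running that reduce task they behave like global variables: every call to reduce in the same task can contribute to their values. In the code below, sum and count are such variables, so from the second key onward the computed average also includes values that belonged to earlier keys: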

    	public static class AvgCalReducer extends Reducer<EntityEntityWritable,FloatWritable,EntityEntityWritable,FloatWritable>
    	{
    		FloatWritable avg;
    		// Instance fields: they keep their values across reduce() calls in the same task.
    		float sum=0;
    		int count=0;
    		public void reduce(EntityEntityWritable key,Iterable<FloatWritable>values,Context context) throws IOException, InterruptedException
    		{
    
    			System.out.println("reducer starting:");
    			for (FloatWritable value:values)
    			{
    				sum=sum+value.get();
    				count++;
    				System.out.println(" key = "+key+" value = "+value.get());
    			}
    			System.out.println("average:"+sum/count);
    			System.out.println("this reducer ending...");
    			avg=new FloatWritable(sum/count);
    			context.write(key, avg);
    		}
    	}

    If sum and count should be affected only by a single reduce call, that is, if they should cover only the values belonging to one key, declare them inside the reduce function, as follows:

    	public static class AvgCalReducer extends Reducer<EntityEntityWritable,FloatWritable,EntityEntityWritable,FloatWritable>
    	{
    		FloatWritable avg;
    		
    		public void reduce(EntityEntityWritable key,Iterable<FloatWritable>values,Context context) throws IOException, InterruptedException
    		{
    			// Local variables: reset to zero for every key.
    			float sum=0;
    			int count=0;
    			System.out.println("reducer starting:");
    			for (FloatWritable value:values)
    			{
    				sum=sum+value.get();
    				count++;
    				System.out.println(" key = "+key+" value = "+value.get());
    			}
    			System.out.println("average:"+sum/count);
    			System.out.println("this reducer ending...");
    			avg=new FloatWritable(sum/count);
    			context.write(key, avg);
    		}
    	}
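    The difference between the two versions can be reproduced without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: the class names ReducerStateSketch, FieldAccumulator and LocalAccumulator and the sample values are made up for illustration, and each call to reduce(...) stands in for one reduce call on one key within the same task.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the two AvgCalReducer versions above (no Hadoop dependency).
final class ReducerStateSketch {

    // Mimics the first version: sum and count are instance fields,
    // so they carry over from one reduce() call to the next.
    static final class FieldAccumulator {
        private float sum = 0;
        private int count = 0;

        float reduce(List<Float> values) {
            for (float v : values) {
                sum += v;
                count++;
            }
            return sum / count;
        }
    }

    // Mimics the second version: sum and count are locals,
    // so every call starts again from zero.
    static final class LocalAccumulator {
        float reduce(List<Float> values) {
            float sum = 0;
            int count = 0;
            for (float v : values) {
                sum += v;
                count++;
            }
            return sum / count;
        }
    }

    public static void main(String[] args) {
        List<Float> keyA = Arrays.asList(1f, 3f);   // per-key average is 2.0
        List<Float> keyB = Arrays.asList(10f, 20f); // per-key average is 15.0

        FieldAccumulator shared = new FieldAccumulator();
        System.out.println(shared.reduce(keyA)); // 2.0
        System.out.println(shared.reduce(keyB)); // 8.5, i.e. (1+3+10+20)/4

        LocalAccumulator local = new LocalAccumulator();
        System.out.println(local.reduce(keyA));  // 2.0
        System.out.println(local.reduce(keyB));  // 15.0
    }
}
```

    With the fields shared, the second key reports 8.5 instead of 15.0, which is the behaviour described for the first reducer.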

    2. Sequentially chained MapReduce jobs, using two jobs as an example:

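    		// Job 1: the driver blocks here until the job finishes.
    		Configuration conf1=new Configuration();
    		Job job1=Job.getInstance(conf1,"Job1");
    		job1.waitForCompletion(true);
    
    		// Job 2: starts only after job1.waitForCompletion(true) has returned.
    		Configuration conf2=new Configuration();
    		Job job2=Job.getInstance(conf2,"Job2");
    		job2.waitForCompletion(true);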

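    Note that the System.exit(job.waitForCompletion(true)?0:1) idiom we usually write cannot be used here for the first job. If job1.waitForCompletion(true) is changed to System.exit(job1.waitForCompletion(true)?0:1), the program exits as soon as job1 completes, so job2 never gets a chance to run.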
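    One way to keep a meaningful exit code while still running both jobs is to collect the results and exit only once, at the end of the driver. The sketch below is plain Java with no Hadoop dependency: JobChainSketch and runInOrder are hypothetical names, and each BooleanSupplier stands in for a call such as job1.waitForCompletion(true).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical driver helper: runs the steps in order and reports one exit code.
final class JobChainSketch {

    // Returns 0 if every step succeeded, or 1 as soon as one step fails.
    // Steps after a failed one are not run.
    static int runInOrder(List<BooleanSupplier> steps) {
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Each lambda is a placeholder for one job's waitForCompletion(true).
        List<BooleanSupplier> steps = Arrays.asList(
            () -> { System.out.println("running Job1"); return true; },
            () -> { System.out.println("running Job2"); return true; });

        int exitCode = runInOrder(steps);
        // A real driver would call System.exit(exitCode) here, once, after all jobs.
        System.out.println("exit code: " + exitCode);
    }
}
```

    In a real driver, waitForCompletion throws checked exceptions, so the calls would either be made directly in main or wrapped before being passed to a helper like this.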


  • Original article: https://www.cnblogs.com/eva_sj/p/3971164.html