  • MapReduce exception: LongWritable cannot be cast to Text

    There is a txt file whose contents look like this:

    深圳订做T恤	5729944
    深圳厂家t恤批发	5729945
    深圳定做文化衫	5729944
    文化衫厂家	5729944
    订做文化衫	5729944
    深圳t恤厂家	5729945


    Each line is a search keyword followed by the category ID it belongs to, separated by a tab. I wanted to count the entries per category, so I ran the data through the following MapReduce program:

    import java.io.IOException;
    import java.util.*;
    
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.*;
    import org.apache.hadoop.mapreduce.lib.input.*;
    import org.apache.hadoop.mapreduce.lib.output.*;
    import org.apache.hadoop.util.*;
    
    public class ClassCount extends Configured implements Tool
    {
    	public static class ClassMap 
    		extends Mapper<Text ,Text,Text,IntWritable>
    	{
    		private static final IntWritable one = new IntWritable(1);
    		private Text word = new Text();
    
    		public void map(Text key,Text value,Context context)
    			throws IOException,InterruptedException
    		{
    			String eachLine = value.toString();
    			StringTokenizer tokenizer = new StringTokenizer(eachLine, "\n");
    			while(tokenizer.hasMoreTokens())
    			{
    				StringTokenizer token = new StringTokenizer(tokenizer.nextToken(), "\t");
    				String keyword = token.nextToken(); // the keyword is not used yet
    				String classId = token.nextToken();
    				word.set(classId);
    				context.write(word,one);
    			}
    		}
    	}
    
    	public static class Reduce 
    		extends Reducer<Text,IntWritable,Text,IntWritable>
    	{
    		public void reduce(Text key,Iterable<IntWritable> values,Context context)
    			throws IOException,InterruptedException
    		{
    			int sum = 0;
    			for(IntWritable val : values)
    				sum += val.get();
    			context.write(key,new IntWritable(sum));
    		}
    	}
    	public int run(String args[]) throws Exception{
    		Job job = new Job(getConf());
    		job.setJarByClass(ClassCount.class);
    		job.setJobName("classCount");
    		
    		job.setMapperClass(ClassMap.class);
    		job.setReducerClass(Reduce.class);
    		
    		job.setInputFormatClass(TextInputFormat.class);
    		job.setOutputFormatClass(TextOutputFormat.class);
    
    		FileInputFormat.setInputPaths(job,new Path(args[0]));
    		FileOutputFormat.setOutputPath(job,new Path(args[1]));
    
    		boolean success = job.waitForCompletion(true);
    		return success ? 0 : 1;
    	}

    	public static void main(String[] args) throws Exception {
    		int ret = ToolRunner.run(new ClassCount(), args);
    		System.exit(ret);
    	}
    }


    Running it threw the following exception:

    java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
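    The mapper compiles cleanly and only fails at runtime because Java generics are erased: the framework calls map through a bridge method that casts the key to the declared type. A minimal non-Hadoop sketch of the same failure mode (the stub classes and `KeyHandler` interface below are illustrative, not Hadoop's):

```java
public class CastDemo {
    interface KeyHandler<K> {
        void handle(K key);
    }

    // Stand-ins for the two Writable types; for illustration only.
    static class LongWritableStub { }
    static class TextStub { }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static String attempt() {
        KeyHandler<TextStub> handler = new KeyHandler<TextStub>() {
            @Override
            public void handle(TextStub key) {
                // mapper body; never reached when the key type is wrong
            }
        };
        KeyHandler raw = handler; // the framework only sees the raw (erased) type
        try {
            // The framework passes a "byte offset" key; the compiler-generated
            // bridge method casts it to TextStub and blows up here.
            raw.handle(new LongWritableStub());
            return "no exception";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt()); // prints "ClassCastException"
    }
}
```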


    I had assumed that since the input is text, the key should be Text, but that is not how it works: with TextInputFormat, the map method receives each line's byte offset as the key, so the key type has to be LongWritable.
    After changing that, however, the following exception was reported:

    14/04/25 17:21:15 INFO mapred.JobClient: Task Id : attempt_201404211802_0040_m_000000_1, Status : FAILED
    java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.IntWritable
    

    This one is more self-explanatory: the run method needs the following two lines to declare the map output types explicitly (otherwise they default to the job's final output types).

    	job.setMapOutputKeyClass(Text.class);
    	job.setMapOutputValueClass(IntWritable.class);
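
    For reference, a sketch of the mapper with both fixes applied. It uses String.split instead of the nested tokenizers, since TextInputFormat already delivers one line per map call, so the "\n" tokenizer is unnecessary; this is a simplification, not the author's original code:

```java
// Input key from TextInputFormat is the line's byte offset,
// so it must be LongWritable, not Text.
public static class ClassMap
		extends Mapper<LongWritable, Text, Text, IntWritable> {

	private static final IntWritable one = new IntWritable(1);
	private final Text word = new Text();

	@Override
	public void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {
		String[] fields = value.toString().split("\t"); // keyword \t classId
		if (fields.length == 2) {
			word.set(fields[1]);
			context.write(word, one);
		}
	}
}
```

    Together with the two setMapOutputKeyClass/setMapOutputValueClass lines above, this resolves both exceptions.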
    


    
        
            

    Copyright notice: this is an original blog post; please do not repost without permission.

  • Original post: https://www.cnblogs.com/hrhguanli/p/4648740.html