  Hadoop experiment: finding the minimum temperature in weather data


    1. Download a small amount of data. Since this is only an experiment, just part of the 2003 weather data is downloaded.
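
    For illustration only (the download host and station file name here are assumptions, not details from the original experiment), one of the gzipped 2003 station files could be fetched with wget along these lines:

    [hadoop@Master test_data]$ wget ftp://ftp.ncdc.noaa.gov/pub/data/noaa/2003/010010-99999-2003.gz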

    2. Decompress the files and redirect the combined output into one text file with the zcat *gz > sample.txt command:

    [hadoop@Master test_data]$ zcat *gz > /home/hadoop/input/sample.txt

    3. Inspect the format of the data.
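
    For example, the first few records can be checked with head (the mapper below expects the year in characters 0-3 and the temperature in characters 14-18 of each line):

    [hadoop@Master input]$ head -n 5 /home/hadoop/input/sample.txt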

    4. Put sample.txt into the HDFS file system:

    [hadoop@Master input]$ hadoop fs -put /home/hadoop/input/sample.txt  /user/hadoop/in/sample.txt
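
    The upload can be verified with a quick listing, for example:

    [hadoop@Master input]$ hadoop fs -ls /user/hadoop/in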

    5. Mapper: MinTemperatureMapper.java


    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MinTemperatureMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        // Sentinel value used in the data for a missing temperature reading.
        private static final int MISSING = -9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {

            String line = value.toString();
            // In this sample format the year occupies the first four characters
            // and the temperature occupies characters 14-18.
            String year = line.substring(0, 4);
            int airTemperature = Integer.parseInt(line.substring(14, 19).trim());

            if (airTemperature != MISSING) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }


    6. Reducer: MinTemperatureReducer.java

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MinTemperatureReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {

            // Keep the smallest temperature seen for this year.
            int minValue = Integer.MAX_VALUE;
            for (IntWritable value : values) {
                minValue = Math.min(minValue, value.get());
            }
            context.write(key, new IntWritable(minValue));
        }
    }
    


    7. MapReduce job driver: MinTemperature.java

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MinTemperature {

        public static void main(String[] args) throws Exception {
            if (args.length != 2) {
                System.err.println("Usage: MinTemperature <input path> <output path>");
                System.exit(-1);
            }

            Job job = new Job();
            job.setJarByClass(MinTemperature.class);
            job.setJobName("Min temperature");

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.setMapperClass(MinTemperatureMapper.class);
            job.setReducerClass(MinTemperatureReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
    


    8. Compile the sources and package the classes into a jar:


    [hadoop@Master myclass]$ javac -classpath /usr/hadoop/hadoop-core-1.2.1.jar  MinTemperature*.java
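
    On newer Hadoop releases, which no longer ship a single hadoop-core jar, the compile classpath can instead be taken from the hadoop classpath command (shown only as an alternative; this experiment targets hadoop-core-1.2.1):

    [hadoop@Master myclass]$ javac -classpath "$(hadoop classpath)" MinTemperature*.java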


    [hadoop@Master myclass]$ jar cvf MinTemperature.jar MinTemperature*.class
    added manifest
    adding: MinTemperature.class(in = 1417) (out= 799)(deflated 43%)
    adding: MinTemperatureMapper.class(in = 1740) (out= 722)(deflated 58%)
    adding: MinTemperatureReducer.class(in = 1664) (out= 707)(deflated 57%)
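
    The packaged classes can be double-checked before running, for example:

    [hadoop@Master myclass]$ jar tf MinTemperature.jar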


    9. Run the job:

    [hadoop@Master myclass]$ hadoop jar /usr/hadoop/myclass/MinTemperature.jar MinTemperature  /user/hadoop/in/sample.txt  ./out2


    The run failed with an error. After a long search for the cause, it turned out that the compiled .class files had not been deleted, so the job could not find the class. Delete the .class files in the myclass directory and keep only the generated jar:

    [hadoop@Master myclass]$ rm MinTemperature*.class
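
    If the failed attempt had already created the output directory in HDFS, it would also need to be removed before re-running, because FileOutputFormat refuses to write into an existing output directory. On Hadoop 1.x that would look something like:

    [hadoop@Master myclass]$ hadoop fs -rmr ./out2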


    10. View the results.
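
    The reducer output can be read directly from HDFS. With the default output file naming, the result lands in a part-r-00000 file under ./out2 (that is, /user/hadoop/out2), so something like the following should print one line per year (here just 2003) with its minimum temperature:

    [hadoop@Master myclass]$ hadoop fs -cat ./out2/part-r-00000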

    Original article: https://www.cnblogs.com/yjbjingcha/p/7147395.html