  • [hadoop] Running WordCount in Eclipse, step by step

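    Preface: I expected to spend a little time today and come away fully understanding the WordCount example, but I overestimated myself, not least because I only took a one-semester Java elective at university and have not touched the language since.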

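    Overall, I can follow the program's logic at a high level, but understanding what each line does and why will take more time, since that requires some Java basics and knowledge of how Hadoop runs jobs.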

    First, start Hadoop:

    [hadoop@hadoop01 eclipse]$ cd ~/hadoop-3.2.0
    [hadoop@hadoop01 hadoop-3.2.0]$ sbin/start-all.sh
    WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
    WARNING: This is not a recommended production deployment configuration.
    WARNING: Use CTRL-C to abort.
    Starting namenodes on [hadoop01]
    Starting datanodes
    Starting secondary namenodes [hadoop01]
    Starting resourcemanager
    Starting nodemanagers
    [hadoop@hadoop01 hadoop-3.2.0]$ jps
    8497 NameNode
    9121 ResourceManager
    8868 SecondaryNameNode
    9268 NodeManager
    9630 Jps
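
    To confirm that the daemons came up, the jps listing can be compared against the processes one expects. The sketch below does this in plain Java. The class name JpsCheck and the expected list are assumptions made for illustration; in particular, whether a DataNode should run on this host depends on the cluster's workers file, and the listing above does not show one, so the check reports it.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: compares jps output against a list of expected daemons.
class JpsCheck {

  // Returns the expected daemon names that do not appear in the jps output.
  static Set<String> missing(String jpsOutput, List<String> expected) {
    Set<String> running = new LinkedHashSet<>();
    for (String line : jpsOutput.split("\\R")) {
      // Each jps line has the form "<pid> <main class name>".
      String[] parts = line.trim().split("\\s+");
      if (parts.length == 2) {
        running.add(parts[1]);
      }
    }
    Set<String> result = new LinkedHashSet<>(expected);
    result.removeAll(running);
    return result;
  }

  public static void main(String[] args) {
    // The jps output shown in this post.
    String jpsOutput = "8497 NameNode\n9121 ResourceManager\n"
        + "8868 SecondaryNameNode\n9268 NodeManager\n9630 Jps\n";
    // Assumed expectation for a single-node setup; adjust for your cluster.
    List<String> expected = Arrays.asList(
        "NameNode", "DataNode", "SecondaryNameNode", "ResourceManager", "NodeManager");
    System.out.println("Missing: " + missing(jpsOutput, expected));
  }
}
```

    If a daemon is reported missing on a host where it should run, its log file under the Hadoop logs directory is the first place to look.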



    Next, switch to root and launch Eclipse:

    [hadoop@hadoop01 hadoop-3.2.0]$ su root
    Password:
    [root@hadoop01 hadoop-3.2.0]# cd ..
    [root@hadoop01 hadoop]# cd eclipse
    [root@hadoop01 eclipse]# ./eclipse


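    In Eclipse, open the Terminal view via Window > Show View > Terminal.

    Click "Open a Terminal" and, in the terminal, run: gedit input.txt

    Type any text into the document and save it (the next command expects the file at /home/hadoop/input.txt).

    Then, in the terminal, run: hadoop fs -put /home/hadoop/input.txt /test/
    The /test directory must already exist in HDFS; if it does not, create it first with: hadoop fs -mkdir -p /test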

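    Finally, choose File > New > Project > MapReduce Project and name the project "Wordcount". Under the new project's src folder, choose New > Package and name the package "com.hadoop", then choose New > Class and name the class "WordCount", and paste in the code listed at the end of this post. The class name and the package line at the top of the file must match what you created here, otherwise the code will not compile.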

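    You can then choose Run As > Run on Hadoop; the job runs and produces the word counts. The program expects at least two arguments, the input path(s) followed by an output path, which can be set under Run Configurations > Arguments. The output directory must not exist yet, or the job will refuse to start.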
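
    The main method in the listing at the end of this post treats the last argument as the output path and every earlier argument as an input path. The plain-Java sketch below mirrors that rule without any Hadoop classes. WordCountArgs is a hypothetical name, the paths are example values, and the sketch throws an exception where the real program prints the usage message and exits with status 2.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors how WordCount's main method reads its arguments.
class WordCountArgs {
  final List<String> inputs;  // every argument except the last
  final String output;        // the last argument

  WordCountArgs(String[] args) {
    if (args.length < 2) {
      // The real program prints this message to stderr and calls System.exit(2).
      throw new IllegalArgumentException("Usage: wordcount <in> [<in>...] <out>");
    }
    inputs = Arrays.asList(Arrays.copyOfRange(args, 0, args.length - 1));
    output = args[args.length - 1];
  }

  public static void main(String[] args) {
    // Example paths only; /test/output is an assumed, not-yet-existing directory.
    WordCountArgs parsed =
        new WordCountArgs(new String[] {"/test/input.txt", "/test/output"});
    System.out.println("inputs = " + parsed.inputs + ", output = " + parsed.output);
  }
}
```

    With the example values this prints: inputs = [/test/input.txt], output = /test/output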

    Note: if log4j.properties is missing under the package, an error is reported, so the file has to be added there manually.
    Its content is as follows:

    # Configure logging for testing: optionally with log file
     
    #log4j.rootLogger=debug,appender
    log4j.rootLogger=info,appender
    #log4j.rootLogger=error,appender
     
    # Output to the console
    log4j.appender.appender=org.apache.log4j.ConsoleAppender
    # Use TTCCLayout as the layout
    log4j.appender.appender.layout=org.apache.log4j.TTCCLayout
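
    In this file, the value of log4j.rootLogger is a log level followed by an appender name, and that name ("appender" here, an arbitrary label) is what ties the log4j.appender.* lines to the root logger. The sketch below reads the same lines with java.util.Properties to make the relationship visible. It does not use log4j itself, and Log4jConfigSketch and its method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: shows how the keys in log4j.properties refer to each other.
class Log4jConfigSketch {

  // Parses properties text; lines starting with '#' are treated as comments.
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  // First comma-separated field of log4j.rootLogger: the log level.
  static String rootLevel(Properties props) {
    return props.getProperty("log4j.rootLogger").split(",")[0].trim();
  }

  // Second field names the appender; its class is stored under log4j.appender.<name>.
  static String appenderClass(Properties props) {
    String name = props.getProperty("log4j.rootLogger").split(",")[1].trim();
    return props.getProperty("log4j.appender." + name);
  }

  public static void main(String[] args) {
    Properties props = parse(
        "#log4j.rootLogger=debug,appender\n"
        + "log4j.rootLogger=info,appender\n"
        + "log4j.appender.appender=org.apache.log4j.ConsoleAppender\n"
        + "log4j.appender.appender.layout=org.apache.log4j.TTCCLayout\n");
    System.out.println("level = " + rootLevel(props));
    System.out.println("appender class = " + appenderClass(props));
  }
}
```

    To get more or less output, switch which of the three log4j.rootLogger lines is left uncommented.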


    Code listing:

    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements.  See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership.  The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License.  You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    // Must match the package created in Eclipse; the upstream example uses org.apache.hadoop.examples.
    package com.hadoop;
    
    import java.io.IOException;
    import java.util.StringTokenizer;
    
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;
    
    public class WordCount {
    
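      // Mapper: called once per input line; emits (word, 1) for every token in the line.
      public static class TokenizerMapper
           extends Mapper<Object, Text, Text, IntWritable>{
        
        // Reused output objects, so no new object is allocated per token.
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
          
        // key: position of the line in the file (unused); value: the text of the line.
        @Override
        public void map(Object key, Text value, Context context
                        ) throws IOException, InterruptedException {
          // Split the line on whitespace.
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }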
     
      // Reducer: receives all counts emitted for one word and writes their sum.
      public static class IntSumReducer
           extends Reducer<Text,IntWritable,Text,IntWritable> {
        private IntWritable result = new IntWritable();
    
        // key: a word; values: every count emitted for that word.
        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context
                           ) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }
    
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options (e.g. -D key=value); what remains are the paths.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
          System.err.println("Usage: wordcount <in> [<in>...] <out>");
          System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer doubles as a combiner: summing is associative and commutative,
        // so partial sums can be computed on the map side to reduce shuffle traffic.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Every argument except the last is an input path; the last is the output path.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
          FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job,
          new Path(otherArgs[otherArgs.length - 1]));
        // Submit the job, print progress, and exit with 0 on success or 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
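
    For readers who, like me, find the Hadoop classes hard to follow at first, the data flow of the listing above can be imitated in plain Java. The sketch below is not how Hadoop executes the job; it only reproduces the same arithmetic: each input split is tokenized and pre-summed locally (the role of TokenizerMapper plus the combiner), and the partial sums are then added up per word (the role of IntSumReducer). WordCountSketch, its method names, and the sample lines are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java imitation of the WordCount data flow; no Hadoop classes are used.
class WordCountSketch {

  // Map + combine for one input split: tokenize each line on whitespace and
  // keep a local count per word, as the mapper and combiner would together.
  static Map<String, Integer> combineSplit(List<String> lines) {
    Map<String, Integer> partial = new TreeMap<>();
    for (String line : lines) {
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        partial.merge(itr.nextToken(), 1, Integer::sum);
      }
    }
    return partial;
  }

  // Reduce: add up the partial counts of every split, word by word.
  // A TreeMap keeps the words sorted, similar to the sorted keys a reducer sees.
  static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
    Map<String, Integer> totals = new TreeMap<>();
    for (Map<String, Integer> partial : partials) {
      for (Map.Entry<String, Integer> e : partial.entrySet()) {
        totals.merge(e.getKey(), e.getValue(), Integer::sum);
      }
    }
    return totals;
  }

  public static void main(String[] args) {
    // Two made-up splits standing in for blocks of input.txt.
    Map<String, Integer> split1 = combineSplit(Arrays.asList("hello hadoop", "hello world"));
    Map<String, Integer> split2 = combineSplit(Arrays.asList("hadoop hello"));
    reduce(Arrays.asList(split1, split2))
        .forEach((word, count) -> System.out.println(word + "\t" + count));
  }
}
```

    Running it prints hadoop 2, hello 3 and world 1, one tab-separated pair per line. Because addition gives the same total whether or not the counts are pre-summed per split, the same class can serve as both combiner and reducer in the real job.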
  • Original article: https://www.cnblogs.com/CQ-LQJ/p/11478980.html