Accessing HBase via MapReduce

HBase extends the MapReduce API so that MapReduce jobs can conveniently read from and write to HTable data.

A simple example:

Goal: from the log table, count the total number of times each IP accessed each site directory.

package man.ludq.hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class ExampleTotalMapReduce {

    // Declaring "throws Exception" lets a failed job end the JVM with a
    // non-zero exit code instead of only printing a stack trace.
    public static void main(String[] args) throws Exception {
        Configuration config = HBaseConfiguration.create();
        // Job.getInstance replaces the constructor new Job(config, name),
        // which is deprecated in Hadoop 2.
        Job job = Job.getInstance(config, "ExampleSummary");
        job.setJarByClass(ExampleTotalMapReduce.class);     // class that contains mapper and reducer

        Scan scan = new Scan();
        scan.setCaching(500);        // rows fetched per RPC; a low default (1 in older releases) is bad for MapReduce jobs
        scan.setCacheBlocks(false);  // don't set to true for MR jobs
        // set other scan attrs
        //scan.addColumn(family, qualifier);

        TableMapReduceUtil.initTableMapperJob(
                "access-log",        // input table
                scan,                // Scan instance to control CF and attribute selection
                MyMapper.class,      // mapper class
                Text.class,          // mapper output key
                IntWritable.class,   // mapper output value
                job);
        TableMapReduceUtil.initTableReducerJob(
                "total-access",          // output table
                MyTableReducer.class,    // reducer class
                job);
        job.setNumReduceTasks(1);   // at least one, adjust as required

        if (!job.waitForCompletion(true)) {
            throw new IOException("error with job!");
        }
    }

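    public static class MyMapper extends TableMapper<Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text text = new Text();

        @Override
        public void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            // The row key is expected to look like "<ip>-<rest>"; keep only the IP part.
            String rowKey = Bytes.toString(row.get(), row.getOffset(), row.getLength());
            String ip = rowKey.split("-")[0];

            // getValue returns null when the row has no info:url cell; skip such rows.
            byte[] urlBytes = value.getValue(Bytes.toBytes("info"), Bytes.toBytes("url"));
            if (urlBytes == null) {
                return;
            }

            // Emit ("ip&url", 1) so the reducer can sum the visits per IP and URL.
            text.set(ip + "&" + Bytes.toString(urlBytes));
            context.write(text, ONE);
        }
    }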

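    public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }

            // Text.getBytes() returns the backing array, which can be longer than
            // the valid data, so build the row key from the string form instead.
            Put put = new Put(Bytes.toBytes(key.toString()));
            // The count is stored as a string. On HBase releases before 1.0,
            // use put.add(...) instead of put.addColumn(...).
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("count"), Bytes.toBytes(String.valueOf(sum)));

            // The output key is ignored by TableOutputFormat; the Put carries the row key.
            context.write(null, put);
        }
    }
}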
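
To check what the job is meant to compute without a cluster, the map and reduce steps above can be sketched in plain Java. This is only an illustration, not HBase code: the class `AccessCountSketch`, the `LogRow` type, and the sample row keys are hypothetical, and the row key layout `<ip>-<timestamp>` is an assumption based on the `split("-")` call in the mapper. Rows without a URL are skipped here, which is a choice made for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AccessCountSketch {

    /** One row of a hypothetical access-log table: row key plus the info:url value. */
    static final class LogRow {
        final String rowKey;
        final String url;

        LogRow(String rowKey, String url) {
            this.rowKey = rowKey;
            this.url = url;
        }
    }

    /** Map step: build the "ip&url" key, or return null when the URL is missing. */
    static String mapKey(LogRow row) {
        if (row.url == null) {
            return null;
        }
        String ip = row.rowKey.split("-")[0];   // assumed layout: "<ip>-<timestamp>"
        return ip + "&" + row.url;
    }

    /** Map, group by key, and reduce: add 1 for every row that shares a key. */
    static Map<String, Integer> countAccesses(List<LogRow> rows) {
        Map<String, Integer> totals = new TreeMap<>();
        for (LogRow row : rows) {
            String key = mapKey(row);
            if (key != null) {
                totals.merge(key, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<LogRow> rows = Arrays.asList(
                new LogRow("10.0.0.1-1404000000", "/index"),
                new LogRow("10.0.0.1-1404000050", "/index"),
                new LogRow("10.0.0.1-1404000090", "/docs"),
                new LogRow("10.0.0.2-1404000100", "/index"));
        // Each entry corresponds to one row the reducer would write to "total-access".
        countAccesses(rows).forEach((key, count) -> System.out.println(key + " -> " + count));
    }
}
```

Each key in the resulting map plays the role of a row key in the `total-access` table, and each value is the number the reducer would store in `info:count`.
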
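References:

1. Reading from and writing to HBase with MapReduce (reads data from table A and writes the aggregated results to table B; very detailed, with annotated code and a walkthrough of the flow)

http://sujee.net/tech/articles/hadoop/hbase-map-reduce-freq-counter/

2. Working with HBase from MapReduce (official documentation, covering read, read/write, multi-table output, output to a file, output to an RDBMS, and accessing other HBase tables from within a job)

http://abloz.com/hbase/book.html#mapreduce.example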

Original article: https://www.cnblogs.com/bluecoder/p/3824265.html
