  • Hadoop_UDTF Example

    UDTF: one input, multiple outputs

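    A UDTF (User-Defined Table-Generating Function) takes one input and can emit multiple outputs.
    It is typically used for parsing tasks, such as parsing a URL and extracting the information it contains.
    Implementation: extend GenericUDTF and implement initialize (which returns the type information of the output columns) and process (the per-row logic).
       process normally calls the parent class's forward method to write rows out. Finally, close is called to release resources, much like cleanup in an MR program.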
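
    The URL-parsing use case can be sketched without any Hive dependency. This is only an illustration of the one-in, many-out shape: the class UrlExplodeSketch, its process and explode methods, and the example URL are all hypothetical, and the Consumer callback merely stands in for the forward method of GenericUDTF.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, Hive-free sketch: one URL in, several (key, value) rows out.
class UrlExplodeSketch {

    // Plays the role of a UDTF's process method. The 'forward' callback is an
    // assumption standing in for GenericUDTF.forward.
    static void process(String url, Consumer<String[]> forward) {
        URI uri = URI.create(url);
        forward.accept(new String[] {"host", uri.getHost()});
        forward.accept(new String[] {"path", uri.getPath()});

        String query = uri.getQuery();
        if (query == null) {
            return; // no query string, so no further rows
        }
        // Emit one row per query parameter.
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            forward.accept(new String[] {key, value});
        }
    }

    // Collects every forwarded row so the result can be inspected.
    static List<String[]> explode(String url) {
        List<String[]> rows = new ArrayList<>();
        process(url, rows::add);
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : explode("http://example.com/item?id=42&ref=home")) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

    For the example URL, the single input produces four rows: host, path, id and ref. A real UDTF would declare the two output columns in initialize and call forward instead of the callback.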
    

    A simple example that splits one column into two output columns: name --> name, name + email

    package com.hive.udtf;
    
    import java.util.ArrayList;
    
    import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    import org.apache.hadoop.hive.ql.metadata.HiveException;
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    
    public class myudtf extends GenericUDTF {

      // Validate the arguments and declare the output schema: two string columns.
      @Override
      public StructObjectInspector initialize(StructObjectInspector argOIs) throws UDFArgumentException {
        if (argOIs.getAllStructFieldRefs().size() != 1) {
          throw new UDFArgumentException("myudtf takes exactly one argument");
        }

        ArrayList<String> fieldNames = new ArrayList<String>();
        fieldNames.add("name");
        fieldNames.add("email");
        ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
      }

      // Called once per input row; emits one (name, email) row via forward.
      @Override
      public void process(Object[] args) throws HiveException {
        // Skip NULL input instead of failing with a NullPointerException.
        if (args.length == 1 && args[0] != null) {
          String name = args[0].toString();
          String email = name + "@foxmail.com";
          forward(new String[] {name, email});
        }
      }

      // Called after all rows have been processed; emits a final marker row.
      @Override
      public void close() throws HiveException {
        forward(new String[] {"complete", "finish"});
      }
    }
    

     Test

    hive (workdb)> add jar /home/liuwl/opt/datas/myudtf.jar;  
    hive (workdb)> create temporary function myudtf as 'com.hive.udtf.myudtf';
    hive (workdb)> select myudtf(ename) as (name,email) from emp;
    Result:
        name      email
        SMITH     SMITH@foxmail.com
        ALLEN     ALLEN@foxmail.com
        WARD      WARD@foxmail.com
        JONES     JONES@foxmail.com
        MARTIN    MARTIN@foxmail.com
        BLAKE     BLAKE@foxmail.com
        CLARK     CLARK@foxmail.com
        SCOTT     SCOTT@foxmail.com
        KING      KING@foxmail.com
        TURNER    TURNER@foxmail.com
        ADAMS     ADAMS@foxmail.com
        JAMES     JAMES@foxmail.com
        FORD      FORD@foxmail.com
        MILLER    MILLER@foxmail.com
        complete  finish
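
    The trailing "complete  finish" row comes from close, which runs after every input row has gone through process. The call order can be mimicked with a small stand-alone sketch. MiniUdtf, NameEmail and run are hypothetical names invented for this illustration and are not Hive classes; the real call sequence is driven by Hive itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature of the UDTF lifecycle; none of this is Hive API.
class UdtfLifecycleSketch {

    // Minimal stand-in for GenericUDTF: forward collects the output rows.
    static abstract class MiniUdtf {
        final List<String[]> out = new ArrayList<>();

        void forward(String[] row) {
            out.add(row);
        }

        abstract void process(String arg);

        abstract void close();
    }

    // Same logic as the myudtf example: one row per name, one marker row on close.
    static class NameEmail extends MiniUdtf {
        @Override
        void process(String name) {
            forward(new String[] {name, name + "@foxmail.com"});
        }

        @Override
        void close() {
            forward(new String[] {"complete", "finish"});
        }
    }

    // Drives the function the way a single task would: process per row, then close once.
    static List<String[]> run(MiniUdtf udtf, List<String> column) {
        for (String value : column) {
            udtf.process(value);
        }
        udtf.close();
        return udtf.out;
    }

    public static void main(String[] args) {
        for (String[] row : run(new NameEmail(), Arrays.asList("SMITH", "ALLEN"))) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

    With two input names the sketch yields three rows, the last being the marker. In a real job close is invoked per task rather than per query, so a query that runs as several tasks may well emit the marker row more than once.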
    
  • Original article: https://www.cnblogs.com/eRrsr/p/6097034.html