zoukankan      html  css  js  c++  java
  • 【Hadoop】Hive HSQ 使用 && 自定义HQL函数

    4 HQL
    4.1 官网
    4.1.1 https://cwiki.apache.org/confluence/display/Hive/LanguageManual
    4.1.2 性能调优
    4.1.2.1 Explain Execution Plan
    4.1.2.2 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
    4.2 示例
    4.2.1 增加分区
    4.2.1.1 alter table tab_ip change id id_alter string; ALTER TABLE tab_cts ADD PARTITION (partCol = 'dt') location '/external/hive/dt';
    4.2.2 结果写到HDFS
    4.2.2.1 insert overwrite local directory './hivetest/' select * from tbl_order_pt where month='201602'; insert overwrite directory '/hivetest/' select * from tbl_order_pt where month='201602';
    4.2.3 集合类型字段
    4.2.3.1 ARRAY
    4.2.3.1.1 create table tab_array(a array<int>,b array<string>) row format delimited fields terminated by ' ' collection items terminated by ','; //示例数据 tobenbrone,laihama,woshishui 13866987898,13287654321 abc,iloveyou,itcast 13866987898,13287654321 //操作 select a[0] from tab_array; select * from tab_array where array_contains(b,'word'); insert into table tab_array select array(0),array(name,ip) from tab_ext t;
    4.2.3.2 MAP
    4.2.3.2.1 create table tab_map(name string,info map<string,string>) row format delimited fields terminated by ' ' collection items terminated by ';' map keys terminated by ':'; //示例数据: fengjie age:18;size:36A;addr:usa furong age:28;size:39C;addr:beijing;weight:180KG //操作 load data local inpath '/home/hadoop/hivetemp/tab_map.txt' overwrite into table tab_map; insert into table tab_map select name,map('name',name,'ip',ip) from tab_ext;
    4.2.3.3 STRUCT
    4.2.3.3.1 create table tab_struct(name string,info struct<age:int,tel:string,addr:string>) row format delimited fields terminated by ' ' collection items terminated by ',' //操作 load data local inpath '/home/hadoop/hivetemp/tab_st.txt' overwrite into table tab_struct; insert into table tab_struct select name,named_struct('age',id,'tel',name,'addr',country) from tab_ext;
    4.2.4 自定义函数
    4.2.4.1 select if(id=1,first,no-first),name from tab_ext; hive>add jar /home/hadoop/myudf.jar; hive>CREATE TEMPORARY FUNCTION my_lower AS 'org.dht.Lower'; select my_upper(name) from tab_ext;
    4.2.5 高级查询
    4.2.5.1 select * from tbl_order_pt where month='201602' sort by id desc limit 0,3; select a.ip,b.book from tab_ext a join tab_ip_book b on(a.name=b.name);
    4.2.6 HQL CLI
    4.2.6.1 hive -S -e 'select country,count(*) from tab_ext' > /home/hadoop/hivetemp/e.txt 有了这种执行机制,就使得我们可以利用脚本语言(bash shell,python)进行hql语句的批量执行,示例如下: hive -S -e 'use db_order;select * from tbl_order_pt where month='201602' sort by id desc limit 0,3;' > result.txt
     

    自定义函数示例代码:

    package cn.itcast.bigdata;
    
    import java.util.HashMap;
    
    import org.apache.hadoop.hive.ql.exec.UDF;
    
    
    public class PhoneNbrToArea extends UDF{
    
        private static HashMap<String, String> areaMap = new HashMap<>();
        static {
            areaMap.put("1388", "beijing");
            areaMap.put("1399", "tianjin");
            areaMap.put("1366", "nanjing");
        }
        
        //一定要用public修饰才能被hive调用
        public String evaluate(String pnb) {
            
            String result  = areaMap.get(pnb.substring(0,4))==null? (pnb+"    huoxing"):(pnb+"  "+areaMap.get(pnb.substring(0,4)));        
            
            return result;
        }
        
    }
  • 相关阅读:
    python函数篇
    字符编码和文件处理
    对话代码
    复习2
    [转]借闪光灯的东风 成就你完美的摄影作品
    色系
    Oracle的一些基本操作
    iebook line flash
    网站收录
    复习1
  • 原文地址:https://www.cnblogs.com/junneyang/p/5896392.html
Copyright © 2011-2022 走看看