  • Hive: less commonly used functions

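    1. count(distinct xxx) as a window function

    Supported in Hive 2.x (DISTINCT inside a window function needs Hive 2.1.0 or later; combining it with ORDER BY needs 2.2.0 or later):

      count(distinct cust_num) over(partition by xxx order by xxx)     -- distinct count within the partition

    Not supported in Hive 1.x; use this workaround instead:

      size(collect_set(cust_num) over(partition by xxx order by xxx)) -- distinct count within the partition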
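
With ORDER BY in the window and no explicit frame, Hive uses the default frame from the start of the partition to the current row, so both expressions above give a running distinct count rather than one total per partition. A minimal sketch of the two variants, assuming a hypothetical table orders(region, order_date, cust_num) that is not part of the original post:

```sql
-- Hypothetical table: orders(region string, order_date string, cust_num string)
select region,
       order_date,
       cust_num,
       -- running distinct count: partition start through the current row
       -- (rows sharing the same order_date are counted together)
       size(collect_set(cust_num) over (partition by region order by order_date)) as running_distinct,
       -- distinct count over the whole partition: leave out ORDER BY
       size(collect_set(cust_num) over (partition by region)) as partition_distinct
from orders;
```

Like count(distinct ...), collect_set skips NULL values, so the workaround counts the same rows.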

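    2. collect_set and collect_list

    collect_set: returns a set, with duplicate elements removed

    collect_list: returns a list, with duplicate elements kept

    The examples below select from dual, which is assumed to be a one-row helper table; Hive does not ship one by default.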

    select collect_list(value)
    from 
    (
    select 1 as id,1 as value from dual 
    union all 
    select 1 as id,3 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    ) t 
    group by id;

    [1,3,2,2]

    select collect_set(value)
    from 
    (
    select 1 as id,1 as value from dual 
    union all 
    select 1 as id,3 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    ) t 
    group by id;
    
    [1,3,2]
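
Because collect_set drops duplicates, size(collect_set(value)) matches count(distinct value) for the same group. The sketch below puts the variants side by side, using the same sample rows and the same assumed dual helper table as the queries above. The order of elements inside the returned arrays is not guaranteed.

```sql
select id,
       collect_list(value)      as all_values,      -- duplicates kept
       collect_set(value)       as distinct_values, -- duplicates removed
       size(collect_set(value)) as distinct_cnt,    -- 3
       count(distinct value)    as distinct_cnt2    -- 3
from
(
select 1 as id,1 as value from dual
union all
select 1 as id,3 as value from dual
union all
select 1 as id,2 as value from dual
union all
select 1 as id,2 as value from dual
) t
group by id;
```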

    3. Sorting

    sort_array sorts the elements of an array in ascending order:

    select sort_array(collect_set(value))
    from 
    (
    select 1 as id,1 as value from dual 
    union all 
    select 1 as id,3 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    ) t 
    group by id;
    
    [1,2,3]
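
sort_array accepts any array, so it can also be applied to collect_list; in that case the duplicates survive the sort. A short sketch reusing the same sample rows and the assumed dual helper table:

```sql
-- Sort the full list instead of the de-duplicated set
select sort_array(collect_list(value))   -- [1,2,2,3]
from
(
select 1 as id,1 as value from dual
union all
select 1 as id,3 as value from dual
union all
select 1 as id,2 as value from dual
union all
select 1 as id,2 as value from dual
) t
group by id;
```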

    4. Joining collection elements:

    select concat_ws('-','1','2','3');

    1-2-3

    select concat_ws('-',collect_set(cast(value as string)))
    from 
    (
    select 1 as id,1 as value from dual 
    union all 
    select 1 as id,3 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    union all 
    select 1 as id,2 as value from dual 
    ) t 
    group by id;
    
    1-3-2
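
The result above comes out as 1-3-2 because collect_set does not order its elements. Wrapping the set in sort_array before joining gives a stable result. This is a sketch on the same sample rows; note that after the cast the elements are strings, so the sort is lexicographic ('10' sorts before '2'), which is harmless for the single-digit values used here.

```sql
-- De-duplicate, sort, then join with '-'
select concat_ws('-', sort_array(collect_set(cast(value as string))))   -- 1-2-3
from
(
select 1 as id,1 as value from dual
union all
select 1 as id,3 as value from dual
union all
select 1 as id,2 as value from dual
union all
select 1 as id,2 as value from dual
) t
group by id;
```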
  • Original post: https://www.cnblogs.com/yin-fei/p/11102514.html