zoukankan      html  css  js  c++  java
  • hive lateral view语句(翻译自Hive wiki)

    Lateral View语法

    lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
    fromClause: FROM baseTable (lateralView)*

    描述

    lateral view用于和split, explode等UDTF一起使用,它能够将一行数据拆成多行数据,在此基础上可以对拆分后的数据进行聚合。lateral view首先为原始表的每行调用UDTF,UTDF会把一行拆分成一或者多行,lateral view再把结果组合,产生一个支持别名表的虚拟表。

    例子

    假设我们有一张表pageAds,它有两列数据,第一列是pageid string,第二列是adid_list,即用逗号分隔的广告ID集合:

    string pageid Array<int> adid_list
    "front_page" [1, 2, 3]
    "contact_page" [3, 4, 5]

    要统计所有广告ID在所有页面中出现的次数。

    首先分拆广告ID:

    SELECT pageid, adid 
        FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;

    执行结果如下:

    string pageid int adid
    "front_page" 1
    "front_page" 2
    "front_page" 3
    "contact_page" 3
    "contact_page" 4
    "contact_page" 5

    接下来就是一个聚合的统计:

    SELECT adid, count(1) 
        FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid
    GROUP BY adid;

    执行结果如下:

    int adid count(1)
    1 1
    2 1
    3 2
    4 1
    5 1

     

     

     

     

     

    多个lateral view语句

    一个FROM语句后可以跟多个lateral view语句,后面的lateral view语句能够引用它前面的所有表和列名。 以下面的表为例:

    Array<int> col1 Array<string> col2
    [1, 2] [a", "b", "c"]
    [3, 4] [d", "e", "f"]
    
    

     

     

     

    SELECT myCol1, col2 FROM baseTable
        LATERAL VIEW explode(col1) myTable1 AS myCol1;

    执行结果为:

    int mycol1 Array<string> col2
    1 [a", "b", "c"]
    2 [a", "b", "c"]
    3 [d", "e", "f"]
    4 [d", "e", "f"]

     

     

     

     


    加上一个lateral view:

    SELECT myCol1, myCol2 FROM baseTable
        LATERAL VIEW explode(col1) myTable1 AS myCol1
        LATERAL VIEW explode(col2) myTable2 AS myCol2;


    它的执行结果为:

    int myCol1 string myCol2
    1 "a"
    1 "b"
    1 "c"
    2 "a"
    2 "b"
    2 "c"
    3 "d"
    3 "e"
    3 "f"
    4 "d"
    4 "e"
    4 "f"














    注意上面语句中,两个lateral view按照出现的次序被执行。

    转自 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView#

          http://blog.csdn.net/inte_sleeper/article/details/7196114

           

  • 相关阅读:
    理解事件驱动select,poll,epoll三种模型
    谈谈对线程与进程的理解
    5-3.首行缩进
    5-2.行高
    5-1.字间距
    4-6.字体样式重置
    4-5.字体风格
    4-4.字体粗细
    4-3.字体颜色设置
    4-2.字体设置
  • 原文地址:https://www.cnblogs.com/ggjucheng/p/2842938.html
Copyright © 2011-2022 走看看