  • [Hive_11] Advanced Aggregation in Hive


    0. Overview

      Advanced aggregation in Hive: union all | grouping sets | cube | rollup

      pv // page views: the number of page visits
      uv // unique visitors: the number of distinct visitors


    1. union all

      Combines the result sets of several queries into one

      1.0 Prepare the data

      pv.txt

    2015-03    2015-03-10    cookie1
    2015-03    2015-03-10    cookie5
    2015-03    2015-03-12    cookie7
    2015-04    2015-04-12    cookie3
    2015-04    2015-04-13    cookie2
    2015-04    2015-04-13    cookie4
    2015-04    2015-04-16    cookie4
    2015-03    2015-03-10    cookie2
    2015-03    2015-03-10    cookie3
    2015-04    2015-04-12    cookie5
    2015-04    2015-04-13    cookie6
    2015-04    2015-04-15    cookie3
    2015-04    2015-04-15    cookie2
    2015-04    2015-04-16    cookie1
    2015-02    2015-02-16    cookie2
    2015-02    2015-02-16    cookie3

      1.1 Create the table

    create table uv(month string, day string, id string) row format delimited fields terminated by '\t';

      1.2 Load the data

    load data local inpath '/home/centos/files/pv.txt' into table uv;

      1.3 Enable local mode

        SET hive.exec.mode.local.auto=true;

      1.4 Count unique visitors per month

    select month ,count(distinct id) from uv group by month;

      1.5 Count unique visitors per day

    select day ,count(distinct id) from uv group by day;

      1.6 Combine the per-month and per-day counts with union all

    select month ,count(distinct id) from uv group by month union all select day ,count(distinct id) from uv group by day;
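
      In the combined result, month rows and day rows share one column, so they can only be told apart by the format of the value. A minimal sketch of one way to label them, assuming an extra literal column is acceptable; the aliases grain, period_key and visitors are illustrative names, not part of the original:

    ```sql
    -- Label each branch so month-level and day-level rows are distinguishable.
    -- Both branches must return the same number of columns with compatible types.
    select 'month' as grain, month as period_key, count(distinct id) as visitors
    from uv
    group by month
    union all
    select 'day' as grain, day as period_key, count(distinct id) as visitors
    from uv
    group by day;
    ```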

       

      1.7 Run the same query with grouping sets

    select month, day,count(distinct id), grouping__id from uv group by month,day grouping sets(month,day);

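      grouping__id // identifies which grouping set a result row belongs to

      In each row, the column that is not part of that row's grouping set is returned as NULL. The two grouping sets here are:

      month

      day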
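
      To make the relationship with 1.6 explicit, the grouping sets query can be read as shorthand for a union of one group by per set. A sketch of the logically equivalent form (without the grouping__id column); the cast(null as string) placeholders are an assumption made here so that both branches have matching column types:

    ```sql
    -- Grouping set (month): day is not grouped, so it is NULL.
    select month, cast(null as string) as day, count(distinct id)
    from uv
    group by month
    union all
    -- Grouping set (day): month is not grouped, so it is NULL.
    select cast(null as string) as month, day, count(distinct id)
    from uv
    group by day;
    ```

      Unlike 1.6, month and day stay in separate columns here, which is also how the grouping sets query returns them.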

      1.8 Query with cube

    select month, day,count(distinct id), grouping__id from uv group by month,day with cube order by grouping__id;

       

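      cube aggregates over every combination of the group by columns. With the two columns above that is: null (grand total), month, day, and month,day.
      For three columns year, month, day it would expand to 2^3 = 8 grouping sets:

      null (grand total)
      year
      month
      day
      year,month
      year,day
      month,day
      year,month,day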
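
      For the two-column query in 1.8, with cube can be spelled out as an explicit grouping sets list. A sketch of the equivalent query, using only the table and columns defined above:

    ```sql
    -- "group by month, day with cube" written as explicit grouping sets:
    -- (month, day), (month), (day), and () for the grand total.
    select month, day, count(distinct id), grouping__id
    from uv
    group by month, day
    grouping sets ((month, day), month, day, ())
    order by grouping__id;
    ```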

      1.9 Query with rollup

    select month, day,count(distinct id), grouping__id from uv group by month,day with rollup order by grouping__id;

      

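      rollup aggregates hierarchically, dropping group by columns from right to left. With the two columns above that is: null (grand total), month, and month,day.
      For three columns year, month, day it would expand to:

      null (grand total)
      year
      year,month
      year,month,day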
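
      For the two-column query in 1.9, with rollup can likewise be written as explicit grouping sets. A sketch of the equivalent query; note that, unlike cube, there is no day-only set:

    ```sql
    -- "group by month, day with rollup" written as explicit grouping sets:
    -- (month, day), (month), and () for the grand total.
    select month, day, count(distinct id), grouping__id
    from uv
    group by month, day
    grouping sets ((month, day), month, ())
    order by grouping__id;
    ```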


  • Original article: https://www.cnblogs.com/share23/p/10319921.html