0. Overview
Advanced aggregation in Hive: union all | grouping sets | cube | rollup
pv // page views: the number of page visits
uv // unique visitors: the number of distinct visitors
1. union all
Combines the result sets of several queries into one, keeping duplicate rows.
1.0 Prepare the data
pv.txt (one record per line; the fields are month, day and cookie id, separated by a single space)
2015-03 2015-03-10 cookie1
2015-03 2015-03-10 cookie5
2015-03 2015-03-12 cookie7
2015-04 2015-04-12 cookie3
2015-04 2015-04-13 cookie2
2015-04 2015-04-13 cookie4
2015-04 2015-04-16 cookie4
2015-03 2015-03-10 cookie2
2015-03 2015-03-10 cookie3
2015-04 2015-04-12 cookie5
2015-04 2015-04-13 cookie6
2015-04 2015-04-15 cookie3
2015-04 2015-04-15 cookie2
2015-04 2015-04-16 cookie1
2015-02 2015-02-16 cookie2
2015-02 2015-02-16 cookie3
1.1 Create the table
create table uv(month string,day string, id string) row format delimited fields terminated by ' ';
1.2 Load the data
load data local inpath '/home/centos/files/pv.txt' into table uv;
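A quick sanity check after loading can catch delimiter or line-break problems early. This is a minimal sketch, assuming pv.txt holds one space-separated record per line, as the table definition expects:

```sql
-- Expect 16 if every record in pv.txt was loaded as its own row.
select count(*) from uv;

-- Rows whose fields were not split on the space delimiter show up with NULL columns.
select * from uv where day is null or id is null;
```

If the count is far lower than expected, the file probably has several records on one physical line; Hive then reads only the first three fields of each line.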
1.3 Enable local mode (lets Hive run small jobs locally instead of submitting them to the cluster)
SET hive.exec.mode.local.auto=true;
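Automatic local mode only applies to jobs that Hive considers small. The sketch below shows how the switch can be inspected and how the size thresholds might be set; the values are the documented defaults as far as I know, so treat them as assumptions and check them against your Hive version:

```sql
-- Print the current value of the switch.
set hive.exec.mode.local.auto;

-- Upper bounds on total input size (bytes) and number of input files
-- for a job to be run locally (assumed defaults, verify for your version).
set hive.exec.mode.local.auto.inputbytes.max=134217728;
set hive.exec.mode.local.auto.input.files.max=4;
```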
1.4 Count unique visitors per month
select month, count(distinct id) from uv group by month;
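To make the difference between pv and uv concrete, both can be computed in one pass. This is a sketch under the assumption that the 16 sample records were loaded as listed; the aliases pv and uv_count are illustrative names, not part of the original queries:

```sql
-- pv: every visit row counts; uv_count: each cookie id counts once per month.
select month,
       count(*)           as pv,
       count(distinct id) as uv_count
from uv
group by month;
-- With the sample data this gives (row order is not guaranteed):
-- 2015-02  2  2
-- 2015-03  5  5
-- 2015-04  9  6
```

In April, for example, cookie2, cookie3 and cookie4 each visit more than once, which is why pv (9) exceeds uv_count (6).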
1.5 Count unique visitors per day
select day, count(distinct id) from uv group by day;
1.6 Combine the monthly and daily unique-visitor counts with union all
select month as period, count(distinct id) as uv_count from uv group by month
union all
select day as period, count(distinct id) as uv_count from uv group by day;
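In the combined result, monthly and daily rows can only be told apart by the shape of the first column. One possible refinement is to tag each branch with a literal and wrap the union in a subquery, which also works on older Hive versions that accept union all only inside a subquery. The names period_type, period and uv_count are illustrative:

```sql
select t.period_type, t.period, t.uv_count
from (
  -- Branch 1: one row per month.
  select 'month' as period_type, month as period, count(distinct id) as uv_count
  from uv group by month
  union all
  -- Branch 2: one row per day.
  select 'day' as period_type, day as period, count(distinct id) as uv_count
  from uv group by day
) t
order by t.period_type, t.period;
```

Both branches use the same column names and types, which avoids schema-mismatch errors in the union.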
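1.7 Run the same query with grouping sets
select month, day, count(distinct id), grouping__id from uv group by month, day grouping sets(month, day);
grouping__id // identifies which grouping set produced each row (the numeric values differ between Hive versions)
The grouping sets evaluated are:
(month)
(day)
In rows produced by the (month) set, day is NULL, and in rows produced by the (day) set, month is NULL.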
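The grouping sets query is shorthand for a union all of one group by per set, with the columns that are not grouped padded with NULL. A sketch of the equivalent form, without the grouping__id column (uv_count is an illustrative alias):

```sql
-- Grouping set (month): day is padded with NULL.
select month, cast(null as string) as day, count(distinct id) as uv_count
from uv group by month
union all
-- Grouping set (day): month is padded with NULL.
select cast(null as string) as month, day, count(distinct id) as uv_count
from uv group by day;
```

The grouping sets version states the same thing in a single statement and adds grouping__id for telling the sets apart.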
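1.8 Query with cube
select month, day, count(distinct id), grouping__id from uv group by month, day with cube order by grouping__id;
with cube aggregates over every combination of the group by columns. For the two columns above that is 2^2 = 4 grouping sets:
() // grand total, both columns NULL
(month)
(day)
(month, day)
For three columns such as year, month, day it would be 2^3 = 8 grouping sets:
()
(year)
(month)
(day)
(year, month)
(year, day)
(month, day)
(year, month, day)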
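For the two-column query above, with cube can be spelled out as an explicit grouping sets clause. A sketch of the equivalent query (uv_count is an illustrative alias):

```sql
-- Same grouping sets as "group by month, day with cube":
-- every subset of {month, day}, including the empty set for the grand total.
select month, day, count(distinct id) as uv_count, grouping__id
from uv
group by month, day grouping sets ((month, day), month, day, ())
order by grouping__id;
```

With the sample data, the row for the empty set () reports 7, the number of distinct cookie ids in the whole table.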
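1.9 Query with rollup
select month, day, count(distinct id), grouping__id from uv group by month, day with rollup order by grouping__id;
with rollup aggregates along the column hierarchy from left to right, dropping one column at a time from the right. For the two columns above that is 3 grouping sets:
() // grand total
(month)
(month, day)
For three columns such as year, month, day it would be 4 grouping sets:
()
(year)
(year, month)
(year, month, day)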
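As with cube, the rollup query above can be written as an explicit grouping sets clause. A sketch of the equivalent form (uv_count is an illustrative alias):

```sql
-- Same grouping sets as "group by month, day with rollup":
-- the full key, then month alone, then the grand total.
-- Unlike cube, there is no (day) set on its own.
select month, day, count(distinct id) as uv_count, grouping__id
from uv
group by month, day grouping sets ((month, day), month, ())
order by grouping__id;
```

Rollup suits hierarchical columns such as month and day, where a subtotal by day alone would add nothing because each day already belongs to exactly one month.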