  • Hive global sorting

    Do not distribute the data; use a single reducer.

    set mapred.reduce.tasks=1;
    
    select * 
    from dw.dw_app 
    where 
    dt>='2016-09-01' 
    and dt <='2016-09-18' 
    order by stime
    limit 30000;
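
    On Hadoop 2 and later, mapred.reduce.tasks is the deprecated name of mapreduce.job.reduces. A minimal sketch of the same setting under the newer name, assuming such a cluster; the setting applies to the whole session, so it also affects later queries:

    -- assumption: Hadoop 2+ property name; same effect as mapred.reduce.tasks=1
    set mapreduce.job.reduces=1;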

    Wrap the filter in a subquery, then apply order by in the outer query.

    select t.* from 
    (
    select *
    from dw.dw_app 
    where 
    dt>='2016-09-01' 
    and dt <='2016-09-18' 
    and app_id='16099'
    and msgtype = 'role.recharge' 
    ) t
    order by t.stime 
    limit 5000;

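    cluster by sends all rows that share the same value of the column to the same reducer partition, then sorts them within that reducer: cluster by column = distribute by column + sort by column. The ordering holds within each reducer, not across reducers.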

    select * 
    from dw.dw_app 
    where 
    dt>='2016-09-01' 
    and dt <='2016-09-18' 
    and app_id='16099'
    and msgtype = 'role.recharge' 
    cluster by dt
    limit 30000;
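
    The equivalence above can be written out explicitly. This is a sketch of the same query with cluster by dt replaced by its two parts; the table, columns and filters are the ones used above:

    select *
    from dw.dw_app
    where
    dt>='2016-09-01'
    and dt <='2016-09-18'
    and app_id='16099'
    and msgtype = 'role.recharge'
    -- route rows with the same dt to the same reducer
    distribute by dt
    -- sort rows inside each reducer by dt
    sort by dt
    limit 30000;

    The split form is useful when the sort key should differ from the distribution key, for example distribute by dt sort by dt, stime.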

    Query each day's top ten recharging users and their total recharge amounts.

    select t3.*
      from (select t2.*
              from (select dt,
                           account_id,
                           sum(recharge_money) as total_money,
                           row_number() over(partition by dt order by sum(recharge_money) desc) rank
                      from (select dt, account_id, recharge_money
                              from dw.dw_app
                             where dt >= '2016-09-01'
                               and dt <= '2016-09-18'
                               and app_id = '16099'
                               and msgtype = 'role.recharge' 
                           cluster by dt, account_id) t
                     group by dt, account_id) t2
             where t2.rank <= 10) t3
     order by t3.dt asc, t3.rank asc
     limit 300;
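
    The same result can probably be obtained with less nesting. The following is a sketch, not the original author's query: it drops the inner cluster by (group by already brings rows with the same dt and account_id together) and uses the alias rn, a name chosen here for illustration, in place of rank.

    select t.dt, t.account_id, t.total_money, t.rn
      from (select dt,
                   account_id,
                   sum(recharge_money) as total_money,
                   -- number the accounts within each day by total recharge, largest first
                   row_number() over(partition by dt order by sum(recharge_money) desc) as rn
              from dw.dw_app
             where dt >= '2016-09-01'
               and dt <= '2016-09-18'
               and app_id = '16099'
               and msgtype = 'role.recharge'
             group by dt, account_id) t
     -- keep the top ten accounts per day
     where t.rn <= 10
     order by t.dt asc, t.rn asc
     limit 300;

    With 18 days in the range and at most 10 rows per day, the result has at most 180 rows, so limit 300 does not cut anything off.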
  • Original article: https://www.cnblogs.com/linn/p/5941406.html