zoukankan      html  css  js  c++  java
  • Hive练习--蚂蚁森林习题一

    问题:

    假设2017年1月1日开始记录低碳数据(user_low_carbon),假设2017年10月1日之前满足申领条件的用户都申领了一颗p004-胡杨,剩余的能量全部用来领取“p002-沙柳” 。统计在10月1日累计申领“p002-沙柳” 排名前10的用户信息;以及他比后一名多领了几颗沙柳。

    数据的下载链接:https://files.cnblogs.com/files/yxym2016/%E8%9A%82%E8%9A%81%E6%A3%AE%E6%9E%97%E6%95%B0%E6%8D%AE.zip

    得到的统计结果如下表样式:

    user_id  plant_count less_count(比后一名多领了几颗沙柳)
    u_101    1000         100
    u_088    900          400
    u_103    500

     一、准备工作

    1、创建表

    create table user_low_carbon(user_id String,data_dt String,low_carbon int) row format delimited fields terminated by '	';
    create table plant_carbon(plant_id string,plant_name String,low_carbon int) row format delimited fields terminated by '	';

    2、加载数据

    load data local inpath "/opt/module/data/low_carbon.txt" into table user_low_carbon;
    load data local inpath "/opt/module/data/plant_carbon.txt" into table plant_carbon;

    二、解决步骤

    1、统计每个用户在2017年10月1日之前收集的总碳量,并按照碳量进行倒序排序

    select user_id  sum(low_carbon) as sum_low_carbon 
    from user_low_carbon 
    where data_dt < '2017/10/1' 
    group by user_id 
    order by sum_low_carbon desc
    limit 11;  a1  # 优化操作,过滤掉无关的数据,减少数据量

    输出结果:

    user_id    sum_low_carbon
    u_007    1470
    u_013    1430
    u_008    1240
    u_005    1100
    u_010    1080
    u_014    1060
    u_011    960
    u_009    930
    u_006    830
    u_002    659
    u_004    640
    u_003    620
    u_001    475
    u_015    290
    u_012    250

    2、每个用户的总碳量减去一颗胡杨的碳量,然后用剩余的碳量除以沙柳的碳量

    # 获取每颗胡杨所需要的碳量
    select low_carbon from plant_carbon where plant_id = 'p004'; a2  
    215
    # 获取每颗沙柳所需要的碳量 
    select low_carbon from plant_carbon where plant_id = 'p002'; a3  
    19
    
    select user_id,(a1.sum_low_carbon - a2.low_carbon)/a3.low_carbon as plant_count from a1,a2,a3;  a4

    整合一下:

    select user_id,(a1.sum_low_carbon - a2.low_carbon)/a3.low_carbon as plant_count from 
    (select user_id,sum(low_carbon) as sum_low_carbon 
    from user_low_carbon 
    where data_dt < '2017/10/1' 
    group by user_id 
    order by sum_low_carbon desc) a1,
    (select low_carbon from plant_carbon where plant_id = 'p004') a2,
    (select low_carbon from plant_carbon where plant_id = 'p002') a3; a4

    输出结果:

    user_id    plant_count
    u_007    66.05263157894737
    u_013    63.94736842105263
    u_008    53.94736842105263
    u_005    46.578947368421055
    u_010    45.526315789473685
    u_014    44.473684210526315
    u_011    39.21052631578947
    u_009    37.63157894736842
    u_006    32.36842105263158
    u_002    23.36842105263158
    u_004    22.36842105263158
    u_003    21.31578947368421
    u_001    13.68421052631579
    u_015    3.9473684210526314
    u_012    1.8421052631578947

    3、按照每个人领取的沙柳棵数进行倒序排序,并获取当前记录的下一条记录所领取的沙柳的棵数

    select user_id,plant_count,lead(plant_count,1,'') over(order by plant_count desc) next_count from a4;a5

    将a4带入上面的SQL语句:

    select user_id,plant_count,lead(plant_count,1,'0') over(order by plant_count desc) next_count from (
    select user_id,(a1.sum_low_carbon - a2.low_carbon)/a3.low_carbon as plant_count from 
    (select user_id,sum(low_carbon) as sum_low_carbon 
    from user_low_carbon 
    where data_dt < '2017/10/1' 
    group by user_id 
    order by sum_low_carbon desc) a1,
    (select low_carbon from plant_carbon where plant_id = 'p004') a2,
    (select low_carbon from plant_carbon where plant_id = 'p002') a3) a4
    limit 10;

    输出结果:

    user_id    plant_count    next_count
    u_007    66.05263157894737    63.94736842105263
    u_013    63.94736842105263    53.94736842105263
    u_008    53.94736842105263    46.578947368421055
    u_005    46.578947368421055    45.526315789473685
    u_010    45.526315789473685    44.473684210526315
    u_014    44.473684210526315    39.21052631578947
    u_011    39.21052631578947    37.63157894736842
    u_009    37.63157894736842    32.36842105263158
    u_006    32.36842105263158    23.36842105263158
    u_002    23.36842105263158    22.36842105263158
    u_004    22.36842105263158    21.31578947368421
    u_003    21.31578947368421    13.68421052631579
    u_001    13.68421052631579    3.9473684210526314
    u_015    3.9473684210526314    1.8421052631578947
    u_012    1.8421052631578947    NULL

    4、将每一条记录对应的用户领取的沙柳棵数和排在他下面的用户领取的棵数,进行相减,就是当前用户比他后一名所多的棵数

    select user_id,plant_count,(plant_count-next_count) less_count from a5;

    将a5表带入上面的sql语句:

    select user_id,plant_count,(plant_count-next_count) less_count from (
    select user_id,plant_count,lead(plant_count,1,'0') over(order by plant_count desc) next_count from (
    select user_id,(a1.sum_low_carbon - a2.low_carbon)/a3.low_carbon as plant_count from 
    (select user_id,sum(low_carbon) as sum_low_carbon 
    from user_low_carbon 
    where data_dt < '2017/10/1' 
    group by user_id 
    order by sum_low_carbon desc) a1,
    (select low_carbon from plant_carbon where plant_id = 'p004') a2,
    (select low_carbon from plant_carbon where plant_id = 'p002') a3) a4) a5;

    输出结果:

    # 输出结果
    user_id    plant_count    less_count
    u_007    66.05263157894737    2.10526315789474
    u_013    63.94736842105263    10.0
    u_008    53.94736842105263    7.368421052631575
    u_005    46.578947368421055    1.05263157894737
    u_010    45.526315789473685    1.05263157894737
    u_014    44.473684210526315    5.2631578947368425
    u_011    39.21052631578947    1.5789473684210549
    u_009    37.63157894736842    5.263157894736835
    u_006    32.36842105263158    9.000000000000004
    u_002    23.36842105263158    1.0
    u_004    22.36842105263158    1.05263157894737
    u_003    21.31578947368421    7.6315789473684195
    u_001    13.68421052631579    9.736842105263158
    u_015    3.9473684210526314    2.1052631578947367
    u_012    1.8421052631578947    1.8421052631578947

    5、使用floor函数,对进行取整操作,我们只需在获取棵数那里进行取整即可,因为那里涉及了除的操作,所以会产生小数,所以在这里进行取整即可。

    select user_id,plant_count,(plant_count-next_count) less_count from (
    select user_id,plant_count,lead(plant_count,1,'0') over(order by plant_count desc) next_count from (
    select user_id,floor((a1.sum_low_carbon - a2.low_carbon)/a3.low_carbon) as plant_count from 
    (select user_id,sum(low_carbon) as sum_low_carbon 
    from user_low_carbon 
    where data_dt < '2017/10/1' 
    group by user_id 
    order by sum_low_carbon desc) a1,
    (select low_carbon from plant_carbon where plant_id = 'p004') a2,
    (select low_carbon from plant_carbon where plant_id = 'p002') a3) a4) a5
    limit 10;

    输出结果:

    user_id    plant_count    less_count
    u_007    66    3
    u_013    63    10
    u_008    53    7
    u_005    46    1
    u_010    45    1
    u_014    44    5
    u_011    39    2
    u_009    37    5
    u_006    32    9
    u_002    23    1
  • 相关阅读:
    C++ Builder编写WinForm Post数据至Web服务器并得到返回数据
    东方网点网吧电影下载url
    自己写的常用jquery扩展函数
    给IT新人的15点建议:苦逼程序员的辛酸反省与总结
    hdu 2152 Fruit (母函数)
    母函数
    hdu 1709 The Balance (母函数)
    hdu 1171 Big Event in HDU (母函数)
    hdu 1085 Holding BinLaden Captive! (母函数)
    hdu 1398 Square Coins (母函数)
  • 原文地址:https://www.cnblogs.com/yxym2016/p/13254316.html
Copyright © 2011-2022 走看看