zoukankan      html  css  js  c++  java
  • Hive 练习 简单任务处理

    1、2018年4月份的用户数、订单量、销量、GMV (不局限与这些统计量,你也可以自己想一些)

    -- -- -- 2018年4月份的用户数量
    select
    	count(a.user_id) as user_nums
    from
    	(
    		select
    			user_id
    		from
    			app_jypt_m04_ord_det_di
    		where
    			dt >= '2018-04-01'
    			and sale_ord_dt <= '2018-04-30'
    			and sale_ord_dt >= '2018-04-01'
    		group by
    			user_id
    	)
    	a;
    -- 2018年4月份的订单量
    select
    	count(a.sale_ord_id) as sale_nums
    from
    	(
    		select
    			sale_ord_id
    		from
    			app_jypt_m04_ord_det_di
    		where
    			dt >= '2018-04-01'
    			and sale_ord_dt <= '2018-04-30'
    			and sale_ord_dt >= '2018-04-01'
    		group by
    			sale_ord_id
    	)
    	a;
    -- -- 2018年4月份的销量
    select
    	sum(COALESCE(sale_qtty, 0)) as xiaoliang
    from
    	app_jypt_m04_ord_det_di
    where
    	dt >= '2018-04-01'
    	and sale_ord_dt <= '2018-04-30'
    	and sale_ord_dt >= '2018-04-01';
    -- -- -- 2018年4月份的销售额GMV
    -- user_payable_pay_amount 用户应付金额
    select
    	sum(user_payable_pay_amount) as xiaoshoujine
    from
    	app_jypt_m04_ord_det_di
    where
    	dt >= '2018-04-01'
    	and sale_ord_dt <= '2018-04-30'
    	and sale_ord_dt >= '2018-04-01';
    

      

    PS: 

    • 订单数就是卖了几单 ;
    • 销量就是卖了多少件,一个订单中可能卖出一件或多件;
    • GMV: Gross Merchandise Volume,是成交总额(一定时间段内)的意思。
    • 在电商网站定义里面是网站成交金额。这个实际指的是拍下订单金额, 包含付款和未付款的部分。

    2、上述这些变化量相对3月份的变化


    3、计算2018年4月1号的新用户数量(之前半年未购买的用户为新用户)

    -- 计算2018年4月1号的新用户数量(之前半年未购买的用户为新用户)
    -- 首先找出4月1号的用户的xxx,然后统计半年内有过购买记录的用户yyy。
    
    
    -- select distinct user_id as xxx from gdm_m04_ord_det_sum where dt>='2018-04-01' and sale_ord_dt='2018-04-01';
    -- select distinct user_id as yyy from gdm_m04_ord_det_sum where dt>='2017-10-01' and sale_ord_dt<='2018-03-31' and sale_ord_dt>='2017-10-01';
    
    -- 用xxx-yyy,然后count()计算数量;
    
    
    
    -- 两种方法,一种用not in ,一种用not exists
    
    -- not in 方法
    select distinct user_id from gdm_m04_ord_det_sum 
    where user_id not in (select distinct user_id from gdm_m04_ord_det_sum where dt>='2017-10-01' and sale_ord_dt<='2018-03-31' and sale_ord_dt>='2017-10-01');
    
    
    
    -- not exists 方法
    select distinct user_id from gdm_m04_ord_det_sum where dt>='2018-04-01' and sale_ord_dt='2018-04-01' where not exists (select distinct user_id from gdm_m04_ord_det_sum where dt>='2017-10-01' and sale_ord_dt<='2018-03-31' and sale_ord_dt>='2017-10-01' where gdm_m04_ord_det_sum.user_id=gdm_m04_ord_det_sum.user_id);
    
    
    
    -- 另一种 left outer join 这样效率更高 语法有问题??
    
    select distinct user_id from gdm_m04_ord_det_sum where dt>='2018-04-01' and sale_ord_dt='2018-04-01' a left outer join (select distinct user_id from gdm_m04_ord_det_sum where dt>='2017-10-01' and sale_ord_dt<='2018-03-31' and sale_ord_dt>='2017-10-01' b) on a.user_id=b.user_id where b.user_id is null;
    

     

    正确方法:

    select
    	count(a.id1) as user_new_nums
    from
    	(
    		select distinct
    			user_id as id1
    		from
    			app_jypt_m04_ord_det_di
    		where
    			dt >= '2018-04-01'
    			and sale_ord_dt = '2018-04-01'
    	)
    	a
    left outer join
    	(
    		select distinct
    			user_id as id2
    		from
    			app_jypt_m04_ord_det_di
    		where
    			dt >= '2017-10-01'
    			and sale_ord_dt <= '2018-03-31'
    			and sale_ord_dt >= '2017-10-01'
    	)
    	b
    on
    	a.id1 = b.id2
    where
    	b.id2 is null;
    

      

  • 相关阅读:
    数据库 | 建表常用语句
    心得 | 撰写项目申报书
    工具 | 时间转化
    SpringBoot | 启动异常 | 显示bulid success 无 error信息
    120. 三角形最小路径和
    63. 不同路径 II
    SpringBoot | Velocity template
    SpringBoot | quartz | @DisallowConcurrentExecution
    SpringBoot | Hibernate @Transient 注解
    Java | 基础归纳 | 静态方法与实例方法的区别
  • 原文地址:https://www.cnblogs.com/Allen-rg/p/9283175.html
Copyright © 2011-2022 走看看