  • Hive: basic operations on user order behavior

    Today I queried a user log table with Hive. The log table has the following format:

    user_id,item_id,cat_id,merchant_id,brand_id,month,day,action,age_range,gender,province
    328862,323294,833,2882,2661,8,29,0,0,1,内蒙古
    328862,844400,1271,2882,2661,8,29,0,1,1,山西
    328862,575153,1271,2882,2661,8,29,0,2,1,山西
    328862,996875,1271,2882,2661,8,29,0,1,1,内蒙古
    328862,1086186,1271,1253,1049,8,29,0,0,2,浙江
    328862,623866,1271,2882,2661,8,29,0,0,2,黑龙江
    328862,542871,1467,2882,2661,8,29,0,5,2,四川
    328862,536347,1095,883,1647,8,29,0,7,1,吉林
    328862,364513,1271,2882,2661,8,29,0,1,2,贵州
    328862,575153,1271,2882,2661,8,29,0,0,0,陕西
    

      

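    Create the database:

    create database hive;
    -- switch to the new database so that the unqualified user_log in later queries resolves
    use hive;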
    

      

    Create the table:

    CREATE TABLE hive.user_log(
        user_id INT, item_id INT, cat_id INT, merchant_id INT, brand_id INT,
        month STRING, day STRING, action INT, age_range INT, gender INT, province STRING
    )
    COMMENT 'Welcome to xmu dblab,Now create hive.user_log!'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/user/hive/user_log/user_log';
    

      

    (1) Query 10 transaction records:

    select * from user_log limit 10;
    

      

    (2) For long or complex column names, you can use an alias:

    select merchant_id as meri from user_log;
    

      

    (3) Use a nested query (subquery):

    select ul.meri from (select merchant_id as meri from user_log) as ul limit 10;
    

      

    (4) Count the total number of rows:

    select count(*) from user_log;
    

      

    (5) Count distinct values (here, the number of distinct users):

    select count(distinct user_id) from user_log;
    

      

    (6) Use group by to count records that are not duplicated, that is, column combinations that occur exactly once:

    select count(*) from (select user_id,item_id,cat_id,merchant_id,brand_id,action from user_log group by user_id,item_id,cat_id,merchant_id,brand_id,action having count(*)=1)a;
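    Note that having count(*)=1 drops every combination that occurs more than once, so the result is not the number of distinct combinations. The sketch below shows the two related counts for comparison; it assumes the same user_log table and the same six columns, and the aliases a and n are only illustrative.

```sql
-- Number of distinct combinations (each duplicated combination counted once).
select count(*)
from (
    select distinct user_id, item_id, cat_id, merchant_id, brand_id, action
    from user_log
) a;

-- Number of combinations that occur more than once (the duplicated ones).
select count(*)
from (
    select user_id, item_id, cat_id, merchant_id, brand_id, action, count(*) as n
    from user_log
    group by user_id, item_id, cat_id, merchant_id, brand_id, action
    having count(*) > 1
) a;
```

    The exactly-once count plus the more-than-once count should add up to the distinct count.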
    

      

    (7) Count how many users made a purchase on a given day:

    select count(distinct user_id) from user_log where action=2 and month='11' and day='11';
    

    Here an action value of 2 means payment (a purchase), and 1 means the item was added to the cart.

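    (8) Compare purchases by gender on a given day (the ratio is the first count divided by the second):

    -- action=2 restricts the counts to purchases
    select count(*) from user_log where gender=0 and action=2 and month='11' and day='11';
    select count(*) from user_log where gender=1 and action=2 and month='11' and day='11';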
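    The two counts can also be computed in a single pass together with their ratio. This is a minimal sketch using conditional aggregation. It assumes that purchases are the rows with action=2, as noted above. The column aliases are illustrative, and the meaning of gender codes 0 and 1 is not stated in this post, so they are left as plain codes.

```sql
select
    sum(case when gender = 0 then 1 else 0 end) as gender0_cnt,
    sum(case when gender = 1 then 1 else 0 end) as gender1_cnt,
    -- dividing two integer sums with "/" yields a double in Hive
    sum(case when gender = 0 then 1 else 0 end)
        / sum(case when gender = 1 then 1 else 0 end) as ratio_0_to_1
from user_log
where action = 2
  and month = '11'
  and day = '11';
```

    Rows with any other gender value still pass the where clause but contribute 0 to both sums.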
    

      

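    (9) Find users who made two or more purchases on a given day (add an item_id condition to restrict this to a single product):

    select user_id from user_log where action=2 and month='11' and day='11' group by user_id having count(*)>1;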
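    When inspecting the result, it can help to see how many purchases each user made. The following sketch assumes the same table and the same day filter as in (7); the alias purchase_cnt and the limit of 10 are illustrative choices.

```sql
select user_id, count(*) as purchase_cnt
from user_log
where action = 2          -- purchases only
  and month = '11'
  and day = '11'
group by user_id
having count(*) > 1       -- two or more purchases
order by purchase_cnt desc
limit 10;
```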
    

      

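    (10) Count actions per brand. As written, the filter action=2 counts purchases; to count views, filter on the action code for browsing instead:

    select brand_id,count(*) from user_log where action=2 group by brand_id;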
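    To see every action type for one brand at once, the counts can be grouped by action as well. This is a sketch only: brand_id 2661 is taken from the sample rows at the top of this post, and the alias action_cnt is illustrative.

```sql
select brand_id, action, count(*) as action_cnt
from user_log
where brand_id = 2661
group by brand_id, action
order by action;
```

    Each output row is then one (brand, action code) pair, so purchases and other actions can be read off side by side.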
    

      

    References: http://dblab.xmu.edu.cn/blog/1363-2/

    https://blog.csdn.net/cafebar123/article/details/77206889

  • Original article: https://www.cnblogs.com/Allen-rg/p/9270406.html