zoukankan      html  css  js  c++  java
  • MySQL 与 Hive 逻辑相关

    MYSQL:

    1. 每一个逗号, 相当于一个中间表, 而且hive的groupby 更为严格 ;

    -> 计算占比
    SELECT branch_company, ROUND(SUM(pass_flag) / (SELECT SUM(pass_flag) FROM wr_test),2) rate FROM wr_test GROUP BY branch_company

    -> 计算排名
    SELECT FIELD, t1.field2, (SELECT DISTINCT(COUNT(field2)) FROM `test` t2 WHERE t2.field2 >= t1.field2 ) rank FROM `test` t1


    Hive:

    -> 不能在逗号后面直接使用() 而必须使用left join 出总值:
    select round( t.s / t2.total , 3) , "" order_city_name, "" dealer_shortname , t.* from
    (
    select branch_company, time_date, time_unit, sum(pass_flag) s from bi_middle.wr_middle_1
    group by branch_company, time_date, time_unit
    ) t
    left join
    (select sum(pass_flag) total from bi_middle.wr_middle_1) t2


    -> 使用grouping sets 代替多个union;

    select t1.*,t1.pass_flag/t2.total as ratio from (
    select nvl(branch_company,'全国') branch_company,order_city_name,dealer_shortname,a.time_date,a.time_unit,sum(pass_flag) as pass_flag from bi_middle.wr_middle_1 a
    group by branch_company,order_city_name,dealer_shortname,a.time_date,a.time_unit
    grouping sets((a.time_date,a.time_unit),(branch_company,a.time_date,a.time_unit),(branch_company,order_city_name,a.time_date,a.time_unit),(branch_company,order_city_name,dealer_shortname,a.time_date,a.time_unit))
    ) t1
    left join (
    select sum(pass_flag) as total from bi_middle.wr_middle_1
    ) t2

    # 注意事项:

    -> Spark-SQL使用JDBC读取表, 默认使用一个executor去读, 如果转成并发, 需要设置分区索引去读;

    1. 创建parquet存储表

    CREATE TABLE IF NOT EXISTS bi_orginal.`param_code_dcs` (
    `CODE_ID` string COMMENT 'CODE ID',
    `TYPE` string COMMENT 'CODE类型',
    `TYPE_NAME` string COMMENT 'CODE类型中文说明',
    `CODE_DESC` string COMMENT 'CODE ID中文说明',
    `NUM` string ,
    `update_time` string
    )
    comment 'param_code_dcs'
    STORED AS parquet

    2.  向表中插入数据


    insert overwrite table bi_orginal.`param_code_dcs`
    SELECT CODE_ID, TYPE, TYPE_NAME, CODE_DESC, NUM, from_unixtime(unix_timestamp(),'yyyy-MM-dd HH:mm:ss') FROM artemisdcs.d_f_artemisdcs_tc_code_dcs

    3.  Hive中的union

    SELECT * from (
    SELECT sso_id from bi_orginal.som_so_user
    UNION ALL
    SELECT sso_id from bi_orginal.som_so_user
    )t LIMIT 5

    4. Hive删表:

    truncate table table_name         先删除表中数据

    drop table table_name               后删除HDFS上数据表

  • 相关阅读:
    ubuntu下安装maven
    159.Longest Substring with At Most Two Distinct Characters
    156.Binary Tree Upside Down
    155.Min Stack
    154.Find Minimum in Rotated Sorted Array II
    153.Find Minimum in Rotated Sorted Array
    152.Maximum Product Subarray
    151.Reverse Words in a String
    150.Evaluate Reverse Polish Notation
    149.Max Points on a Line
  • 原文地址:https://www.cnblogs.com/ruili07/p/10825767.html
Copyright © 2011-2022 走看看