  • Hive: inner join between table A and table B: if a group has data in A, use A's rows; otherwise use B's rows for that group

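    tommyduan_fingerlib: fingerprint library, data at grid-cell level
    tommyduan_mr_grid_cell_result_all: statistics results, data at grid-cell level
    Business requirement:
    tommyduan_mr_grid_cell_result_all takes priority. If a grid (gridid,buildingid,floor) has no cells in tommyduan_mr_grid_cell_result_all, use the cells under that grid (gridid,buildingid,floor) from the fingerprint library;
    otherwise, use the cells under that grid (gridid,buildingid,floor) from tommyduan_mr_grid_cell_result_all.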

    Sample data:

    --fingerprint library
    --gridid1,buildingid1,floor1,cell1
    --gridid1,buildingid1,floor1,cell2
    --gridid1,buildingid1,floor1,cell3

    --gridid2,buildingid1,floor1,cell31
    --gridid2,buildingid1,floor1,cell298

    --statistics results
    --gridid1,buildingid1,floor1,cell2222
    --gridid1,buildingid1,floor1,cell3333

    --merged result:
    --gridid1,buildingid1,floor1,cell2222
    --gridid1,buildingid1,floor1,cell3333
    --gridid2,buildingid1,floor1,cell31
    --gridid2,buildingid1,floor1,cell298

    Approach:

    First, work out which table each group (gridid,buildingid,floor) should take its data from.

    -- One row per (gridid,buildingid,floor) group found in the fingerprint library,
    -- tagged with the table whose cells should be used for that group:
    -- 'fingerlib' when the statistics table has no such group, 'mr_grid_cell' otherwise.
    create table tommyduan_gridcell_group as
    select t10.gridid,t10.buildingid,t10.floor,
        (case when isnull(t11.buildingid) then 'fingerlib' else 'mr_grid_cell' end) as datafrom
    from (select gridid,buildingid,floor from tommyduan_fingerlib group by gridid,buildingid,floor) t10
    left outer join
    (select gridid,buildingid,floor from tommyduan_mr_grid_cell_result_all group by gridid,buildingid,floor) t11
    on t10.gridid=t11.gridid and t10.buildingid=t11.buildingid and t10.floor=t11.floor;
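
To check what this first step produces, here is a minimal sketch that runs the same query shape on the sample data. It uses Python's built-in sqlite3 rather than Hive so it can run anywhere; that substitution, the simplified four-column tables (the `cell` column), and the extra `gridid3` row are assumptions made for illustration. `isnull(x)` is written as `x IS NULL` because SQLite has no `isnull()` function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tommyduan_fingerlib (gridid TEXT, buildingid TEXT, floor TEXT, cell TEXT);
CREATE TABLE tommyduan_mr_grid_cell_result_all (gridid TEXT, buildingid TEXT, floor TEXT, cell TEXT);

INSERT INTO tommyduan_fingerlib VALUES
  ('gridid1','buildingid1','floor1','cell1'),
  ('gridid1','buildingid1','floor1','cell2'),
  ('gridid1','buildingid1','floor1','cell3'),
  ('gridid2','buildingid1','floor1','cell31'),
  ('gridid2','buildingid1','floor1','cell298');

INSERT INTO tommyduan_mr_grid_cell_result_all VALUES
  ('gridid1','buildingid1','floor1','cell2222'),
  ('gridid1','buildingid1','floor1','cell3333'),
  -- hypothetical group that exists only in the statistics table
  ('gridid3','buildingid1','floor1','cell9999');

-- Step 1: tag every fingerprint-library group with its data source.
CREATE TABLE tommyduan_gridcell_group AS
SELECT t10.gridid AS gridid, t10.buildingid AS buildingid, t10.floor AS floor,
       CASE WHEN t11.buildingid IS NULL THEN 'fingerlib' ELSE 'mr_grid_cell' END AS datafrom
FROM (SELECT gridid, buildingid, floor FROM tommyduan_fingerlib
      GROUP BY gridid, buildingid, floor) t10
LEFT OUTER JOIN
     (SELECT gridid, buildingid, floor FROM tommyduan_mr_grid_cell_result_all
      GROUP BY gridid, buildingid, floor) t11
  ON t10.gridid = t11.gridid AND t10.buildingid = t11.buildingid AND t10.floor = t11.floor;
""")

groups = conn.execute(
    "SELECT gridid, datafrom FROM tommyduan_gridcell_group ORDER BY gridid"
).fetchall()
for row in groups:
    print(row)
```

Because the left join is driven from tommyduan_fingerlib, the hypothetical `gridid3` group gets no row in tommyduan_gridcell_group. If groups can exist in the statistics table without a counterpart in the fingerprint library, they would need separate handling; whether that can happen depends on the real data.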

    Second, for each group, join back to the table it was assigned to and pull the rows from there.

    -- First branch: rows from the fingerprint library for groups tagged 'fingerlib'.
    -- Second branch: rows from the statistics table for groups tagged 'mr_grid_cell'.
    select t10.gridid,t10.objectid,t10.longitude,t10.latitude,t10.gridx,t10.gridy,
        t10.floor,t10.avgrsrp,t10.total_num,t10.mr_weak_num,
        t10.avgrsrq,t10.avgsinrul,
        t10.sinrul_total_num,t10.sinrul_low_num,t10.buildingid
    from tommyduan_fingerlib t10
    inner join (select * from tommyduan_gridcell_group where datafrom='fingerlib') t11
        on t10.gridid=t11.gridid and t10.buildingid=t11.buildingid and t10.floor=t11.floor
    union all
    select t10.gridid,t10.objectid,t10.longitude,t10.latitude,t10.gridx,t10.gridy,
        t10.floor,t10.avgrsrp,t10.total_num,t10.mr_weak_num,
        t10.avgrsrq,t10.avgsinrul,
        t10.sinrul_total_num,t10.sinrul_low_num,t10.buildingid
    from tommyduan_mr_grid_cell_result_all t10
    inner join (select * from tommyduan_gridcell_group where datafrom='mr_grid_cell') t11
        on t10.gridid=t11.gridid and t10.buildingid=t11.buildingid and t10.floor=t11.floor;
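
The two steps can be run end to end on the sample data to confirm that they yield the merged result listed earlier. This is a sketch only: it uses Python's built-in sqlite3 in place of Hive, and the tables are cut down to four columns (`cell` standing in for the real measurement columns), both of which are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tommyduan_fingerlib (gridid TEXT, buildingid TEXT, floor TEXT, cell TEXT);
CREATE TABLE tommyduan_mr_grid_cell_result_all (gridid TEXT, buildingid TEXT, floor TEXT, cell TEXT);

INSERT INTO tommyduan_fingerlib VALUES
  ('gridid1','buildingid1','floor1','cell1'),
  ('gridid1','buildingid1','floor1','cell2'),
  ('gridid1','buildingid1','floor1','cell3'),
  ('gridid2','buildingid1','floor1','cell31'),
  ('gridid2','buildingid1','floor1','cell298');

INSERT INTO tommyduan_mr_grid_cell_result_all VALUES
  ('gridid1','buildingid1','floor1','cell2222'),
  ('gridid1','buildingid1','floor1','cell3333');

-- Step 1: decide the source table for each group.
CREATE TABLE tommyduan_gridcell_group AS
SELECT t10.gridid AS gridid, t10.buildingid AS buildingid, t10.floor AS floor,
       CASE WHEN t11.buildingid IS NULL THEN 'fingerlib' ELSE 'mr_grid_cell' END AS datafrom
FROM (SELECT gridid, buildingid, floor FROM tommyduan_fingerlib
      GROUP BY gridid, buildingid, floor) t10
LEFT OUTER JOIN
     (SELECT gridid, buildingid, floor FROM tommyduan_mr_grid_cell_result_all
      GROUP BY gridid, buildingid, floor) t11
  ON t10.gridid = t11.gridid AND t10.buildingid = t11.buildingid AND t10.floor = t11.floor;
""")

# Step 2: pull each group's rows from the table it was assigned to.
merge_sql = """
SELECT t10.gridid, t10.buildingid, t10.floor, t10.cell
FROM tommyduan_fingerlib t10
INNER JOIN (SELECT * FROM tommyduan_gridcell_group WHERE datafrom = 'fingerlib') t11
  ON t10.gridid = t11.gridid AND t10.buildingid = t11.buildingid AND t10.floor = t11.floor
UNION ALL
SELECT t10.gridid, t10.buildingid, t10.floor, t10.cell
FROM tommyduan_mr_grid_cell_result_all t10
INNER JOIN (SELECT * FROM tommyduan_gridcell_group WHERE datafrom = 'mr_grid_cell') t11
  ON t10.gridid = t11.gridid AND t10.buildingid = t11.buildingid AND t10.floor = t11.floor
"""
merged = conn.execute(merge_sql).fetchall()
for row in sorted(merged):
    print(row)
```

The output contains the two statistics rows for gridid1 and the two fingerprint-library rows for gridid2; the three fingerprint-library cells of gridid1 are dropped because that group is tagged 'mr_grid_cell'.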

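    Notes:

    1) If buildingid, gridid, or floor in the inner join condition is NULL, the rows are not matched, even when the value is NULL on both sides;

    2) If rows whose buildingid is NULL on both sides still need to be matched, see the article "Hive & SqlServer: when both sides of an inner join ON condition are NULL, the rows are filtered out of the join result" for a solution.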
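
The NULL behaviour in note 1 can be reproduced with a small sketch. It uses Python's built-in sqlite3 and two hypothetical tables `a` and `b`, each holding one row with a NULL buildingid. The two workarounds shown (a null-safe comparison and COALESCE with a placeholder) are common options and not necessarily the ones the referenced article recommends. SQLite spells the null-safe comparison `IS`; Hive has the `<=>` operator for the same purpose. The placeholder `'__NULL__'` is an assumption and must be a value that never occurs in the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical tables: one row each, buildingid is NULL on both sides
CREATE TABLE a (gridid TEXT, buildingid TEXT, floor TEXT);
CREATE TABLE b (gridid TEXT, buildingid TEXT, floor TEXT);
INSERT INTO a VALUES ('gridid1', NULL, 'floor1');
INSERT INTO b VALUES ('gridid1', NULL, 'floor1');
""")

def count_matches(buildingid_condition):
    """Count joined rows for a given comparison on buildingid."""
    sql = f"""
        SELECT COUNT(*) FROM a
        INNER JOIN b
          ON a.gridid = b.gridid AND {buildingid_condition} AND a.floor = b.floor
    """
    return conn.execute(sql).fetchone()[0]

# Plain equality: NULL = NULL is not true, so the row pair is filtered out.
plain = count_matches("a.buildingid = b.buildingid")

# Null-safe comparison (SQLite: IS; the Hive counterpart is <=>).
null_safe = count_matches("a.buildingid IS b.buildingid")

# Replace NULL with a placeholder on both sides before comparing.
coalesced = count_matches(
    "COALESCE(a.buildingid, '__NULL__') = COALESCE(b.buildingid, '__NULL__')"
)

print(plain, null_safe, coalesced)
```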

  • Original article: https://www.cnblogs.com/yy3b2007com/p/8283957.html