  • Data Reconciliation and Verification Standards (Part 1)

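    At a high level, every table needs two kinds of checks:

    1. Table verification: verify the metric temporary tables and the metric merged tables.

    2. Data verification: verify total data volume and data quality (continuous metrics and discrete metrics).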

    1. Verifying the total volume of the temporary table
    show partitions app.xxx_t_xxx;
    select * from app.xxx_t_xxx where batch_date='<batch_date>' limit 100;
    select batch_date, count(1) from app.xxx_t_xxx group by batch_date;
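
    The per-batch counts are easier to judge next to the previous batch. The query below is a minimal sketch of that comparison, assuming Hive window functions are available; the aliases cnt, prev_cnt and pct_change are illustrative names, not part of the original checks.

    ```sql
    -- Sketch: row count per batch beside the previous batch's count.
    -- pct_change is NULL for the first batch because there is no previous count.
    select
        batch_date
        ,cnt
        ,lag(cnt) over (order by batch_date) as prev_cnt
        ,round((cnt - lag(cnt) over (order by batch_date)) * 100.0
               / lag(cnt) over (order by batch_date), 2) as pct_change
    from (
        select batch_date, count(1) as cnt
        from app.xxx_t_xxx
        group by batch_date
    ) t
    order by batch_date;
    ```

    What counts as an abnormal change depends on the business data, so any threshold applied to pct_change has to be chosen per table.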
    
    2. Verifying the total volume of the merged table
    show partitions app.xxx_r_xxx;
    select * from app.xxx_r_xxx where batch_date='<batch_date>' limit 100;
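
    One possible way to check the merged table's volume is to put its per-batch row counts beside those of the temporary table. This is a sketch under the assumption that both tables carry batch_date; the aliases batch_dt, temp_cnt and merged_cnt are illustrative. Whether the two counts should be equal depends on how the merge is defined, so the query only lines the numbers up.

    ```sql
    -- Sketch: per-batch row counts of the temporary and merged tables side by side.
    -- A NULL in temp_cnt or merged_cnt means the batch is missing from that table.
    select
        coalesce(t.batch_date, r.batch_date) as batch_dt
        ,t.cnt as temp_cnt
        ,r.cnt as merged_cnt
    from (
        select batch_date, count(1) as cnt from app.xxx_t_xxx group by batch_date
    ) t
    full outer join (
        select batch_date, count(1) as cnt from app.xxx_r_xxx group by batch_date
    ) r
    on t.batch_date = r.batch_date
    order by batch_dt;
    ```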
    
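    3. Per-metric statistics for the intermediate table
    a) Non-null fill rate, maximum, and minimum (metric is a placeholder for the column being checked)
    select
        batch_date
        ,count(1)
        ,sum(if(trim(metric)<>'' and metric is not null,1,0))
        ,max(metric)
        ,min(metric)
    from app.xxx_t_xxx group by batch_date order by batch_date;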
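
    The fill rate itself is the filled count divided by the total count. A minimal sketch of computing it directly follows; metric is a placeholder for the column being checked, and the aliases total_cnt, filled_cnt and fill_rate are illustrative.

    ```sql
    -- Sketch: fill rate per batch = non-empty, non-null values / all rows.
    select
        batch_date
        ,count(1) as total_cnt
        ,sum(if(metric is not null and trim(metric) <> '', 1, 0)) as filled_cnt
        ,round(sum(if(metric is not null and trim(metric) <> '', 1, 0)) / count(1), 4) as fill_rate
    from app.xxx_t_xxx
    group by batch_date
    order by batch_date;
    ```

    In Hive the / operator returns a double, so no explicit cast is needed before dividing the two counts.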
    
    b) Count distribution across the values of an enumerated metric
    select batch_date,metric,count(1) from app.xxx_t_xxx group by batch_date,metric order by batch_date,metric;
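
    Raw counts are hard to compare across batches of different sizes, so it can help to express each value as a share of its batch. The following is a sketch only: metric is a placeholder for the enumerated column, the table name should be replaced with the table under test, and the aliases cnt and share are illustrative.

    ```sql
    -- Sketch: each enumerated value's count and its share of the batch total.
    select
        batch_date
        ,metric
        ,cnt
        ,round(cnt / sum(cnt) over (partition by batch_date), 4) as share
    from (
        select batch_date, metric, count(1) as cnt
        from app.xxx_t_xxx
        group by batch_date, metric
    ) t
    order by batch_date, metric;
    ```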
    
    
    4. Per-metric statistics for the merged table

    a) Non-null fill rate, maximum, and minimum
    select
         count(1)
        ,sum(if(trim(metric)<>'' and metric is not null,1,0))
        ,max(metric)
        ,min(metric)
    from app.xxx_r_xxx;
    
    b) Count distribution across the values of an enumerated metric
    select metric,count(1) from app.xxx_r_xxx group by metric order by metric;
  • Original article: https://www.cnblogs.com/wqbin/p/11275008.html