zoukankan      html  css  js  c++  java
  • SQL判断分段的连续值

    有 时候我们需要找出在一组连续的号码中,有哪些是缺失的部分。具体的需求又分为两种情况

    1. 查找 连续部分的最大和最小值
    2. 查找缺少的部分
    经过构造的数据如下
    -- preparation
    drop table test1 purge;
    create table test1(id int,dt date);

    insert into test1
    (id, dt)
    select rownum as id, trunc(add_months(sysdate, rownum), 'mm') as dt
    from dual
    connect by rownum <= 20;
    delete test1 where id in (4,7,13,19);
    commit;

    SQL> select * from test1;

    ID DT
    ---- -----------
    1 2009-4-1
    2 2009-5-1
    3 2009-6-1
    5 2009-8-1
    6 2009-9-1
    8 2009-11-1
    9 2009-12-1
    10 2010-1-1
    11 2010-2-1
    12 2010-3-1
    14 2010-5-1
    15 2010-6-1
    16 2010-7-1
    17 2010-8-1
    18 2010-9-1
    20 2010-11-1

    16 rows selected

    查找 连续部分的最大和最小值

    下面的场景是连续部分的起始和终点,使用NUMBER类型
    -- case 1 number type + rownum
    select min(v2.id) as n1, max(v2.id) as n2
    from (select v1.id, (v1.id - rownum) as delta
    from (select id from test1 order by id) v1) v2
    group by v2.delta
    order by 1;

    首先需要对取出的id值进行排序
    用这个id值和rownum相减,如果差值是相同的,说明这些id是连续的


    N1 N2
    ---------- ----------
    1 3
    5 6
    8 12
    14 18
    20 20

    然后使用分析函数实现 同样的功能。
    -- case 2 number type + analyse func
    with v as
    (select id,
    lag(id) over(order by id) as lag,
    lead(id) over(order by id) as lead
    from test1
    order by id)
    select v1.id, v2.id
    from (select v.id, rownum as rn
    from v
    where (v.id - v.lag) != 1
    or v.lag is null) v1,
    (select v.id, rownum as rn
    from v
    where (v.lead - v.id) != 1
    or v.lead is null) v2
    where v1.rn = v2.rn;

    这里使用WITH构造一个视图,避免了对表的多次扫描
    如果一个ID值和它的上一个值之间的绝对值差不是1,表示发生了跳跃。
    这个ID是某个连续区间的最小或者最大

    相同的需求,不同的是这里使用日期类型。注意,因为构造的数据是以MONTH为单位的,所以这里需要使用months_between()来计算差值。
    -- case 3 date type + rownum
    select min(v2.dt) as n1, max(v2.dt) as n2
    from (select v1.dt,
    months_between(v1.dt, trunc(add_months(sysdate, rownum), 'mm')) as delta
    from (select dt from test1 order by id) v1) v2
    group by v2.delta
    order by 1;

    N1 N2
    ----------- -----------
    2009-4-1 2009-6-1
    2009-8-1 2009-9-1
    2009-11-1 2010-3-1
    2010-5-1 2010-9-1
    2010-11-1 2010-11-1


    查找缺少的部分

    查找缺少的部分,基本思路是在找到连续部分的基础上,构造缺少的值
    -- case 4 find lost value
    with v as
    (select min(v2.id) as n1,
    max(v2.id) as n2,
    lead(min(v2.id)) over(order by min(v2.id)) as n3
    from (select v1.id, (v1.id - rownum) as delta
    from (select id from test1 order by id) v1) v2
    group by v2.delta
    order by 1)
    select t1.sid + t2.rn as lost
    from (select v.n2 as sid, v.n3 as eid from v) t1,
    (select rownum rn from dual connect by rownum <= 50) t2
    where t1.sid + t2.rn < t1.eid;

    使用case1的算法得到连续部分的最大n2,最小值n1
    使用lead函数把某个最大值n2和下一个最小值n3 放到同一行
    使用rownum构造连续的缺少部分


    LOST
    ----------
    4
    7
    13
    19

    另一种查找缺少的方 法,使用外连接
    -- case 5 find lost value
    with v as
    (select min(id) as n1, max(id) as n2 from test1)
    select t2.rn as lost
    from v, test1 t1, (select rownum rn from dual connect by rownum <= 50) t2
    where t2.rn between v.n1 and v.n2
    and t1.id(+) = t2.rn
    and t1.id is null
    order by t2.rn;

    取得表的最大n2和最小n1,这里是全部数据的集合
    构造大表,并且只取得最大最小之间的rownum
    使用外连接取得缺少的值
    注意:和case4相比,这个方法要扫描两次表

    Note: 更复杂的情况,可以参考Partition Outter Join




  • 相关阅读:
    [BZOJ1657] [Usaco2006 Mar] Mooo 奶牛的歌声 (单调栈)
    [BZOJ1016] [JSOI2008] 最小生成树计数 (Kruskal)
    [BZOJ1015] [JSOI2008] 星球大战starwar (并查集)
    [BZOJ1007] [HNOI2008] 水平可见直线 (凸包)
    [BZOJ1061] [Noi2008] 志愿者招募 (费用流)
    [BZOJ1051] [HAOI2006] 受欢迎的牛 (强联通分量)
    BZOJ2299: [HAOI2011]向量
    BZOJ2783: [JLOI2012]树
    BZOJ3521: [Poi2014]Salad Bar
    BZOJ2429: [HAOI2006]聪明的猴子
  • 原文地址:https://www.cnblogs.com/wait4friend/p/2334552.html
Copyright © 2011-2022 走看看