
[Original] Sharing an Analytic Function Statistics Case

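Oracle's data warehouse features include many very practical functions. I had come across them for a long time but never actually used them: there were examples, and I understood them when I read them, but I forgot them soon afterwards. Today Cen Minqiang (岑敏强) ran into a scenario that calls for over, so we worked through it together. At first glance I had a vague feeling that some analytic function would be involved, but I did not expect it to be over.
Requirement: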

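The table t_zxxm_fy has three columns: n_lxsbh, n_fyje, and n_mqzt. One n_lxsbh corresponds to multiple n_fyje values, on rows whose n_mqzt values differ. We need to sum n_fyje over several n_lxsbh values, but each n_lxsbh may contribute the n_fyje of only one record; that is, its rows must be sorted by n_mqzt in some order and the first record taken.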

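Put plainly, the matching records are grouped by n_lxsbh (as with group by), each group is sorted by n_mqzt, and the n_fyje of the first record in each group is taken. This is clearly hard to do in a single standard SQL statement. I tried, and the SQL I wrote was complicated (grouping, nesting, subqueries) and still did not fully work. For example, running the following SQL: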

select n_lxsbh, n_fyje, n_mqzt
  from t_zxxm_fy f
 where n_fyje is not null -- exclude rows where n_fyje is null
   and n_fyje <> 0 -- exclude rows where n_fyje is 0
   and n_lxsbh in (-999999999975037, -999999999937175, -999999999937891,
        -999999999937695, -999999999937289) -- the n_lxsbh values to aggregate
 order by n_lxsbh, n_mqzt desc

-----

#        N_LXSBH             N_FYJE        N_MQZT
1        -999999999975037    1680774.28    6 (hit)
2        -999999999975037    966394.92     5
3        -999999999975037    1276123.64    5
4        -999999999975037    1119030.00    5

5        -999999999937891    123727.27     3 (hit)

6        -999999999937695    2000.00       1 (hit)

7        -999999999937289    66137.81      3 (hit)

8        -999999999937175    81186.10      7 (hit)
9        -999999999937175    23739.54      2
10       -999999999937175    96500.00      2
11       -999999999937175    8798789.00    2
12       -999999999937175    94200.00      2
13       -999999999937175    195914.00     2

In other words, the result set we ultimately want is:

#        N_LXSBH             N_FYJE        N_MQZT
1        -999999999975037    1680774.28    6 (hit)
5        -999999999937891    123727.27     3 (hit)
6        -999999999937695    2000.00       1 (hit)
7        -999999999937289    66137.81      3 (hit)
8        -999999999937175    81186.10      7 (hit)
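
For comparison, one plain-SQL way to reach this result set without analytic functions is a correlated subquery on max(n_mqzt). The sketch below is not the author's SQL. It runs on SQLite through Python's sqlite3 module purely so that it can be executed without an Oracle instance, and it loads the sample rows listed above into an in-memory table. It relies on an assumption: the largest n_mqzt in each group is unique, which happens to hold for this sample. With ties it would return several rows per n_lxsbh.

```python
import sqlite3

# Sample rows copied from the first result set: (n_lxsbh, n_fyje, n_mqzt).
ROWS = [
    (-999999999975037, 1680774.28, 6),
    (-999999999975037, 966394.92, 5),
    (-999999999975037, 1276123.64, 5),
    (-999999999975037, 1119030.00, 5),
    (-999999999937891, 123727.27, 3),
    (-999999999937695, 2000.00, 1),
    (-999999999937289, 66137.81, 3),
    (-999999999937175, 81186.10, 7),
    (-999999999937175, 23739.54, 2),
    (-999999999937175, 96500.00, 2),
    (-999999999937175, 8798789.00, 2),
    (-999999999937175, 94200.00, 2),
    (-999999999937175, 195914.00, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t_zxxm_fy (n_lxsbh integer, n_fyje real, n_mqzt integer)"
)
conn.executemany("insert into t_zxxm_fy values (?, ?, ?)", ROWS)

# Keep a row only if its n_mqzt equals the largest n_mqzt of its own n_lxsbh.
plain_sql = """
    select f.n_lxsbh, f.n_fyje, f.n_mqzt
      from t_zxxm_fy f
     where f.n_mqzt = (select max(g.n_mqzt)
                         from t_zxxm_fy g
                        where g.n_lxsbh = f.n_lxsbh)
     order by f.n_lxsbh
"""
hits = conn.execute(plain_sql).fetchall()
conn.close()

for row in hits:
    print(row)
```

The tie problem is the main reason this form is fragile: it has no way to say "exactly one row per group", which is what row_number provides.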

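Here is the SQL that produces the result set above (it is not the SQL the business ultimately needs; the final SQL is given at the end of the post):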

select *
  from (select t_zxxm_fy.n_lxsbh,
               t_zxxm_fy.n_fyje,
               row_number() over(partition by n_lxsbh order by n_mqzt desc) rn
          from (select n_lxsbh, n_fyje, n_mqzt
                  from t_zxxm_fy f
                 where n_lxsbh in (-999999999975037, -999999999937289,
                        -999999999937695,-999999999937891, -999999999937175)
                 order by n_lxsbh, n_mqzt desc) t_zxxm_fy)
where rn = 1;
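
To try the same idea without an Oracle instance, here is a minimal sketch that runs an equivalent query on SQLite (which also supports row_number() over) through Python's sqlite3 module. The in-memory table, the function name top_row_per_group, and the use of SQLite are assumptions made for illustration; the rows are the sample data from the first result set. The redundant inner order by is left out because the over clause does its own ordering.

```python
import sqlite3

# Sample rows copied from the first result set: (n_lxsbh, n_fyje, n_mqzt).
ROWS = [
    (-999999999975037, 1680774.28, 6),
    (-999999999975037, 966394.92, 5),
    (-999999999975037, 1276123.64, 5),
    (-999999999975037, 1119030.00, 5),
    (-999999999937891, 123727.27, 3),
    (-999999999937695, 2000.00, 1),
    (-999999999937289, 66137.81, 3),
    (-999999999937175, 81186.10, 7),
    (-999999999937175, 23739.54, 2),
    (-999999999937175, 96500.00, 2),
    (-999999999937175, 8798789.00, 2),
    (-999999999937175, 94200.00, 2),
    (-999999999937175, 195914.00, 2),
]


def top_row_per_group(rows):
    """Return (n_lxsbh, n_fyje) of the row with the highest n_mqzt per n_lxsbh."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t_zxxm_fy (n_lxsbh integer, n_fyje real, n_mqzt integer)"
    )
    conn.executemany("insert into t_zxxm_fy values (?, ?, ?)", rows)
    sql = """
        select n_lxsbh, n_fyje
          from (select n_lxsbh,
                       n_fyje,
                       row_number() over (partition by n_lxsbh
                                          order by n_mqzt desc) as rn
                  from t_zxxm_fy)
         where rn = 1
         order by n_lxsbh
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


picked = top_row_per_group(ROWS)
for row in picked:
    print(row)
```

Each n_lxsbh contributes exactly one row, even if several rows were to share the top n_mqzt; in that case which of the tied rows gets rn = 1 is not defined unless the order by is extended with a tie-breaker.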

Explanation: since the statistics inevitably involve grouping, we start with a subquery:

(select n_lxsbh, n_fyje, n_mqzt
                  from t_zxxm_fy f
                 where n_lxsbh in (-999999999975037, -999999999937289,
                        -999999999937695,-999999999937891, -999999999937175)
                 order by n_lxsbh, n_mqzt desc) t_zxxm_fy

That is, the innermost query restricts the result set, which reduces the rows the outer query has to process. Then the outer query:

select t_zxxm_fy.n_lxsbh,
               t_zxxm_fy.n_fyje,
               row_number() over(partition by n_lxsbh order by n_mqzt desc) rn
          from (xxxxxxxx)

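applies over to that result set. Put simply, the rows are divided into groups by n_lxsbh and sorted by n_mqzt within each group. This is similar to group by, but there is a difference: group by can return only one record per group, whereas partition keeps every row and merely groups them by a column (try it and you will get a feel for it). Once partitioned, the result set looks like the first result set shown above. The row_number function then assigns a row number to every record within its group, similar to rownum. For example, if I remove the rn = 1 restriction above: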

select *
  from (select t_zxxm_fy.n_lxsbh,
               t_zxxm_fy.n_fyje,
               row_number() over(partition by n_lxsbh order by n_mqzt desc) rn
          from (select n_lxsbh, n_fyje, n_mqzt
                  from t_zxxm_fy f
                 where n_lxsbh in (-999999999975037, -999999999937289,
                        -999999999937695,-999999999937891, -999999999937175)
                 order by n_lxsbh, n_mqzt desc) t_zxxm_fy)

---

#        N_LXSBH             N_FYJE        RN
1        -999999999975037    1680774.28    1
2        -999999999975037    966394.92     2
3        -999999999975037    1276123.64    3
4        -999999999975037    1119030.00    4
5        -999999999937891    123727.27     1
6        -999999999937695    2000.00       1
7        -999999999937289    66137.81      1
8        -999999999937175    81186.10      1
9        -999999999937175    195914.00     2
10       -999999999937175    94200.00      3
11       -999999999937175    8798789.00    4
12       -999999999937175    96500.00      5
13       -999999999937175    23739.54      6

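Every group above now has its own row number rn, which group by does not seem able to do. To take the first row of each group, filter on rn = 1.
That completes the statistics. Pretty powerful, isn't it?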
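
The difference between group by and partition by can be seen by counting rows. The following sketch, again on SQLite via Python's sqlite3 as a stand-in for Oracle (an assumption made so the example is runnable), runs both forms over the 13 sample rows: group by collapses each n_lxsbh into one row, while the windowed query keeps all rows and only annotates them.

```python
import sqlite3

# Sample rows copied from the first result set: (n_lxsbh, n_fyje, n_mqzt).
ROWS = [
    (-999999999975037, 1680774.28, 6),
    (-999999999975037, 966394.92, 5),
    (-999999999975037, 1276123.64, 5),
    (-999999999975037, 1119030.00, 5),
    (-999999999937891, 123727.27, 3),
    (-999999999937695, 2000.00, 1),
    (-999999999937289, 66137.81, 3),
    (-999999999937175, 81186.10, 7),
    (-999999999937175, 23739.54, 2),
    (-999999999937175, 96500.00, 2),
    (-999999999937175, 8798789.00, 2),
    (-999999999937175, 94200.00, 2),
    (-999999999937175, 195914.00, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t_zxxm_fy (n_lxsbh integer, n_fyje real, n_mqzt integer)"
)
conn.executemany("insert into t_zxxm_fy values (?, ?, ?)", ROWS)

# group by: one output row per n_lxsbh, only aggregates are available.
grouped = conn.execute("""
    select n_lxsbh, count(*) as cnt, max(n_mqzt) as top_mqzt
      from t_zxxm_fy
     group by n_lxsbh
     order by n_lxsbh
""").fetchall()

# partition by: every input row survives and gains per-group columns.
windowed = conn.execute("""
    select n_lxsbh,
           n_fyje,
           count(*) over (partition by n_lxsbh) as cnt,
           row_number() over (partition by n_lxsbh
                              order by n_mqzt desc) as rn
      from t_zxxm_fy
     order by n_lxsbh, rn
""").fetchall()
conn.close()

print(len(grouped), "rows from group by")
print(len(windowed), "rows from partition by")
```

Because the individual rows are still present in the windowed result, their n_fyje values remain available, which is what makes the rn = 1 filter possible.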

Next is performance. The final statistics SQL was written like this:

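select
(select sum(n_fyje)
    -- distinct keeps one row per n_lxsbh; first_value repeats the same value
    -- on every row of a partition, so without it the value is summed once per source row
    from (select distinct fy.n_lxsbh,
                 first_value(fy.n_fyje) over(partition by fy.n_lxsbh order by decode(N_MQZT, 3, 1, 7, 2, 6, 3, 1, 4)) as n_fyje
            from t_zxxm_fy fy)
   where n_lxsbh in
         (select n_lxsbh from T_ZXXM_LXS t where t.n_mainxmbh = lxs.n_lxsbh)) as N_YGSSCB
  from T_ZXXM_LXS lxs
 where lxs.n_lxsbh = -999999999975037;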
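
The first_value part of this statement can be explored in isolation. The sketch below is a simplified stand-in, not the final business SQL: it runs on SQLite via Python's sqlite3, drops the T_ZXXM_LXS lookup, and replaces Oracle's decode with a case expression, since SQLite has no decode. The else 99 branch is an assumption that mimics Oracle sorting the NULL returned by decode for unlisted n_mqzt values last in ascending order. Note that first_value returns a value on every row of a partition, so the sketch compares the sum with and without distinct.

```python
import sqlite3

# Sample rows copied from the first result set: (n_lxsbh, n_fyje, n_mqzt).
ROWS = [
    (-999999999975037, 1680774.28, 6),
    (-999999999975037, 966394.92, 5),
    (-999999999975037, 1276123.64, 5),
    (-999999999975037, 1119030.00, 5),
    (-999999999937891, 123727.27, 3),
    (-999999999937695, 2000.00, 1),
    (-999999999937289, 66137.81, 3),
    (-999999999937175, 81186.10, 7),
    (-999999999937175, 23739.54, 2),
    (-999999999937175, 96500.00, 2),
    (-999999999937175, 8798789.00, 2),
    (-999999999937175, 94200.00, 2),
    (-999999999937175, 195914.00, 2),
]

# Priority order 3, 7, 6, 1 as in the decode call; anything else sorts last.
SUM_SQL = """
    select sum(n_fyje)
      from (select {keyword} n_lxsbh,
                   first_value(n_fyje) over (
                       partition by n_lxsbh
                       order by case n_mqzt when 3 then 1 when 7 then 2
                                            when 6 then 3 when 1 then 4
                                            else 99 end) as n_fyje
              from t_zxxm_fy)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t_zxxm_fy (n_lxsbh integer, n_fyje real, n_mqzt integer)"
)
conn.executemany("insert into t_zxxm_fy values (?, ?, ?)", ROWS)

# With distinct: one (n_lxsbh, first n_fyje) pair per group is summed.
deduped_total = conn.execute(SUM_SQL.format(keyword="distinct")).fetchone()[0]
# Without distinct: the first value is counted once per source row.
naive_total = conn.execute(SUM_SQL.format(keyword="")).fetchone()[0]
conn.close()

print("one value per n_lxsbh:", round(deduped_total, 2))
print("one value per source row:", round(naive_total, 2))
```

On this sample the deduplicated sum equals the total of the five "hit" rows, while the version without distinct is several times larger because groups with many rows are over-counted.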


Original article: https://www.cnblogs.com/zhangxsh/p/3494367.html