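Oracle's analytic functions are a great help for reporting and statistics, because they let us avoid subqueries and similar constructs. The windowing clause is the part we tend to use least, so this post focuses on how to use it.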
http://www.itpub.net/thread-1241311-1-1.html
http://www.oracle-base.com/articles/misc/analytic-functions.php#windowing_clause
http://blog.sina.com.cn/s/blog_70cea94b0100xi46.html
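First, the semantics of the window. There are two kinds, RANGE and ROWS. If an ORDER BY is given but the windowing clause is omitted, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; with no ORDER BY either, the window is the whole partition.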
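Value window (RANGE window), e.g. RANGE N PRECEDING: valid only when the sort column is a number or a date. After sorting, the window contains the rows up to the current row whose sort-column value lies within N of the current row's value (not less than the current value - N for an ascending sort, not greater than the current value + N for a descending sort). It is therefore tied to the ORDER BY clause.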
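Row window (ROWS window), e.g. ROWS N PRECEDING: the window is the current row plus the N rows before it. Both kinds also accept the BETWEEN ... AND ... form. For example, ROWS BETWEEN m PRECEDING AND n FOLLOWING means that the window for each row runs from m rows before it to n rows after it; with RANGE, the same bounds are offsets on the sort value rather than row counts.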
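To make the difference between the two window kinds concrete, here is a minimal sketch on a made-up four-row data set (the inline view t and its values are assumptions for illustration only). Note how the two rows with the same value get the same RANGE result but different ROWS results.

```sql
-- Hypothetical sample data: note the duplicate value 20.
WITH t AS (
  SELECT 10 n FROM dual UNION ALL
  SELECT 20 n FROM dual UNION ALL
  SELECT 20 n FROM dual UNION ALL
  SELECT 40 n FROM dual
)
SELECT n,
       -- ROWS: the current row plus the one physical row before it
       SUM(n) OVER (ORDER BY n ROWS  BETWEEN 1  PRECEDING AND CURRENT ROW) AS rows_sum,
       -- RANGE: every row whose n lies in [current n - 10, current n]
       SUM(n) OVER (ORDER BY n RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) AS range_sum
  FROM t
 ORDER BY n;
-- range_sum is 10, 50, 50, 40: both rows with n = 20 see 10 + 20 + 20.
-- rows_sum is 10, 30, 40, 60: the two rows with n = 20 get 30 and 40,
-- and which of the two gets which is not deterministic.
```

The ROWS window counts physical rows, so ties are split arbitrarily; the RANGE window works on values, so rows with equal sort values always share one result.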
    SELECT empno,
           sal,
           mgr,
           deptno,
           SUM(sal) OVER (PARTITION BY deptno
                          ORDER BY sal
                          RANGE BETWEEN 0 PRECEDING AND 100 FOLLOWING) dd
      FROM emp;
Here the rows are partitioned by deptno and sorted by sal in ascending order; for each row, dd is the sum of sal over the rows of the same department whose sal lies between the current sal and the current sal + 100.
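The same value-based window also works on dates. As a small sketch (the inline view t, its dates and the amt column are invented for illustration), a numeric offset on a DATE sort column is interpreted as a number of days:

```sql
-- Hypothetical sample data: three dated amounts.
WITH t AS (
  SELECT DATE '2024-01-01' d, 10 amt FROM dual UNION ALL
  SELECT DATE '2024-01-03' d, 20 amt FROM dual UNION ALL
  SELECT DATE '2024-01-10' d, 30 amt FROM dual
)
SELECT d,
       amt,
       -- Sum of amt over the rows dated within the 7 days up to and including d
       SUM(amt) OVER (ORDER BY d RANGE BETWEEN 7 PRECEDING AND CURRENT ROW) AS sum_7d
  FROM t
 ORDER BY d;
-- sum_7d is 10, 30, 50: the window for 2024-01-10 starts at 2024-01-03,
-- so it picks up 20 + 30 but not the row from 2024-01-01.
```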
Analytic functions are commonly used to compute cumulative, moving, centered, and reporting aggregates.
[Syntax diagrams from the Oracle documentation, not reproduced here: analytic_function, analytic_clause, query_partition_clause, order_by_clause, windowing_clause]
The windowing_clause diagram gives the exact syntax of the window clause and can be used as a reference.
Windowing by value (RANGE); the sort value must be a date or a number
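I have the following requirement:
1. The query result is sorted by value, e.g. with the SQL: select value from t;
Sample result:
50
70
90
130
160
190
2. Group the data. Starting from the first value in the list above, every value within +50 of it belongs to the same group; once a value exceeds that, it starts a new group. For example (value, group):
50 50
70 50
90 50
130 130
160 130
190 190
3. The final result is the number of values in each group. Sample result (group, count):
50 3
130 2
190 1
Original thread: http://www.itpub.net/thread-985707-1-1.html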
    WITH t AS (
      SELECT 50 n FROM dual UNION ALL
      SELECT 70 n FROM dual UNION ALL
      SELECT 90 n FROM dual UNION ALL
      SELECT 130 n FROM dual UNION ALL
      SELECT 160 n FROM dual UNION ALL
      SELECT 190 n FROM dual
    )
    SELECT *
      FROM (SELECT n,
                   -- position of the row in sorted order
                   ROW_NUMBER() OVER (ORDER BY n) rn,
                   -- number of rows with a value in [n, n + 50]
                   COUNT(*) OVER (ORDER BY n RANGE BETWEEN CURRENT ROW AND 50 FOLLOWING) cn
              FROM t)
     START WITH rn = 1
    -- jump from a group's first row to the first row of the next group
    CONNECT BY rn = PRIOR cn + PRIOR rn;
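Here the value-based window counts the values that fall inside each range, and the constructed condition then drives CONNECT BY.
rn is the position of a row and cn is the number of rows in the range that starts at it, so using the prior row's rn + cn as the CONNECT BY condition jumps straight to the first row of the next group. It is a neat idea and well worth studying.
In reporting queries, the job is often just to construct the right scenario and condition.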
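Since the CONNECT BY query above visits only the first row of each group, and cn on that row is already the size of the group, the requested output can be read off directly. The following is a sketch of that last step; the column aliases group_start and group_count are names chosen here for illustration.

```sql
WITH t AS (
  SELECT 50 n FROM dual UNION ALL
  SELECT 70 n FROM dual UNION ALL
  SELECT 90 n FROM dual UNION ALL
  SELECT 130 n FROM dual UNION ALL
  SELECT 160 n FROM dual UNION ALL
  SELECT 190 n FROM dual
)
SELECT n  AS group_start,   -- first value of the group
       cn AS group_count    -- rows with a value in [n, n + 50]
  FROM (SELECT n,
               ROW_NUMBER() OVER (ORDER BY n) rn,
               COUNT(*) OVER (ORDER BY n RANGE BETWEEN CURRENT ROW AND 50 FOLLOWING) cn
          FROM t)
 START WITH rn = 1
CONNECT BY rn = PRIOR cn + PRIOR rn
 ORDER BY group_start;
-- Walk: rn 1 (n = 50, cn = 3) -> rn 4 (n = 130, cn = 2) -> rn 6 (n = 190, cn = 1).
-- Result: (50, 3), (130, 2), (190, 1).
```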
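The same kind of requirement can also be solved so that every row is labelled with its group number. The SQL below does this on a different data set, with a window of 4 instead of 50: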
    WITH t AS (
      SELECT 1 n FROM dual UNION ALL
      SELECT 3 n FROM dual UNION ALL
      SELECT 4 n FROM dual UNION ALL
      SELECT 7 n FROM dual UNION ALL
      SELECT 10 n FROM dual UNION ALL
      SELECT 11 n FROM dual UNION ALL
      SELECT 12 n FROM dual UNION ALL
      SELECT 12 n FROM dual UNION ALL
      SELECT 19 n FROM dual UNION ALL
      SELECT 20 n FROM dual
    )
    SELECT t2.n,
           -- turn the group start values into consecutive group numbers
           DENSE_RANK() OVER (ORDER BY t2.g) g
      FROM (SELECT t.n,
                   -- carry the latest group start forward to every row
                   MAX(t1.n) OVER (ORDER BY t.n) g
              FROM (SELECT n
                      FROM (SELECT n,
                                   COUNT(*) OVER (ORDER BY n RANGE BETWEEN CURRENT ROW AND 4 FOLLOWING) cnt,
                                   ROW_NUMBER() OVER (ORDER BY n) rn
                              FROM t)
                    CONNECT BY rn = PRIOR rn + PRIOR cnt
                     START WITH rn = 1) t1,  -- t1: first value of each group
                   t
             WHERE t1.n(+) = t.n) t2;
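The points to note here are CONNECT BY, the DENSE_RANK function, and the use of MAX(T1.N) OVER(ORDER BY T.N) G: the outer join leaves T1.N null on the rows that do not start a group, and the running MAX carries the latest group start forward onto them.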
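The MAX(...) OVER (ORDER BY ...) idiom is easier to see in isolation. In the sketch below, the column start_n is a hypothetical stand-in for the outer-joined T1.N: it is filled only on rows that start a group and null elsewhere.

```sql
-- Hypothetical data: start_n is set only on the first row of each group.
WITH t AS (
  SELECT 1 n,  1 start_n FROM dual UNION ALL
  SELECT 3 n,  CAST(NULL AS NUMBER) start_n FROM dual UNION ALL
  SELECT 4 n,  CAST(NULL AS NUMBER) start_n FROM dual UNION ALL
  SELECT 7 n,  7 start_n FROM dual UNION ALL
  SELECT 10 n, CAST(NULL AS NUMBER) start_n FROM dual
)
SELECT n,
       start_n,
       -- MAX ignores nulls and the default window runs from the first row
       -- to the current row, so each row picks up the latest start seen so far
       MAX(start_n) OVER (ORDER BY n) AS g
  FROM t
 ORDER BY n;
-- g is 1, 1, 1, 7, 7.
```

This works because the group start values increase along with n, so the running maximum is always the most recent start.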
Below is a solution by an expert that uses recursive WITH; of course, the problem can also be solved with the familiar CONNECT BY.
    WITH t AS (
      SELECT 1 n FROM dual UNION ALL
      SELECT 4 n FROM dual UNION ALL
      SELECT 3 n FROM dual UNION ALL
      SELECT 7 n FROM dual UNION ALL
      SELECT 10 n FROM dual UNION ALL
      SELECT 11 n FROM dual UNION ALL
      SELECT 12 n FROM dual UNION ALL
      SELECT 12 n FROM dual UNION ALL
      SELECT 19 n FROM dual UNION ALL
      SELECT 20 n FROM dual
    ),
    v AS (
      SELECT n, ROW_NUMBER() OVER (ORDER BY n) rn FROM t
    ),
    v1 (flag, n, rn) AS (
      SELECT n, n, rn
        FROM v
       WHERE rn = 1
      UNION ALL
      SELECT CASE
               WHEN v.n - v1.flag >= 5 THEN v.n
               ELSE v1.flag
             END,
             v.n,
             v.rn
        FROM v, v1
       WHERE v.rn = v1.rn + 1
    )
    SELECT * FROM v1;
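The recursive WITH approach can be pointed at the original 50/70/90/130/160/190 requirement as well. The following is a sketch rather than the thread author's code: the test v.n - v1.flag > 50 is an assumed reading of "within +50", and the output aliases group_start and group_count are chosen here for illustration. Recursive WITH needs Oracle 11g Release 2 or later.

```sql
WITH t AS (
  SELECT 50 n FROM dual UNION ALL
  SELECT 70 n FROM dual UNION ALL
  SELECT 90 n FROM dual UNION ALL
  SELECT 130 n FROM dual UNION ALL
  SELECT 160 n FROM dual UNION ALL
  SELECT 190 n FROM dual
),
v AS (
  SELECT n, ROW_NUMBER() OVER (ORDER BY n) rn FROM t
),
v1 (flag, n, rn) AS (
  -- anchor member: the smallest value starts the first group
  SELECT n, n, rn
    FROM v
   WHERE rn = 1
  UNION ALL
  -- recursive member: step to the next row in sorted order and start
  -- a new group once the value is more than 50 above the group start
  SELECT CASE
           WHEN v.n - v1.flag > 50 THEN v.n   -- assumed reading of "+50"
           ELSE v1.flag
         END,
         v.n,
         v.rn
    FROM v, v1
   WHERE v.rn = v1.rn + 1
)
SELECT flag     AS group_start,
       COUNT(*) AS group_count
  FROM v1
 GROUP BY flag
 ORDER BY flag;
-- Result: (50, 3), (130, 2), (190, 1).
```

Compared with the CONNECT BY version, this walks every row instead of jumping from group start to group start, but the group of each row is available directly in flag.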
Others have also implemented this with the MODEL clause; see the original thread for details.