zoukankan      html  css  js  c++  java
  • 笔记:Oracle SQL 高级查询简介 (2) 分析函数

    1、评级函数

    (1). 排序rank()、dense_rank()

    遇到重复的,rank()下一个加2,dense_rank() 下一个加1  

    select s.prod_id, sum(s.amount_sold),
    rank() over (order by sum(s.amount_sold) desc) as rank,
    dense_rank() over (order by sum(s.amount_sold) desc) as dense_rank
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 


    结果:

       PROD_ID SUM(S.AMOUNT_SOLD)       RANK DENSE_RANK
    ---------- ------------------ ---------- ----------
            13          936197.53          7          7
            14         2733887.43          2          2
            15         1368317.88          5          5
            16              11.99         60         60
            17         2239127.88          3          3
            18         5477218.04          1          1
            19          182670.35         20         20
            20          990525.95          6          6
            21         1535187.44          4          4
            22           31853.11         54         54
            23           85211.28         36         36
            24          163929.27         22         22
            25          522713.71         13         13
            26          567533.83         12         12
            27          107968.24         30         30
            28          644480.02          9          9
            29          578374.62         11         11
            30            59391.8         48         48
            31           64464.83         45         45
            32           124081.8         26         26


    还可以通过 NULLS  LAST 或 NULLS  FIRST  控制null放在首位或末位

    select s.prod_id, sum(s.amount_sold),
    rank() over (order by sum(s.amount_sold) desc nulls last) as rank,
    dense_rank() over (order by sum(s.amount_sold) desc nulls last) as dense_rank
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 
    


    与PARTITION BY 子句结合使用 

    select s.prod_id, s.month_id, sum(s.amount_sold),
    rank() over (partition by s.month_id order by sum(s.amount_sold) desc) as rank,
    dense_rank() over (partition by s.month_id order by sum(s.amount_sold) desc) as dense_rank
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id, month_id
    order by s.prod_id, month_id; 


    结果:

       	PROD_ID	MONTH_ID	SUM(S.AMOUNT_SOLD)	RANK	DENSE_RANK
    1	13	01	125575.64	6	6
    2	13	02	122325.21	5	5
    3	13	03	61649.5	6	6
    4	13	04	17404.26	16	16
    5	13	05	61649.5	        6	6
    6	13	06	20004	        16	16
    7	13	07	164719.38	4	4
    8	13	08	125010.33	5	5
    9	13	10	112205.27	6	6
    10	13	12	125654.44	5	5
    11	14	01	239773.24	3	3
    12	14	02	278879.97	2	2
    


    rank()、dense_rank() 等同样可以与rollup、cube、grouping sets 等函数合用


    (2). CUME_DIST 和 PERCENT_RANK 函数

    cume_dist 计算某个值相对于一组值中的位置,即 cumulative distribution (累积分布)。

    percent_rank 某个值相对于一组值的百分比

    select s.prod_id, sum(s.amount_sold),
    cume_dist() over (  order by sum(s.amount_sold) desc) as cume_dist,
    percent_rank() over (  order by sum(s.amount_sold) desc) as percent_rank
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 


    结果:

       	PROD_ID	SUM(S.AMOUNT_SOLD)	CUME_DIST	PERCENT_RANK
    1	13	936197.53	0.116666666666667	0.101694915254237
    2	14	2733887.43	0.0333333333333333	0.0169491525423729
    3	15	1368317.88	0.0833333333333333	0.0677966101694915
    4	16	11.99	        1	                1
    5	17	2239127.88	0.05	                0.0338983050847458
    6	18	5477218.04	0.0166666666666667	0
    7	19	182670.35	0.333333333333333	0.322033898305085
    8	20	990525.95	0.1	                0.0847457627118644
    9	21	1535187.44	0.0666666666666667	0.0508474576271186
    10	22	31853.11	0.9	                0.898305084745763
    11	23	85211.28	0.6	                0.593220338983051
    12	24	163929.27	0.366666666666667	0.355932203389831
    13	25	522713.71	0.216666666666667	0.203389830508475
    14	26	567533.83	0.2	                0.186440677966102
    15	27	107968.24	0.5	                0.491525423728814
    16	28	644480.02	0.15	                0.135593220338983
    17	29	578374.62	0.183333333333333	0.169491525423729
    18	30	59391.8	        0.8	                0.796610169491525
    19	31	64464.83	0.75	                0.745762711864407
    20	32	124081.8	0.433333333333333	0.423728813559322
    21	33	110987.48	0.483333333333333	0.474576271186441
    22	34	106525.01	0.516666666666667	0.508474576271186
    23	35	269009.61	0.283333333333333	0.271186440677966
    24	36	131170.12	0.416666666666667	0.406779661016949
    25	37	293152.28	0.266666666666667	0.254237288135593
    26	38	61209.1	        0.783333333333333	0.779661016949152
    27	39	149108.82	0.4	                0.389830508474576
    28	40	412274.43	0.25	               0.23728813559322 
    29	41	101928.66	0.533333333333333	0.525423728813559
    30	42	96450.49	0.566666666666667	0.559322033898305
    31	43	63514.58	0.766666666666667	0.76271186440678
    32	44	54659.47	0.833333333333333	0.830508474576271
    33	45	122265.54	0.45	                0.440677966101695
    34	46	73544.8    	0.65	                0.644067796610169
    35	47	70288.37	0.716666666666667	0.711864406779661
    36	48	69656.88	0.733333333333333	0.728813559322034
    37	113	91540.88	0.583333333333333	0.576271186440678
    38	114	71070.25	0.683333333333333	0.677966101694915
    39	115	10505.43	0.933333333333333	0.932203389830508
    40	116	56622.64	0.816666666666667	0.813559322033898            
    41	117	40338.17	0.85            	0.847457627118644
    42	118	226875.98	0.3	                0.288135593220339
    43	119	39241.12	0.866666666666667 	0.864406779661017
    44	120	27333.37	0.916666666666667	0.915254237288135
    45	123	163865.43	0.383333333333333	0.372881355932203
    46	124	624.82          0.983333333333333	0.983050847457627
    47	125	117214.73	0.466666666666667	0.457627118644068
    48	126	172907.76	0.35            	0.338983050847458
    49	127	611329.86	0.166666666666667	0.152542372881356
    50	128	221494.36	0.316666666666667	0.305084745762712
    51	129	496483.76	0.233333333333333	0.220338983050847
    52	130	691798.97	0.133333333333333	0.11864406779661
    53	131	70645.28	0.7	                0.694915254237288
    54	132	83353.36	0.616666666666667	0.610169491525424
    55	133	76010        	0.633333333333333	0.627118644067797
    56	136	8989.83         0.95            	0.949152542372881
    57	140	97259.84	0.55	                0.542372881355932
    58	146	35771.66	0.883333333333333	0.88135593220339
    59	147	1685.89	        0.966666666666667	0.966101694915254
    60	148	72058.92	0.666666666666667	0.661016949152542
    


    (3) 分片NTILE

    ntile(n), 划分为n个分片,查找属于哪个分片

    select s.prod_id, sum(s.amount_sold),
    ntile(4) over (  order by sum(s.amount_sold) desc) as ntile
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 


    结果:

       	PROD_ID	SUM(S.AMOUNT_SOLD)	NTILE
    1	13	936197.53	1
    2	14	2733887.43	1
    3	15	1368317.88	1
    4	16	11.99	        4
    5	17	2239127.88	1
    6	18	5477218.04	1
    7	19	182670.35	2
    8	20	990525.95	1
    9	21	1535187.44	1
    10	22	31853.11	4
    11	23	85211.28	3
    12	24	163929.27	2
    13	25	522713.71	1
    14	26	567533.83	1
    15	27	107968.24	2
    16	28	644480.02	1
    17	29	578374.62	1
    18	30	59391.8	        4
    19	31	64464.83	3
    20	32	124081.8	2
    21	33	110987.48	2
    22	34	106525.01	3
    23	35	269009.61	2
    24	36	131170.12	2
    25	37	293152.28	2
    26	38	61209.1	        4
    27	39	149108.82	2
    28	40	412274.43	1
    29	41	101928.66	3
    30	42	96450.49	3
    31	43	63514.58	4
    32	44	54659.47	4
    33	45	122265.54	2
    34	46	73544.8	        3
    35	47	70288.37	3
    36	48	69656.88	3
    37	113	91540.88	3
    38	114	71070.25	3
    39	115	10505.43	4
    40	116	56622.64	4
    41	117	40338.17	4
    42	118	226875.98	2
    43	119	39241.12	4
    44	120	27333.37	4
    45	123	163865.43	2
    46	124	624.82	        4
    47	125	117214.73	2
    48	126	172907.76	2
    49	127	611329.86	1
    50	128	221494.36	2
    51	129	496483.76	1
    52	130	691798.97	1
    53	131	70645.28	3
    54	132	83353.36	3
    55	133	76010	        3
    56	136	8989.83	        4
    57	140	97259.84	3
    58	146	35771.66	4
    59	147	1685.89 	4
    60	148	72058.92	3
    


    (4). ROW_NUMBER
    row_number 从1开始,每个分组返回一个数字。

    select s.prod_id, sum(s.amount_sold),
    row_number()  over (  order by sum(s.amount_sold) desc) as row_number
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 
    


    结果:

       	PROD_ID	SUM(S.AMOUNT_SOLD)	ROW_NUMBER
    1	13	936197.53	7
    2	14	2733887.43	2
    3	15	1368317.88	5
    4	16	11.99	        60
    5	17	2239127.88	3
    6	18	5477218.04	1
    7	19	182670.35	20
    8	20	990525.95	6
    9	21	1535187.44	4
    10	22	31853.11	54
    11	23	85211.28	36
    12	24	163929.27	22
    13	25	522713.71	13
    14	26	567533.83	12
    15	27	107968.24	30
    16	28	644480.02	9
    17	29	578374.62	11
    18	30	59391.8  	48
    19	31	64464.83	45
    20	32	124081.8	26
    21	33	110987.48	29
    22	34	106525.01	31
    23	35	269009.61	17
    24	36	131170.12	25
    25	37	293152.28	16
    26	38	61209.1 	47
    27	39	149108.82	24
    28	40	412274.43	15
    29	41	101928.66	32
    30	42	96450.49	34
    31	43	63514.58	46
    32	44	54659.47	50
    33	45	122265.54	27
    34	46	73544.8  	39
    35	47	70288.37	43
    36	48	69656.88	44
    37	113	91540.88	35
    38	114	71070.25	41
    39	115	10505.43	56
    40	116	56622.64	49
    41	117	40338.17	51
    42	118	226875.98	18
    43	119	39241.12	52
    44	120	27333.37	55
    45	123	163865.43	23
    46	124	624.82 	        59
    47	125	117214.73	28
    48	126	172907.76	21
    49	127	611329.86	10
    50	128	221494.36	19
    51	129	496483.76	14
    52	130	691798.97	8
    53	131	70645.28	42
    54	132	83353.36	37
    55	133	76010           38
    56	136	8989.83	        57
    57	140	97259.84	33
    58	146	35771.66	53
    59	147	1685.89	        58
    60	148	72058.92	40
    


    2、反百分点函数

    使用反百分比函数可以获得对应某个百分点的值。

    percenttile_disc、percentile_cont  与 cume_dist、percent_rank 作用相反。

    select percentile_cont(0.6) within group (  order by sum(s.amount_sold) desc) as percentile_cont,
    percentile_disc(0.6) within group (  order by sum(s.amount_sold) desc) as percentile_disc
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 


    结果:

       	PERCENTILE_CONT	PERCENTILE_DISC
    1	84468.112	85211.28


    3、窗口函数

    (1)、累积和: rows between unbounded preceding and current row

    计算某年1月到12月累计销量:

    select s.month_id, sum(s.amount_sold), 
    sum(sum(s.amount_sold)) over 
               (order by s.month_id  rows between unbounded preceding and current row)  
               as cumulative_amount
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.month_id
    order by s.month_id; 

    结果

       	MONTH_ID	SUM(S.AMOUNT_SOLD)	CUMULATIVE_AMOUNT
    1	01	2277420.49	2277420.49
    2	02	2372690.87	4650111.36
    3	03	1830572.64	6480684
    4	04	1975978.3	8456662.3
    5	05	1748287.23	10204949.53
    6	06	1869728.61	12074678.14
    7	07	1932282.28	14006960.42
    8	08	1972532.63	15979493.05
    9	09	2167008.19	18146501.24
    10	10	2236464.53	20382965.77
    11	11	1959664.22	22342629.99
    12	12	1741284.96	24083914.95
    

    (2)、移动平均值 rows between N preceding and current row

    某年当月与前三个月之间的移动平均值:

    select s.month_id, sum(s.amount_sold), 
    avg(sum(s.amount_sold)) over 
               (order by s.month_id  rows between 3 preceding and current row)  
               as moving_avg
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.month_id
    order by s.month_id; 


    结果:

    1	01	2277420.49	2277420.49
    2	02	2372690.87	2325055.68
    3	03	1830572.64	2160228
    4	04	1975978.3	2114165.575
    5	05	1748287.23	1981882.26
    6	06	1869728.61	1856141.695
    7	07	1932282.28	1881569.105
    8	08	1972532.63	1880707.6875
    9	09	2167008.19	1985387.9275
    10	10	2236464.53	2077071.9075
    11	11	1959664.22	2083917.3925
    12	12	1741284.96	2026105.475
    


    (3)、中心平均值: between N preceding and N following

    计算当月与前后1月的中心平均值

    select s.month_id, sum(s.amount_sold), 
    avg(sum(s.amount_sold)) over 
               (order by s.month_id  rows between 1 preceding and 1 following )  
               as moving_avg
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.month_id
    order by s.month_id; 


    结果:

       	MONTH_ID	SUM(S.AMOUNT_SOLD)	MOVING_AVG
    1	01	2277420.49	2325055.68
    2	02	2372690.87	2160228
    3	03	1830572.64	2059747.27
    4	04	1975978.3	1851612.72333333
    5	05	1748287.23	1864664.71333333
    6	06	1869728.61	1850099.37333333
    7	07	1932282.28	1924847.84
    8	08	1972532.63	2023941.03333333
    9	09	2167008.19	2125335.11666667
    10	10	2236464.53	2121045.64666667
    11	11	1959664.22	1979137.90333333
    12	12	1741284.96	1850474.59
    


    (4)、FIRST_VALUE 、 LAST_VALUE

    FIRST_VALUE 、 LAST_VALUE 获取窗口的首行、末行。

    select s.month_id, sum(s.amount_sold), 
    first_value(sum(s.amount_sold)) over 
               (order by s.month_id  rows between 1 preceding and 1 following )  
               as last_month_amount,
    last_value(sum(s.amount_sold)) over 
               (order by s.month_id  rows between 1 preceding and 1 following )  
               as next_month_amount           
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.month_id
    order by s.month_id; 


    结果:

       	MONTH_ID	SUM(S.AMOUNT_SOLD)	LAST_MONTH_AMOUNT	NEXT_MONTH_AMOUNT
    1	01	2277420.49	2277420.49	2372690.87
    2	02	2372690.87	2277420.49	1830572.64
    3	03	1830572.64	2372690.87	1975978.3
    4	04	1975978.3	1830572.64	1748287.23
    5	05	1748287.23	1975978.3	1869728.61
    6	06	1869728.61	1748287.23	1932282.28
    7	07	1932282.28	1869728.61	1972532.63
    8	08	1972532.63	1932282.28	2167008.19
    9	09	2167008.19	1972532.63	2236464.53
    10	10	2236464.53	2167008.19	1959664.22
    11	11	1959664.22	2236464.53	1741284.96
    12	12	1741284.96	1959664.22	1741284.96
    


    4、报表函数

    包括 sum、avg、max、min、count、variance、stddev、ratio_to_report等。


    (1)、总计报表

    计算每月销量总和以及所有产品销量总和

    select s.prod_id, s.month_id, 
    sum(sum(s.amount_sold)) over (partition by s.month_id) as month_total,
    sum(sum(s.amount_sold)) over (partition by s.prod_id) as prod_total
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id, s.month_id
    order by s.prod_id, s.month_id; 


    结果:

       , PROD_ID, MONTH_ID, MONTH_TOTAL, PROD_TOTAL
    1  13  01  2277420.49  936197.53
    2  13  02  2372690.87  936197.53
    3  13  03  1830572.64  936197.53
    4  13  04  1975978.3    936197.53
    5  13  05  1748287.23  936197.53
    6  13  06  1869728.61  936197.53
    7  13  07  1932282.28  936197.53
    8  13  08  1972532.63  936197.53
    9  13  10  2236464.53  936197.53
    10  13  12  1741284.96  936197.53
    11  14  01  2277420.49  2733887.43
    12  14  02  2372690.87  2733887.43
    13  14  03  1830572.64  2733887.43
    14  14  04  1975978.3  2733887.43
    15  14  05  1748287.23  2733887.43
    16  14  06  1869728.61  2733887.43
    17  14  07  1932282.28  2733887.43
    18  14  08  1972532.63  2733887.43
    19  14  09  2167008.19  2733887.43
    20  14  10  2236464.53  2733887.43
    21  14  11  1959664.22  2733887.43
    22  14  12  1741284.96  2733887.43
    23  15  01  2277420.49  1368317.88
    24  15  02  2372690.87  1368317.88
    25  15  03  1830572.64  1368317.88
    26  15  04  1975978.3  1368317.88
    27  15  05  1748287.23  1368317.88
    28  15  06  1869728.61  1368317.88
    29  15  07  1932282.28  1368317.88
    30  15  08  1972532.63  1368317.88
    31  15  09  2167008.19  1368317.88
    32  15  10  2236464.53  1368317.88
    33  15  11  1959664.22  1368317.88
    34  15  12  1741284.96  1368317.88
    35  16  03  1830572.64  11.99
    36  17  01  2277420.49  2239127.88
    37  17  02  2372690.87  2239127.88
    38  17  03  1830572.64  2239127.88
    39  17  04  1975978.3  2239127.88
    40  17  05  1748287.23  2239127.88
    41  17  06  1869728.61  2239127.88
    42  17  07  1932282.28  2239127.88
    43  17  08  1972532.63  2239127.88
    44  17  09  2167008.19  2239127.88
    45  17  10  2236464.53  2239127.88
    46  17  11  1959664.22  2239127.88
    47  17  12  1741284.96  2239127.88
    48  18  01  2277420.49  5477218.04
    49  18  02  2372690.87  5477218.04
    50  18  03  1830572.64  5477218.04
    51  18  04  1975978.3  5477218.04
    52  18  05  1748287.23  5477218.04
    53  18  06  1869728.61  5477218.04
    54  18  07  1932282.28  5477218.04
    55  18  08  1972532.63  5477218.04
    56  18  09  2167008.19  5477218.04
    57  18  10  2236464.53  5477218.04
    58  18  11  1959664.22  5477218.04
    59  18  12  1741284.96  5477218.04
    60  19  01  2277420.49  182670.35


    (2)、RATIO_TO_REPORT 函数

    RATIO_TO_REPORT 可用来计算某个值在一组值的总和中所占的比例。

    计算每个产品每个月销量总和,以及该类型在整月中的比例。

    select s.prod_id, s.month_id, 
    sum(s.amount_sold) as prod_month_total,
    ratio_to_report(sum(s.amount_sold)) over (partition by s.month_id) as prod_ratio
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id, s.month_id
    order by s.prod_id, s.month_id; 
     

    结果:


       	PROD_ID	MONTH_ID	PROD_MONTH_TOTAL	PROD_RATIO
    1	13	01	125575.64	0.0551394178419814
    2	13	02	122325.21	0.0515554771785336
    3	13	03	61649.5	        0.0336777130024187
    4	13	04	17404.26	0.00880792061329823
    5	13	05	61649.5 	0.0352627983217609
    6	13	06	20004	        0.0106988789137692
    7	13	07	164719.38	0.085246023163862
    8	13	08	125010.33	0.0633755447685547
    9	13	10	112205.27	0.0501708247525839
    10	13	12	125654.44	0.0721619050795684
    11	14	01	239773.24	0.105282814944727
    12	14	02	278879.97	0.117537422816483
    


    5、LAG与LEAD函数

    LAG 与 LEAD 函数获得位于距当前指定距离处那条记录中的数据。

    如获得前一个月 与后一个月的数据:

    select s.month_id, sum(s.amount_sold), 
    lag(sum(s.amount_sold)) over (  order by s.month_id) as last_month_sum,
    lead(sum(s.amount_sold)) over ( order by s.month_id) as next_month_sum
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.month_id
    order by s.month_id; 

    结果:

       MONTH_ID	SUM(S.AMOUNT_SOLD) LAST_MONTH_SUM  NEXT_MONTH_SUM
    1	01	2277420.49		        2372690.87
    2	02	2372690.87	2277420.49	1830572.64
    3	03	1830572.64	2372690.87	1975978.3
    4	04	1975978.3	1830572.64	1748287.23
    5	05	1748287.23	1975978.3	1869728.61
    6	06	1869728.61	1748287.23	1932282.28
    7	07	1932282.28	1869728.61	1972532.63
    8	08	1972532.63	1932282.28	2167008.19
    9	09	2167008.19	1972532.63	2236464.53
    10	10	2236464.53	2167008.19	1959664.22
    11	11	1959664.22	2236464.53	1741284.96
    12	12	1741284.96	1959664.22	
    


    6、FIRST函数与LAST函数

    first 和 last 返回排序分组中的第一个或者最后一个值。

    如查询某年销量最高的、最低的月份。

    select min(s.month_id) keep (dense_rank first order by sum(s.amount_sold)) as min_month,
    max(s.month_id) keep (dense_rank last order by sum(s.amount_sold)) as max_month
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.month_id
    order by s.month_id; 


    结果:

        MIN_MONTHMAX_MONTH
    112                02


    7、线性回归函数



    8、假设评级及分布函数

    假象评级与分布函数可以计算一条新记录在表中的排名和百分比,与 rank(),dense_rank(), percent_rank(),  cume_dist() 等连用。

    例如,查找 sum(amount) 为20004 的假想排名和百分比排名

    select rank(20004) within group (  order by sum(s.amount_sold) desc) as rank,
    percent_rank(20004) within group (  order by sum(s.amount_sold) desc) as pencent_rank
    from all_sales s where s.year=1998 and s.amount_sold is not null
    group by s.prod_id
    order by s.prod_id; 

    结果:

       RANK PENCENT_RANK
    1560.916666666666667



  • 相关阅读:
    Ubuntu上搭建Watir-Webdriver与Cucumber环境
    使--no-ri --no-rdoc成为gem安装的默认选项
    Ruby require 路径问题
    【原创】LoadRunner Java Vuser脚本的配置和调试指南
    【原创】LoadRunner Java Vuser开发环境配置指南
    【原创】使用Nmon_Analyzer处理较大nmon文件的方法
    2014年,马上要上班啦,希望一切顺利
    关于.jar的文件在cmd中无法连接数据库的问题
    ios 的通知机制
    ios开发小技巧-用宏化简代码
  • 原文地址:https://www.cnblogs.com/leeeee/p/7276071.html
Copyright © 2011-2022 走看看