zoukankan      html  css  js  c++  java
  • 窗口函数使用举例

    窗口函数对于一些统计应用情况有非常好的使用效果,本文主要举例使用常用的几种窗口函数。

    定义

      A window function performs a calculation across a set of table rows that are somehow related to the current row.

    数据准备

    CREATE TABLE empsalary(
      depname varchar,
      empno bigint,
      salary int,
      enroll_date date
    );
    INSERT INTO empsalary VALUES('develop',10, 5200, '2007/08/01');
    INSERT INTO empsalary VALUES('sales', 1, 5000, '2006/10/01');
    INSERT INTO empsalary VALUES('personnel', 5, 3500, '2007/12/10');
    INSERT INTO empsalary VALUES('sales', 4, 4800, '2007/08/08');
    INSERT INTO empsalary VALUES('sales', 6, 5500, '2007/01/02');
    INSERT INTO empsalary VALUES('personnel', 2, 3900, '2006/12/23');
    INSERT INTO empsalary VALUES('develop', 7, 4200, '2008/01/01');
    INSERT INTO empsalary VALUES('develop', 9, 4500, '2008/01/01');
    INSERT INTO empsalary VALUES('sales', 3, 4800, '2007/08/01');
    INSERT INTO empsalary VALUES('develop', 8, 6000, '2006/10/01');
    INSERT INTO empsalary VALUES('develop', 11, 5200, '2007/08/15');

    操作

    1.row_number():返回行号,对比值重复时行号不重复不间断, 即返回 1,2,3,4,5....,不返回 1,2,2,4...

    select row_number() over(),* from empsalary limit 2;

     

    select row_number() over(),* from empsalary limit 2 offset 2;

     

    --按depname分组,salary排序,注意红色记录行号不间断
    select row_number() over(partition by depname order by salary),* from empsalary;

    2.rank():返回行号,对比值重复时行号重复并间断, 即返回 1,2,2,4...

    select rank() over(partition by depname order by salary),* from empsalary;

    3.dense_rank():返回行号,对比值重复时行号重复但不间断, 即返回 1,2,2,3

    select dense_rank() over(partition by depname order by salary),* from empsalary;

    4.percent_rank():从当前开始, 计算在分组中的比例 (行号-1)*(1/(总记录数-1))

    select percent_rank() over(partition by depname order by salary),* from empsalary;

    5.cume_dist() :返回行数除以记录数值

    select ROUND((cume_dist() over(partition by depname order by salary))::NUMERIC,2) AS cume_dist,* from empsalary;

    6.ntile(分组数量):让所有记录按分组数目尽可以的均匀分布

    select ntile(3) over(partition by depname order by salary),* from empsalary;

    7.lag(value any [, offset integer [, default any ]]):返回偏移量值, offset integer 是偏移值, 正数时前值, 负数时后值, 没有取到值时用 default 代替;所谓正偏移即当前行的值salary按照偏移量offset偏移到下面对应的行,负偏移同。见红色标识

    select lag(salary,1,null) over(partition by depname order by enroll_date),* from empsalary;

    8.lead(value any [, offset integer [, default any ]]):返回偏移量值, offset integer 是偏移值, 正数时取后值,负数时取前值, 没有取到值时用 default 代替。

    select lead(salary,1,2) over(partition by depname order by enroll_date),* from empsalary;

     

    9.first_value(value any)返回第一值

    select first_value(salary) over(partition by depname order by enroll_date),* from empsalary;

    10.last_value(value any)返回最后值

    select last_value(salary) over(partition by depname order by enroll_date),* from empsalary;

    从返回结果看,似乎有问题,默认情况下, 带了 order by 参数会从分组的起始值开始一直叠加, 直到当前值不同为止,通过修改分组的统计范围就可以实现 order by 参数取最后值.

    select last_value(salary) over(partition by depname order by enroll_date range between unbounded preceding and unbounded following),* FROM empsalary;

     11.nth_value(value any, nth integer):返回窗口框架中的指定值,如nth_value(salary,2),则表示返回字段salary的第二个窗口函数值

    select nth_value(salary,2) over(partition by depname order by enroll_date),* from empsalary;

    12.同时调用多个窗口函数可用下面别名简化写法

    select sum(salary) over w,avg(salary) over w,* from empsalary window w as (partition by depname order by enroll_date);
    与下面写法同:
    select sum(salary) over(partition by depname order by enroll_date),avg(salary) over(partition by depname order by enroll_date),* from empsalary;

     
  • 相关阅读:
    mouseover 有一个多次触发的问题
    2019牛客多校第一场 H.XOR
    luoguP4570 [BJWC2011]元素(线性基)
    线性基
    2019牛客多校第七场 E.Find the median
    2019牛客多校第七场
    2019牛客多校第五场
    支配树
    2019牛客多校第四场
    Codeforces 1195E OpenStreetMap(单调队列)
  • 原文地址:https://www.cnblogs.com/space-place/p/6594424.html
Copyright © 2011-2022 走看看