zoukankan      html  css  js  c++  java
  • 使用4中不同的方式找出连续三年获奖的人

    其实这个问题就是数据中的“island”孤岛数据的问题,只是要求连续长度是3。首先避开这个问题不谈,看看下来的数字:

    1 2 3 5 7 8 9 11 13 18 22 27 28 29 30 40 100 102 107 108 109

    现在有如下需求:找出连续的数字开始和结束的位置。

    很显然,比划比划这个结果就是:

    1    3
    7    9
    27    30
    107    109

    那么现在寻找一种方式,把这个结果用sql查出来。首先造数据:

    ;WITH Num(N) AS(
    	SELECT 1 UNION ALL
    	SELECT 2 UNION ALL
    	SELECT 3 UNION ALL
    	SELECT 5 UNION ALL
    	SELECT 7 UNION ALL
    	SELECT 8 UNION ALL
    	SELECT 9 UNION ALL
    	SELECT 11 UNION ALL
    	SELECT 13 UNION ALL
    	SELECT 18 UNION ALL
    	SELECT 22 UNION ALL
    	SELECT 27 UNION ALL
    	SELECT 28 UNION ALL
    	SELECT 29 UNION ALL
    	SELECT 30 UNION ALL
    	SELECT 40 UNION ALL
    	SELECT 100 UNION ALL
    	SELECT 102 UNION ALL
    	SELECT 107 UNION ALL
    	SELECT 108 UNION ALL
    	SELECT 109)

    先把这些数字当中连续的边界找出来(1,3,5,7,9,11,13,18,22,27,30,40,100,102,107,109)。查找的条件如下:不存在比当前小的数或者比当前大的数。如下:

    cte1(N) AS(		
    	SELECT N
    	FROM Num n1
    	WHERE NOT EXISTS(
    			SELECT 'x'
    			FROM Num n2
    			WHERE n2.N = n1.N - 1)
    		OR
    		NOT EXISTS(
    			SELECT 'x'
    			FROM Num n3
    			WHERE n3.N = n1.N + 1))

    既然找到了边界,接下来我对两个边界进行配对,只不过这对值的第二值是比第一边界值大但是最小的那个。

    SELECT N, (
    		SELECT MIN(N)
    		FROM cte1 c2
    		WHERE c2.N > c1.N
    		) Big
    	FROM cte1 c1

    配对成功,接下了我只需要找存在配对当中的行就可以了。如下:

    SELECT *, Big - N  + 1 AS count
    FROM(
    	SELECT N, (
    		SELECT MIN(N)
    		FROM cte1 c2
    		WHERE c2.N > c1.N
    		) Big
    	FROM cte1 c1)
     AS dr1
    WHERE EXISTS(
    	SELECT 'x'
    	FROM Num c1
    	WHERE c1.N > dr1.N AND c1.N < dr1.Big
    )

    这样就得到想要的结果了。现在回过来看看“连续3年获奖的人”,用一样的原理可以得到第一种方式(最慢的方式)。

    首先造点需要的数据:

    ;WITH Prize(Name, Year) AS(
    	SELECT 'Kim', '2001' UNION ALL
    	SELECT 'Tim', '2002' UNION ALL
    	SELECT 'Tang', '2002' UNION ALL
    	SELECT 'Jack', '2001' UNION ALL
    	SELECT 'Juicy', '2001' UNION ALL
    	SELECT 'Peff', '2002' UNION All
    	SELECT 'Juicy', '2002' UNION ALL
    	SELECT 'Kim', '2002' UNION ALL
    	SELECT 'Juicy', '2003' UNION ALL
    	SELECT 'Peff', '2003' UNION ALL
    	SELECT 'Kim', '2003' UNION ALL
    	SELECT 'Tim', '2004' UNION ALL
    	SELECT 'Juicy', '2004' UNION ALL
    	SELECT 'Jack', '2005' UNION ALL
    	SELECT 'Eric', '2005' UNION ALL
    	SELECT 'Tim', '2005' UNION ALL
    	SELECT 'Eric', '2006' UNION ALL
    	SELECT 'Peff', '2006' UNION ALL
    	SELECT 'Juicy', '2007' UNION ALL
    	SELECT 'Eric', '2007' UNION ALL
    	SELECT 'Tang', '2007' UNION ALL
    	SELECT 'Tang', '2008' UNION ALL
    	SELECT 'Peff', '2008' UNION ALL
    	SELECT 'Kim', '2008' UNION ALL
    	SELECT 'Jack', '2009' UNION ALL
    	SELECT 'Tang', '2009' UNION ALL
    	SELECT 'Kim', '2009')
    第一种方式为:
    ,cte1 AS(
    SELECT * 
    FROM Prize p1
    WHERE NOT EXISTS(
    	SELECT 'x'
    	FROM Prize p2
    	WHERE p1.Name = p2.Name AND p2.Year = p1.Year - 1) 
    	OR
    	NOT EXISTS(
    	SELECT 'x'
    	FROM Prize p3
    	WHERE p3.Name = p1.Name AND p3.Year = p1.Year + 1)
    )
    
    SELECT *
    FROM (
    	SELECT Name, Year, (
    		SELECT MIN(Year)
    		FROM cte1 c2
    		WHERE c2.Name = c1.Name AND c2.Year > c1.Year
    	) AS Bigger 
    	FROM cte1 c1
    	--ORDER BY Name, Year
    ) AS Dr1
    WHERE EXISTS(
    	SELECT 'x'
    	FROM Prize p
    	WHERE p.Name = Dr1.Name AND p.Year > Dr1.Year AND p.Year < Dr1.Bigger
    ) 
    AND CAST(Bigger AS INT) - CAST(Year AS INT) + 1 >= 3
    ORDER BY Name, Year
    
    第二种方式,凡是连续三年获奖的人,那么year上一定存在三个连续的值,那么我就用两个exists条件去查询,如下:
    	SELECT DISTINCT Name
    	FROM Prize p1
    	WHERE EXISTS(
    		SELECT 'x'
    		FROM Prize p2
    		WHERE p2.Name = p1.Name AND p2.Year = p1.Year + 1)
    		AND EXISTS(
    		SELECT 'x'
    		FROM Prize p3
    		WHERE p3.Name = p1.Name AND p3.Year = p1.Year + 2)
    第三种方式,这种方式从《Microsoft SQL Server 2005技术内幕:T-SQL查询》中需来,博客园有篇文章对此总结,不再赘述。
    SELECT DISTINCT Name
    FROM (
    	SELECT  *, (SELECT MIN(p2.Year)
    		FROM Prize p2
    		WHERE p2.Name = p.Name and p2.Year >= p.Year AND
    			NOT EXISTS(
    				SElECT 'x'
    				FROM Prize p3
    				WHERE p3.Name = p2.Name AND p3.Year = p2.Year + 1)) Big
    	FROM Prize p
    	) AS Dr
    WHERE CAST(Dr.Big AS INT) - CAST(Dr.Year AS INT) + 1 >= 3
    第四种方式,如果我按照‘年’对表进行旋转,会得到什么结果?首先试试:
    	SELECT * 
    	FROM Prize P PIVOT(
    		COUNT(Year) FOR Year IN( [2001], [2002], [2003], [2004], [2005], [2006], [2007], [2008], [2009])
    		) AS PD

    那么我可以得到结果:

    _thumb3

    既然有了个结果,接下来就好处理了,我把三个连续的年列组成一组,凡是都为1的一定是3年都获奖的人。如下:
    SELECT Name
    FROM (
    	SELECT * 
    	FROM Prize P PIVOT(
    		COUNT(Year) FOR Year IN( [2001], [2002], [2003], [2004], [2005], [2006], [2007], [2008], [2009])
    		) AS PD
    ) AS D
    WHERE (D.[2001] = 1 AND D.[2002] = 1 AND D.[2003] = 1) OR
    	  (D.[2002] = 1 AND D.[2003] = 1 AND D.[2004] = 1) OR
    	  (D.[2003] = 1 AND D.[2004] = 1 AND D.[2005] = 1) OR
    	  (D.[2004] = 1 AND D.[2005] = 1 AND D.[2006] = 1) OR
    	  (D.[2005] = 1 AND D.[2006] = 1 AND D.[2007] = 1) OR
    	  (D.[2006] = 1 AND D.[2007] = 1 AND D.[2008] = 1) OR
    	  (D.[2007] = 1 AND D.[2008] = 1 AND D.[2009] = 1)
    这种方式是非常快的,比以上任何方式都要快,但是有它的局限性,以上的[2001]……[2009]全部都是hard的。如过表中的年很多,需要在IN后面加上年份,然后在WHERE条件
    中进行连续3年的组合,如果表中有n个年份,那么where条件中一种有n-2个条件。有时为了效率,多写点代码是值得的。
  • 相关阅读:
    第二周作业
    第一周作业
    抓老鼠啊~亏了还是赚了?
    打印沙漏
    寒假作业2
    我与老师
    自我介绍
    第九周课程总结&实验报告(七)
    第八周课程总结&实验报告(六)
    第七周实验报告(五)&周总结
  • 原文地址:https://www.cnblogs.com/fgynew/p/1671575.html
Copyright © 2011-2022 走看看