  • Advanced SQL Series, Part 7: Set Operations in SQL

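    Preface

    Set theory is the foundation of the SQL language, which is why SQL is also called a set-oriented language.

    Introduction: Points to Note About Set Operations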

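    • Note 1: SQL can operate on collections that contain duplicate rows (multisets, or bags), and supports them through the optional ALL keyword

      SQL's set operators come in two forms, one that allows duplicates and one that does not. The results of UNION and INTERSECT contain no duplicate rows, whereas UNION ALL keeps them; ALL has the opposite effect of DISTINCT in a SELECT clause. ALL also helps query performance, because with ALL the DBMS no longer has to sort the rows to eliminate duplicates.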

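    • Note 2: Set operators have an order of precedence

      Standard SQL specifies that INTERSECT has higher precedence than UNION and EXCEPT, so INTERSECT is evaluated first unless parentheses say otherwise.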

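    • Note 3: DBMS vendors implement set operations to different degrees

      MySQL did not support INTERSECT or EXCEPT before version 8.0.31, and Oracle has traditionally used MINUS in place of EXCEPT.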

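    • Note 4: Division has no standard definition

      Of the four arithmetic operations, sum (UNION), difference (EXCEPT), and product (CROSS JOIN) have all been introduced into standard SQL, but quotient (division) has still not made it into the standard.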
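
    Notes 1 to 3 can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 module purely as a convenient test bed; the helper name rows and the literal values are made up for illustration. It assumes SQLite's documented behavior of grouping compound operators from left to right, which is one concrete case of an implementation that differs from the standard precedence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def rows(sql):
    """Run a query and return its rows as a sorted list of tuples."""
    return sorted(conn.execute(sql).fetchall())


# Note 1: UNION removes duplicate rows, UNION ALL keeps them.
union_rows = rows("SELECT 1 AS x UNION SELECT 1")
union_all_rows = rows("SELECT 1 AS x UNION ALL SELECT 1")

# Notes 2 and 3: SQLite groups compound operators from left to right,
# so this is evaluated as (1 UNION 2) INTERSECT 3 rather than the
# standard's 1 UNION (2 INTERSECT 3).
left_to_right = rows("SELECT 1 AS x UNION SELECT 2 INTERSECT SELECT 3")

# Wrapping the INTERSECT in a derived table forces the standard grouping.
standard_grouping = rows(
    "SELECT 1 AS x UNION "
    "SELECT x FROM (SELECT 2 AS x INTERSECT SELECT 3)"
)

print(union_rows)         # [(1,)]
print(union_all_rows)     # [(1,), (1,)]
print(left_to_right)      # []
print(standard_grouping)  # [(1,)]
```

    Because the grouping rules vary between products, spelling out the intended grouping explicitly is the safer habit whenever INTERSECT is mixed with UNION or EXCEPT.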

    Comparing Tables: Checking Set Equality (Basics)

    /* Comparing tables: checking set equality */
    CREATE TABLE Tbl_A
     (keycol  CHAR(1) PRIMARY KEY,
      col_1   INTEGER, 
      col_2   INTEGER, 
      col_3   INTEGER);
    
    CREATE TABLE Tbl_B
     (keycol  CHAR(1) PRIMARY KEY,
      col_1   INTEGER, 
      col_2   INTEGER, 
      col_3   INTEGER);
    
    /* Case: the tables are equal */
    DELETE FROM Tbl_A;
    INSERT INTO Tbl_A VALUES('A', 2, 3, 4);
    INSERT INTO Tbl_A VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_A VALUES('C', 5, 1, 6);
    
    DELETE FROM Tbl_B;
    INSERT INTO Tbl_B VALUES('A', 2, 3, 4);
    INSERT INTO Tbl_B VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_B VALUES('C', 5, 1, 6);
    
    /* Case: row B differs */
    DELETE FROM Tbl_A;
    INSERT INTO Tbl_A VALUES('A', 2, 3, 4);
    INSERT INTO Tbl_A VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_A VALUES('C', 5, 1, 6);
    
    DELETE FROM Tbl_B;
    INSERT INTO Tbl_B VALUES('A', 2, 3, 4);
    INSERT INTO Tbl_B VALUES('B', 0, 7, 8);
    INSERT INTO Tbl_B VALUES('C', 5, 1, 6);
    
    /* Case: the tables contain NULLs (equal) */
    DELETE FROM Tbl_A;
    INSERT INTO Tbl_A VALUES('A', NULL, 3, 4);
    INSERT INTO Tbl_A VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_A VALUES('C', NULL, NULL, NULL);
    
    DELETE FROM Tbl_B;
    INSERT INTO Tbl_B VALUES('A', NULL, 3, 4);
    INSERT INTO Tbl_B VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_B VALUES('C', NULL, NULL, NULL);
    
    /* Case: the tables contain NULLs (row C differs) */
    DELETE FROM Tbl_A;
    INSERT INTO Tbl_A VALUES('A', NULL, 3, 4);
    INSERT INTO Tbl_A VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_A VALUES('C', NULL, NULL, NULL);
    
    DELETE FROM Tbl_B;
    INSERT INTO Tbl_B VALUES('A', NULL, 3, 4);
    INSERT INTO Tbl_B VALUES('B', 0, 7, 9);
    INSERT INTO Tbl_B VALUES('C', 0, NULL, NULL);
    

    If the two tables are equal, then A UNION B = A = B; equivalently, \(A \cup B = A \cap B\).

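    -- Check whether the two tables are exactly equal (first confirm that their row counts match):
    -- if row_cnt equals the row count of each table, the tables are equal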
    SELECT COUNT(*) AS row_cnt FROM 
    ((SELECT * FROM Tbl_A ) 
    UNION 
    (SELECT * FROM Tbl_B)) AS Total;
    

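    The check above relies on the fact that, for any table S, S UNION S = S. This property is called idempotence: no matter how many times the same set is added, the result stays the same.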
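
    As a rough illustration of the row-count check, the following sketch loads the "equal" data set into an in-memory SQLite database through Python's sqlite3 module and then changes row B. The helper names count and union_count are hypothetical, and SQLite is only a stand-in for whichever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tbl_A (keycol CHAR(1) PRIMARY KEY,
                        col_1 INTEGER, col_2 INTEGER, col_3 INTEGER);
    CREATE TABLE Tbl_B (keycol CHAR(1) PRIMARY KEY,
                        col_1 INTEGER, col_2 INTEGER, col_3 INTEGER);
    INSERT INTO Tbl_A VALUES ('A', 2, 3, 4), ('B', 0, 7, 9), ('C', 5, 1, 6);
    INSERT INTO Tbl_B VALUES ('A', 2, 3, 4), ('B', 0, 7, 9), ('C', 5, 1, 6);
""")


def count(sql):
    """Return the single number produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]


def union_count():
    """Row count of Tbl_A UNION Tbl_B."""
    return count(
        "SELECT COUNT(*) FROM "
        "(SELECT * FROM Tbl_A UNION SELECT * FROM Tbl_B)"
    )


# Equal tables: the union collapses to the 3 rows both tables share.
equal_case = union_count()

# Idempotence: S UNION S has exactly as many rows as S.
idempotent = count(
    "SELECT COUNT(*) FROM (SELECT * FROM Tbl_A UNION SELECT * FROM Tbl_A)"
)

# Make row B differ: the union now holds both versions of that row.
conn.execute("UPDATE Tbl_B SET col_3 = 8 WHERE keycol = 'B'")
different_case = union_count()

print(equal_case, idempotent, different_case)  # 3 3 4
```

    The count only signals equality when it matches the row count of both tables, which is why the row counts are worth checking first.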

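    Comparing Tables: Checking Set Equality (Advanced)

    Set theory generally uses one of the following two methods to decide whether two sets are equal:

    • \(A \subset B\) and \(A \supset B\) \(\Leftrightarrow\) \(A = B\)
    • \(A \cup B = A \cap B\) \(\Leftrightarrow\) \(A = B\)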
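    -- A UNION B = A INTERSECT B means A = B; note that INTERSECT is also an idempotent operator
    -- Returns 'equal' when the two tables are equal, otherwise 'not equal'
    -- (both operands of EXCEPT are parenthesized so the result does not depend on operator precedence)
    SELECT CASE WHEN COUNT(*) = 0 
                THEN 'equal' ELSE 'not equal' END AS result
    FROM ((SELECT * FROM Tbl_A UNION SELECT * FROM Tbl_B) 
          EXCEPT 
          (SELECT * FROM Tbl_A INTERSECT SELECT * FROM Tbl_B)) AS TMP;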
    
    -- List the rows that differ between the two tables
    (SELECT * FROM Tbl_A EXCEPT SELECT * FROM Tbl_B)
    UNION ALL
    (SELECT * FROM Tbl_B EXCEPT SELECT * FROM Tbl_A);
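
    The two queries above can be exercised against the "contains NULLs, row C differs" data set. The sketch below does so in SQLite through Python's sqlite3 module, under two assumptions: derived tables replace the parenthesized operands, because SQLite groups compound operators from left to right, and a side column (not in the original query) is added to show which table each differing row comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tbl_A (keycol CHAR(1) PRIMARY KEY,
                        col_1 INTEGER, col_2 INTEGER, col_3 INTEGER);
    CREATE TABLE Tbl_B (keycol CHAR(1) PRIMARY KEY,
                        col_1 INTEGER, col_2 INTEGER, col_3 INTEGER);
    INSERT INTO Tbl_A VALUES ('A', NULL, 3, 4), ('B', 0, 7, 9),
                             ('C', NULL, NULL, NULL);
    INSERT INTO Tbl_B VALUES ('A', NULL, 3, 4), ('B', 0, 7, 9),
                             ('C', 0, NULL, NULL);
""")

# (A UNION B) EXCEPT (A INTERSECT B) is empty exactly when A = B.
EQUALITY_SQL = """
SELECT CASE WHEN COUNT(*) = 0 THEN 'equal' ELSE 'not equal' END
FROM (SELECT * FROM (SELECT * FROM Tbl_A UNION SELECT * FROM Tbl_B)
      EXCEPT
      SELECT * FROM (SELECT * FROM Tbl_A INTERSECT SELECT * FROM Tbl_B))
"""

# Symmetric difference, labeled with the table each row belongs to.
DIFF_SQL = """
SELECT 'A only' AS side, * FROM
    (SELECT * FROM Tbl_A EXCEPT SELECT * FROM Tbl_B)
UNION ALL
SELECT 'B only' AS side, * FROM
    (SELECT * FROM Tbl_B EXCEPT SELECT * FROM Tbl_A)
ORDER BY side
"""

verdict = conn.execute(EQUALITY_SQL).fetchone()[0]
diff_rows = conn.execute(DIFF_SQL).fetchall()

print(verdict)  # not equal
for row in diff_rows:
    print(row)
# ('A only', 'C', None, None, None)
# ('B only', 'C', 0, None, None)
```

    Row A, which contains a NULL in both tables, does not show up in the difference: set operators treat two NULLs as the same value, unlike a comparison with =.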
    

    Implementing Relational Division with Set Difference

    • Nest NOT EXISTS
    • Use a HAVING clause to convert the problem into a one-to-one relationship
    • Turn division into subtraction
    -- Table definitions
    /* Implementing relational division with set difference */
    CREATE TABLE Skills 
    (skill VARCHAR(32),
     PRIMARY KEY(skill));
    
    CREATE TABLE EmpSkills 
    (emp   VARCHAR(32), 
     skill VARCHAR(32),
     PRIMARY KEY(emp, skill));
    
    INSERT INTO Skills VALUES('Oracle');
    INSERT INTO Skills VALUES('UNIX');
    INSERT INTO Skills VALUES('Java');
    
    INSERT INTO EmpSkills VALUES('相田', 'Oracle');
    INSERT INTO EmpSkills VALUES('相田', 'UNIX');
    INSERT INTO EmpSkills VALUES('相田', 'Java');
    INSERT INTO EmpSkills VALUES('相田', 'C#');
    INSERT INTO EmpSkills VALUES('神崎', 'Oracle');
    INSERT INTO EmpSkills VALUES('神崎', 'UNIX');
    INSERT INTO EmpSkills VALUES('神崎', 'Java');
    INSERT INTO EmpSkills VALUES('平井', 'UNIX');
    INSERT INTO EmpSkills VALUES('平井', 'Oracle');
    INSERT INTO EmpSkills VALUES('平井', 'PHP');
    INSERT INTO EmpSkills VALUES('平井', 'Perl');
    INSERT INTO EmpSkills VALUES('平井', 'C++');
    INSERT INTO EmpSkills VALUES('若田部', 'Perl');
    INSERT INTO EmpSkills VALUES('渡来', 'Oracle');
    
    -- Relational division using set difference (division with remainder)
    SELECT DISTINCT emp 
    FROM EmpSkills ES1
    WHERE NOT EXISTS 
    (SELECT skill FROM Skills EXCEPT SELECT skill FROM EmpSkills ES2 WHERE ES1.emp = ES2.emp);
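
    To see what the division query returns, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. It assumes SQLite accepts the correlated EXCEPT subquery as written; the variable names are illustrative. For each employee, the subquery computes "required skills minus this employee's skills", and the employee qualifies when nothing is left over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Skills (skill VARCHAR(32), PRIMARY KEY (skill));
    CREATE TABLE EmpSkills (emp VARCHAR(32), skill VARCHAR(32),
                            PRIMARY KEY (emp, skill));
    INSERT INTO Skills VALUES ('Oracle'), ('UNIX'), ('Java');
    INSERT INTO EmpSkills VALUES
        ('相田', 'Oracle'), ('相田', 'UNIX'), ('相田', 'Java'), ('相田', 'C#'),
        ('神崎', 'Oracle'), ('神崎', 'UNIX'), ('神崎', 'Java'),
        ('平井', 'UNIX'), ('平井', 'Oracle'), ('平井', 'PHP'),
        ('平井', 'Perl'), ('平井', 'C++'),
        ('若田部', 'Perl'),
        ('渡来', 'Oracle');
""")

# Division with remainder: extra skills (such as C#) do not disqualify.
DIVISION_SQL = """
SELECT DISTINCT emp
FROM EmpSkills ES1
WHERE NOT EXISTS
      (SELECT skill FROM Skills
       EXCEPT
       SELECT skill FROM EmpSkills ES2 WHERE ES1.emp = ES2.emp)
"""

qualified = {row[0] for row in conn.execute(DIVISION_SQL)}
print(qualified)
```

    With this data the set contains 相田 and 神崎: both hold all three required skills, and 相田's additional C# skill is the "remainder" that the query ignores.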
    

    Finding Equal Subsets

    /* 4. Finding equal subsets */
    CREATE TABLE SupParts
    (sup  CHAR(32) NOT NULL,
     part CHAR(32) NOT NULL,
     PRIMARY KEY(sup, part));
    
    INSERT INTO SupParts VALUES('A',  'screw');
    INSERT INTO SupParts VALUES('A',  'nut');
    INSERT INTO SupParts VALUES('A',  'pipe');
    INSERT INTO SupParts VALUES('B',  'screw');
    INSERT INTO SupParts VALUES('B',  'pipe');
    INSERT INTO SupParts VALUES('C',  'screw');
    INSERT INTO SupParts VALUES('C',  'nut');
    INSERT INTO SupParts VALUES('C',  'pipe');
    INSERT INTO SupParts VALUES('D',  'screw');
    INSERT INTO SupParts VALUES('D',  'pipe');
    INSERT INTO SupParts VALUES('E',  'fuse');
    INSERT INTO SupParts VALUES('E',  'nut');
    INSERT INTO SupParts VALUES('E',  'pipe');
    INSERT INTO SupParts VALUES('F',  'fuse');
    
    -- Generate all pairs of suppliers
    SELECT SP1.sup AS s1,SP2.sup AS s2
    FROM SupParts SP1,SupParts SP2
    WHERE SP1.sup < SP2.sup
    GROUP BY SP1.sup,SP2.sup;
    
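    -- Keep only the pairs whose number of shared parts equals the number of parts each supplier handles
    SELECT SP1.sup AS s1,SP2.sup AS s2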
    FROM SupParts SP1,SupParts SP2
    WHERE SP1.sup < SP2.sup
    AND SP1.part = SP2.part
    GROUP BY SP1.sup,SP2.sup
    HAVING COUNT(*) = (SELECT COUNT(*) FROM SupParts SP3 WHERE SP3.sup = SP1.sup)
    AND COUNT(*) = (SELECT COUNT(*) FROM SupParts SP4 WHERE SP4.sup = SP2.sup);
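
    The pair-matching query can be checked with a short sketch in SQLite via Python's sqlite3 module. The part names are written in English here and an ORDER BY is added only to make the output order predictable; both are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SupParts (sup CHAR(32) NOT NULL, part CHAR(32) NOT NULL,
                           PRIMARY KEY (sup, part));
    INSERT INTO SupParts VALUES
        ('A', 'screw'), ('A', 'nut'), ('A', 'pipe'),
        ('B', 'screw'), ('B', 'pipe'),
        ('C', 'screw'), ('C', 'nut'), ('C', 'pipe'),
        ('D', 'screw'), ('D', 'pipe'),
        ('E', 'fuse'), ('E', 'nut'), ('E', 'pipe'),
        ('F', 'fuse');
""")

# For each supplier pair, COUNT(*) is the number of parts they share.
# The pair handles identical part sets when that number equals the
# total part count of both suppliers.
EQUAL_SUBSETS_SQL = """
SELECT SP1.sup AS s1, SP2.sup AS s2
FROM SupParts SP1, SupParts SP2
WHERE SP1.sup < SP2.sup
  AND SP1.part = SP2.part
GROUP BY SP1.sup, SP2.sup
HAVING COUNT(*) = (SELECT COUNT(*) FROM SupParts SP3
                   WHERE SP3.sup = SP1.sup)
   AND COUNT(*) = (SELECT COUNT(*) FROM SupParts SP4
                   WHERE SP4.sup = SP2.sup)
ORDER BY s1, s2
"""

pairs = conn.execute(EQUAL_SUBSETS_SQL).fetchall()
print(pairs)  # [('A', 'C'), ('B', 'D')]
```

    Suppliers A and E share two parts, but each of them handles three, so the pair fails the HAVING conditions.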
    

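    Efficient SQL for Deleting Duplicate Rows

    /* 5. Efficient SQL for deleting duplicate rows */
    /* In PostgreSQL, "WITH OIDS" must be appended to the end of the CREATE TABLE statement (this works only before PostgreSQL 12, which removed WITH OIDS) */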
    CREATE TABLE Products
    (name  CHAR(16),
     price INTEGER);
    
    INSERT INTO Products VALUES('apple',   50);
    INSERT INTO Products VALUES('orange', 100);
    INSERT INTO Products VALUES('orange', 100);
    INSERT INTO Products VALUES('orange', 100);
    INSERT INTO Products VALUES('banana',  80);
    
    -- Delete duplicate rows: using a correlated subquery
    DELETE FROM Products
    WHERE rowid < (SELECT MAX(P2.rowid) FROM Products P2 WHERE Products.name = P2.name AND Products.price = P2.price);
    
    -- Efficient SQL for deleting duplicate rows (1): computing the complement with EXCEPT
    DELETE FROM Products
    WHERE rowid IN (SELECT rowid FROM Products EXCEPT SELECT MAX(rowid) FROM Products GROUP BY name,price);
    
    -- Efficient SQL for deleting duplicate rows (2): computing the complement with NOT IN
    DELETE FROM Products
    WHERE rowid NOT IN (SELECT MAX(rowid) FROM Products GROUP BY name,price);
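
    SQLite happens to expose a rowid column as well, so the three DELETE statements can be compared side by side. The sketch below, using Python's sqlite3 module, rebuilds a small Products table for each statement; the function name dedupe and the English product names are assumptions made for this example.

```python
import sqlite3

SETUP = """
    CREATE TABLE Products (name CHAR(16), price INTEGER);
    INSERT INTO Products VALUES
        ('apple', 50), ('orange', 100), ('orange', 100),
        ('orange', 100), ('banana', 80);
"""

DELETE_CORRELATED = """
DELETE FROM Products
WHERE rowid < (SELECT MAX(P2.rowid) FROM Products P2
               WHERE Products.name = P2.name
                 AND Products.price = P2.price)
"""

DELETE_WITH_EXCEPT = """
DELETE FROM Products
WHERE rowid IN (SELECT rowid FROM Products
                EXCEPT
                SELECT MAX(rowid) FROM Products GROUP BY name, price)
"""

DELETE_WITH_NOT_IN = """
DELETE FROM Products
WHERE rowid NOT IN (SELECT MAX(rowid) FROM Products GROUP BY name, price)
"""


def dedupe(delete_sql):
    """Build a fresh table, run one DELETE, and return the surviving rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    conn.execute(delete_sql)
    remaining = conn.execute(
        "SELECT name, price FROM Products ORDER BY name"
    ).fetchall()
    conn.close()
    return remaining


results = [dedupe(sql) for sql in
           (DELETE_CORRELATED, DELETE_WITH_EXCEPT, DELETE_WITH_NOT_IN)]
print(results[0])  # [('apple', 50), ('banana', 80), ('orange', 100)]
```

    All three statements keep the row with the largest rowid in each (name, price) group. The two set-based versions avoid the correlated subquery, which is re-evaluated for each row being examined.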
    

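    Summary

    • Standardization of set operations in SQL has progressed very slowly, so check what your DBMS supports before using them
    • If a set operator is used without the ALL option, duplicate rows are eliminated, and the sort this requires hurts performance
    • UNION and INTERSECT are idempotent, whereas EXCEPT is not
    • Standard SQL has no relational division operator, so you have to implement it yourself
    • Equality of two sets can be checked in two ways: through idempotence or through a one-to-one mapping
    • EXCEPT makes it very easy to compute a complement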

    Exercises

    /* Exercise 1-7-1: improving the "UNION-only comparison" */
    SELECT CASE WHEN COUNT(*) = (SELECT COUNT(*) FROM Tbl_A )
                 AND COUNT(*) = (SELECT COUNT(*) FROM Tbl_B )
                THEN 'equal'
                ELSE 'not equal' END AS result
      FROM ( SELECT * FROM Tbl_A
             UNION
             SELECT * FROM Tbl_B ) TMP;
    
    /* Exercise 1-7-2: exact relational division (solution 1: two set differences) */
    SELECT DISTINCT emp
      FROM EmpSkills ES1
     WHERE NOT EXISTS
            (SELECT skill
               FROM Skills
             EXCEPT
             SELECT skill
               FROM EmpSkills ES2
              WHERE ES1.emp = ES2.emp)
      AND NOT EXISTS
            (SELECT skill
               FROM EmpSkills ES3
              WHERE ES1.emp = ES3.emp
             EXCEPT
             SELECT skill
               FROM Skills );
    
    /* Exercise 1-7-2: exact relational division (solution 2: comparing row counts with HAVING) */
    SELECT emp
      FROM EmpSkills ES1
     WHERE NOT EXISTS
            (SELECT skill
               FROM Skills
             EXCEPT
             SELECT skill
               FROM EmpSkills ES2
              WHERE ES1.emp = ES2.emp)
     GROUP BY emp
    HAVING COUNT(*) = (SELECT COUNT(*) FROM Skills);
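
    Both answers to exercise 1-7-2 should pick out the same employees. The sketch below checks this in SQLite through Python's sqlite3 module; the constant names are illustrative, and it assumes SQLite accepts the correlated EXCEPT subqueries unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Skills (skill VARCHAR(32), PRIMARY KEY (skill));
    CREATE TABLE EmpSkills (emp VARCHAR(32), skill VARCHAR(32),
                            PRIMARY KEY (emp, skill));
    INSERT INTO Skills VALUES ('Oracle'), ('UNIX'), ('Java');
    INSERT INTO EmpSkills VALUES
        ('相田', 'Oracle'), ('相田', 'UNIX'), ('相田', 'Java'), ('相田', 'C#'),
        ('神崎', 'Oracle'), ('神崎', 'UNIX'), ('神崎', 'Java'),
        ('平井', 'UNIX'), ('平井', 'Oracle'), ('平井', 'PHP'),
        ('平井', 'Perl'), ('平井', 'C++'),
        ('若田部', 'Perl'),
        ('渡来', 'Oracle');
""")

# Solution 1: nothing required is missing AND nothing extra is held.
EXACT_BY_DIFFERENCES = """
SELECT DISTINCT emp
FROM EmpSkills ES1
WHERE NOT EXISTS (SELECT skill FROM Skills
                  EXCEPT
                  SELECT skill FROM EmpSkills ES2
                  WHERE ES1.emp = ES2.emp)
  AND NOT EXISTS (SELECT skill FROM EmpSkills ES3
                  WHERE ES1.emp = ES3.emp
                  EXCEPT
                  SELECT skill FROM Skills)
"""

# Solution 2: nothing required is missing AND the skill counts match.
EXACT_BY_COUNT = """
SELECT emp
FROM EmpSkills ES1
WHERE NOT EXISTS (SELECT skill FROM Skills
                  EXCEPT
                  SELECT skill FROM EmpSkills ES2
                  WHERE ES1.emp = ES2.emp)
GROUP BY emp
HAVING COUNT(*) = (SELECT COUNT(*) FROM Skills)
"""

by_differences = {row[0] for row in conn.execute(EXACT_BY_DIFFERENCES)}
by_count = {row[0] for row in conn.execute(EXACT_BY_COUNT)}
print(by_differences, by_count)
```

    With the sample data both sets contain only 神崎. 相田 drops out because of the extra C# skill, which is exactly what distinguishes exact division from division with remainder.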
    
  • Original article: https://www.cnblogs.com/evian-jeff/p/11580818.html