zoukankan      html  css  js  c++  java
  • 开发随笔——NOT IN vs NOT EXISTS

    原文出处: http://blog.csdn.net/dba_huangzj/article/details/31374037  转载请引用

    之前在论坛中见到一个针对in/exists的讨论。原帖懒得找了,这里介绍一下近期的学习小结:

    NOT IN和NOT EIXTS在对同意为null的列查询时会有一定的风险。特别是NOT IN,假设子查询包括了最少一个NULL,会出现非预期的结果。以下做一个演示。

    IF OBJECT_ID('ShipmentItems', 'U') IS NOT NULL 
        DROP TABLE dbo.ShipmentItems; 
    GO 
    CREATE TABLE dbo.ShipmentItems 
        ( 
          ShipmentBarcode VARCHAR(30) NOT NULL , 
          Description VARCHAR(100) NULL , 
          Barcode VARCHAR(30) NOT NULL 
        ); 
    GO 
    INSERT  INTO dbo.ShipmentItems 
            ( ShipmentBarcode , 
              Barcode , 
              Description 
            ) 
            SELECT  '123456' , 
                    '1010203' , 
                    'Some cool widget' 
            UNION ALL 
            SELECT  '123654' , 
                    '1010203' , 
                    'Some cool widget' 
            UNION ALL 
            SELECT  '123654' , 
                    '1010204' , 
                    'Some cool stuff for some gadget'; 
    GO 
    -- retrieve all the items from shipment 123654 
    -- that are not shipped in shipment 123456 
    SELECT  Barcode 
    FROM    dbo.ShipmentItems 
    WHERE   ShipmentBarcode = '123654' 
            AND Barcode NOT IN ( SELECT Barcode 
                                 FROM   dbo.ShipmentItems 
                                 WHERE  ShipmentBarcode = '123456' ); 
    /* 
    Barcode 
    ------------------------------ 
    1010204
    */


    能够看出得到了期待结果。以下看看改动表结构,同意列为null的情况:

    ALTER TABLE dbo.ShipmentItems 
    ALTER COLUMN Barcode VARCHAR(30) NULL; 
    INSERT  INTO dbo.ShipmentItems 
            ( ShipmentBarcode , 
              Barcode , 
              Description 
            ) 
            SELECT  '123456' , 
                    NULL , 
                    'Users manual for some gadget'; 
    GO 
    SELECT  Barcode 
    FROM    dbo.ShipmentItems 
    WHERE   ShipmentBarcode = '123654' 
            AND Barcode NOT IN ( SELECT Barcode 
                                 FROM   dbo.ShipmentItems 
                                 WHERE  ShipmentBarcode = '123456' ); 
    /* 
    Barcode 
    ------------------------------
    */


    非常多人会认为这是一个bug,有时候能查出数据。有时候却不能。可是实际上不是bug,当NOT IN子句返回最少一个NULL时,查询会返回空。以下的语句能更好地说明这个想法:

    SELECT  CASE WHEN 1 NOT IN ( 2, 3 ) THEN 'True' 
                 ELSE 'Unknown or False' 
            END , 
            CASE WHEN 1 NOT IN ( 2, 3, NULL ) THEN 'True' 
                 ELSE 'Unknown or False' 
            END; 
    /* 
    ---- ---------------- 
    True Unknown or False 
    */


    实际上,因为IN的本质是OR操作,所以:


    SELECT  CASE WHEN 1 IN ( 1, 2, NULL ) THEN 'True' 
                 ELSE 'Unknown or False' 
            END ;

    中,1 in 1。也就是为TRUE,所以返回true。这个语句的逻辑实际上是:

    SELECT  CASE WHEN ( 1 = 1 ) 
                      OR ( 1 = 2 ) 
                      OR ( 1 = NULL ) THEN 'True' 
                 ELSE 'Unknown or False' 
            END ;
     


    当使用NOT IN 时,如以下的语句:

    SELECT  CASE WHEN 1 NOT IN ( 1, 2, NULL ) THEN 'True' 
                 ELSE 'Unknown or False' 
            END ;


    会转变成:

    SELECT  CASE WHEN NOT ( ( 1 = 1 ) 
                            OR ( 1 = 2 ) 
                            OR ( 1 = NULL ) 
                          ) THEN 'True' 
                 ELSE 'Unknown or False' END ;


    依据离散数学的概念,能够转换为:

    SELECT  CASE WHEN ( ( 1 <> 1 ) 
                        AND ( 1 <> 2 ) 
                        AND ( 1 <> NULL ) 
                      ) THEN 'True' 
                 ELSE 'Unknown or False' 
            END ;


    谓词有短路特性,即在AND条件中。仅仅要有一个条件为false,整个条件都为false,而1<>1是为false,所以后面的也不须要推断了,直接返回else部分。即使是1<>null,依据集合论的特性,NULL和实际数据的对照总是返回unknown,所以也是为false。假设你非要用NOT IN 。请确保子查询永远不会有NULL返回。或者须要额外处理去除NULL,比方:

    SELECT  Barcode 
    FROM    dbo.ShipmentItems 
    WHERE   ShipmentBarcode = '123654' 
      AND Barcode NOT IN ( SELECT Barcode 
                           FROM   dbo.ShipmentItems 
                           WHERE  ShipmentBarcode = '123456' 
                             AND Barcode IS NOT NULL ) ;


    另一种方法就是改写语句,用NOT EXISTS来等价替换:

    SELECT  i.Barcode 
    FROM    dbo.ShipmentItems AS i 
    WHERE   i.ShipmentBarcode = '123654' 
            AND NOT EXISTS ( SELECT * 
                             FROM   dbo.ShipmentItems AS i1 
                             WHERE  i1.ShipmentBarcode = '123456' 
                                    AND i1.Barcode = i.Barcode ); 
    /* 
    Barcode 
    ------------------------------ 
    1010204
    */


    另外,基于SARG要求,一般不建议用NOT IN/NOT EXISTS这样的反向扫描,避免影响性能。另一个选择使用IN/EXISTS的要点。就是多列匹配的问题,在T-SQL中,多列同一时候匹配要用EXISTS,而单列匹配能够用EXISTS/IN。可能能够用其它写法来实现IN的多列匹配,可是一般我个人会选择使用EXISTS来匹配多列。


    原文出自:CSDN博客:黄钊吉的博客

  • 相关阅读:
    取得GridPanel选定行所有字段值
    从少林寺的核心竞争力看软件作坊和正规军的差异
    估算软件项目成本
    Delphi FastReport报表常用方法
    少林寺的组织结构分析
    embed标签使用
    C# 多种方式播放Wav声音
    js 关闭浏览器
    ExtJS GridPanel根据条件改变字体颜色
    [Jsp] 如何在JSP页面快速输出从Servlet接收的数据_看JSTL常用C标签的威力
  • 原文地址:https://www.cnblogs.com/yangykaifa/p/6957823.html
Copyright © 2011-2022 走看看