zoukankan      html  css  js  c++  java
  • MYSQL Nested Join Optimization

      table_factor的语法和标准sql比较,后者只接受table_reference,每个逗号项都等于一个inner Join,e.g.

    SELECT * FROM t1 LEFT JOIN (t2, t3, t4)
                     ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)

    等于:

    SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4)
                     ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)

       在mysql中,cross join 语法上等效与 inner join(他们可以互相替换),在标准sql中,他们并不等效,

       一般情况下,在只包含inner join操作的join表达式中,圆括号可以被忽略,但外连接时,省略圆括号会带来不一样的结果

    t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.b=t3.b OR t2.b IS NULL)
       ON t1.a=t2.a

      转变:

    (t1 LEFT JOIN t2 ON t1.a=t2.a) LEFT JOIN t3
        ON t2.b=t3.b OR t2.b IS NULL

       然而,这两个表达式并不等效,假设:

     1: 表t1包含 行(1),(2);
    2: 表t2包含行(1,101);
    3: 表t3包含行(101);


    在这种情况下,第一个表达式返回的结果集包括(1,1,101,101) ,(2,null,null,null),然而第二表达式返回(1,1,101,101),(2,null,null,101);

    mysql> select * from 
        ->    t1 left join 
        ->      (t2 left join t3 on t2.b = t3.b or t2.b is null )
        ->         on t1.a = t2.a;
    +------+------+------+------+
    | a    | a    | b    | b    |
    +------+------+------+------+
    |    1 |    1 |  101 |  101 |
    |    2 | NULL | NULL | NULL |
    +------+------+------+------+
    2 rows in set (0.02 sec)
    
    mysql> select * from
        ->    (t1 left join t2 on t1.a = t2.a)
        ->       left join t3
        ->           on t2.b = t3.b or t2.b is null;
    +------+------+------+------+
    | a    | a    | b    | b    |
    +------+------+------+------+
    |    1 |    1 |  101 |  101 |
    |    2 | NULL | NULL |  101 |
    +------+------+------+------+
    2 rows in set (0.00 sec)


      <============================================================================>

    t1 LEFT JOIN (t2, t3) ON t1.a=t2.a

     该表达式不能转变成:

    t1 LEFT JOIN t2 ON t1.a=t2.a, t3.
    Database changed
    mysql> select * from t1 left join (t2,t3) on t1.a = t2.a;
    +------+------+------+------+
    | a    | a    | b    | b    |
    +------+------+------+------+
    |    1 |    1 |  101 |  101 |
    |    2 | NULL | NULL | NULL |
    +------+------+------+------+
    2 rows in set (0.00 sec)
    
    mysql> select * from t1 left join t2 on t1.a = t2.a, t3;
    +------+------+------+------+
    | a    | a    | b    | b    |
    +------+------+------+------+
    |    1 |    1 |  101 |  101 |
    |    2 | NULL | NULL |  101 |
    +------+------+------+------+
    2 rows in set (0.00 sec)

    不同的结果集

        因此,如果我们去掉有outer join操作的join表达式的圆括号,我们会改变原始表达式的结果集。

        更加准确的,我们不能去掉左外连接的右括号操作,和右外连接的左括号操作。换句话,我们不能去掉外连接操作中的内表表达式的括号,外表的其他操作括号可以被去掉;

    (t1,t2) LEFT JOIN t3 ON P(t2.b,t3.b)

    等效于:

    t1, t2 LEFT JOIN t3 ON P(t2.b,t3.b)
     
    SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.b=t3.b) ON t1.a=t2.a
      WHERE t1.a > 1
    
    SELECT * FROM t1 LEFT JOIN (t2, t3) ON t1.a=t2.a
      WHERE (t2.b=t3.b OR t2.b IS NULL) AND t1.a > 1

    这梁查询都包含内连接:

    t2 LEFT JOIN t3 ON t2.b=t3.b
    t2, t3
    SELECT * FROM T1 INNER JOIN T2 ON P1(T1,T2)
                     INNER JOIN T3 ON P2(T2,T3)
      WHERE P(T1,T2,T3).

    该表达式的执行顺序:

    FOR each row t1 in T1 {
      FOR each row t2 in T2 such that P1(t1,t2) {
        FOR each row t3 in T3 such that P2(t2,t3) {
          IF P(t1,t2,t3) {
             t:=t1||t2||t3; OUTPUT t;
          }
        }
      }
    }
    SELECT * FROM T1 LEFT JOIN
                  (T2 LEFT JOIN T3 ON P2(T2,T3))
                  ON P1(T1,T2)
      WHERE P(T1,T2,T3).

    顺序:

    FOR each row t1 in T1 {
      BOOL f1:=FALSE;
      FOR each row t2 in T2 such that P1(t1,t2) {
        BOOL f2:=FALSE;
        FOR each row t3 in T3 such that P2(t2,t3) {
          IF P(t1,t2,t3) {
            t:=t1||t2||t3; OUTPUT t;
          }
          f2=TRUE;
          f1=TRUE;
        }
        IF (!f2) {
          IF P(t1,t2,NULL) {
            t:=t1||t2||NULL; OUTPUT t;
          }
          f1=TRUE;
        }
      }
      IF (!f1) {
        IF P(t1,NULL,NULL) {
          t:=t1||NULL||NULL; OUTPUT t;
        }
      }
    }

        

        在一般情况下,对于在一外联接操作的任何嵌套循环的第一内表,一个标志被引入在循环之前处于关闭并在循环之后被检查。该标志被打开时,用于从外部表的当前行从代表内操作数表中找到一个匹配的。如果在循环周期结束的标志仍然关闭,没有发现匹配的外部表的当前行。在这种情况下,该行被用于内表的列补充NULL值。结果行被传递到最后的检查为输出或到下一个嵌套循环,但仅当行满足所有嵌入式外的连接条件。

         对于外连接的查询,优化器可以只选择为循环表外循环在内部表之前的顺序。因此,我们外部连接的查询,只有一个嵌套顺序是可能的。对于下面的查询,优化器将评估两个不同的嵌套:

    SELECT * T1 LEFT JOIN (T2,T3) ON P1(T1,T2) AND P2(T1,T3)
      WHERE P(T1,T2,T3)
    FOR each row t1 in T1 {
      BOOL f1:=FALSE;
      FOR each row t2 in T2 such that P1(t1,t2) {
        FOR each row t3 in T3 such that P2(t1,t3) {
          IF P(t1,t2,t3) {
            t:=t1||t2||t3; OUTPUT t;
          }
          f1:=TRUE
        }
      }
      IF (!f1) {
        IF P(t1,NULL,NULL) {
          t:=t1||NULL||NULL; OUTPUT t;
        }
      }
    }

    和:

    FOR each row t1 in T1 {
      BOOL f1:=FALSE;
      FOR each row t3 in T3 such that P2(t1,t3) {
        FOR each row t2 in T2 such that P1(t1,t2) {
          IF P(t1,t2,t3) {
            t:=t1||t2||t3; OUTPUT t;
          }
          f1:=TRUE
        }
      }
      IF (!f1) {
        IF P(t1,NULL,NULL) {
          t:=t1||NULL||NULL; OUTPUT t;
        }
      }
    }

    如果P可以拆分:

    P(T1,T2,T2) = C1(T1) AND C2(T2) AND C3(T3).
    

    算法会变成

    FOR each row t1 in T1 such that C1(t1) {
      FOR each row t2 in T2 such that P1(t1,t2) AND C2(t2)  {
        FOR each row t3 in T3 such that P2(t2,t3) AND C3(t3) {
          IF P(t1,t2,t3) {
             t:=t1||t2||t3; OUTPUT t;
          }
        }
      }
    }

         每个合取C1(T1)的,C 2(T2),C 3(T3)被压出的最内环到最外循环在那里可以进行评估。如果C1(T 1)是一个非常严格的条件下,此条件下推可能大大从传递给内表T1减少行数。其结果是,在执行时间的查询可能提高极大。

  • 相关阅读:
    简明Python3教程 12.问题解决
    简明Python3教程 11.数据结构
    【SPOJ 694】Distinct Substrings
    【codeforces Manthan, Codefest 17 C】Helga Hufflepuff's Cup
    【CF Manthan, Codefest 17 B】Marvolo Gaunt's Ring
    【CF Manthan, Codefest 17 A】Tom Riddle's Diary
    【SPOJ 220】 PHRASES
    【POJ 3261】Milk Patterns
    【POJ 3294】Life Forms
    【POJ 1226】Substrings
  • 原文地址:https://www.cnblogs.com/onlysun/p/4525570.html
Copyright © 2011-2022 走看看