zoukankan      html  css  js  c++  java
  • mysql和oracle 关于多表join的区别

    http://stackoverflow.com/questions/10953143/join-performance-oracle-vs-mysql

    翻译自上面的链接。

    Given a query reduced to the form:

    select b.field1
    from table_a a
        innerjoin table_b b on b.field1 = a.field1
        leftjoin table_c c on c.field1 = a.field1
        leftjoin table_d d on d.field1 = b.field1
        leftjoin table_e e on e.field1 = b.field6
    groupby b.field1,
             b.field2,
             b.field3,
             b.field4,
             b.field5,
    
             e.field2,
             e.field3
    ;

    With a certain amount of data it is running in 20 seconds in Oracle. Nothing is indexed in Oracle. Migrated into MySQL the query does not want to finish (executes in minutes). Every field in question is indexed in MySQL. Explain tells that everything is fine.

    After still not working, the grouping fields got multiple-column indexes. Still nothing.

    What can be the problem that there is still a huge leak in the MySQL performance? Is there a method to speed it up?

    他是说这这样的多表join ,oracle 什么索引都没建,20s出结果了,但是mysql ,索引都建了,几分钟还没出结果来。

    我们这直接看5楼的回复。

    Oracle is able to do hash joins and merge joins, MySQL is not.

    oracle 做了hash join ,而mysql 没有。

    Since your tables are not filtered in any way, hash joins would be the most efficient way to do the joins, especially if you don't have any indexes.

    如果表没用过滤数据,就是说没用where 子句的,用hash join 是非常高效的连接方式。

    With nested loops, even if all join fields are indexed, MySQL needs to do an index seek on each value from the leading table in a loop (each time starting from the root index page), then do the table lookup to retrieve the record, then repeat it for each joined table. This involves lots of random seeks.

    nested loops ,其实就是嵌套循环算法的话,这里不大好翻译。我的理解是每个row都要和外层的做连接,相当于要一个笛卡尔积的连接量,多层的话,数据量会指数次增加。而且涉及了很多随机读。

    A hash join, on the other side, requires scanning the smaller table once (building a hash table) then scanning the bigger table once (searching the hash table built). This involves sequential scans which are much faster.

     而hash join 呢,需要扫描一张小的表,做一个hash table ,用他来和大表进行扫描,里面做的是顺序扫描,因此速度会快的多。

    Also, with nested loops, a left-joined table can only be driven (scanned in the inner loop), while with a hash join tables on either side can be leading (scanned) or driven (hashed then searched). This affects performance too.

    嵌套扫描的话,只有左连接可以被驱动。hash join的话,两边都能被驱动。

    MySQL's optimizer, though does support a couple of handy tricks which other engines lack, has very limited capabilities compared to other engines and currently supports neither hash joins nor merge joins. Thus said, a query like this would most probably be slow on MySQL, even if it's fast on other engines on the same data.

    mysql 优化器不支持此类优化,hash join 和merge joins ,所以和用什么引擎无关,都会很慢。

     此文之前写过。但是没保存,以至于后面丢失了。我以为会进草稿箱,但是遗憾的是没有。吃一陷长一智,下次要养成保存的好习惯。

  • 相关阅读:
    C++ 编写strcpy函数
    JavaScript抽象类及Class.create备忘
    读:<测试一下你解决问题的逻辑思维及算法能力>后
    JavaScript AJAX类
    MOSS ad组的获取及Hashtable作缓存总结
    Js获取元素位置及动态生成元素的练习备忘
    NET许可证及License
    Javascript获取元素位置及其它
    hdu 149850 years, 50 colors 最大匹配
    poj 2513 Colored Sticks 字典树
  • 原文地址:https://www.cnblogs.com/gqdw/p/3166944.html
Copyright © 2011-2022 走看看