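  • MySQL JOIN (4): JOIN Optimization in Practice: Fast Matching

    This post covers how to speed up the matching step of a JOIN. From MySQL JOIN (2): How JOIN Works, we know that joining two tables means repeatedly taking a row from the driving table, finding the rows in the driven table that match it, and joining them. At its core this is a lookup, and the most common way to speed up a lookup is an index. So where should the index go? Let's work it out, starting with the test data: t1 holds 110,000 rows and t2 holds 100.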

        CREATE TABLE t1 (
            id INT PRIMARY KEY AUTO_INCREMENT,
            type INT
        );
        SELECT COUNT(*) FROM t1;
        +----------+
        | COUNT(*) |
        +----------+
        |   110000 |
        +----------+
        CREATE TABLE t2 (
            id INT PRIMARY KEY AUTO_INCREMENT,
            type INT
        );
        SELECT COUNT(*) FROM t2;
        +----------+
        | COUNT(*) |
        +----------+
        |      100 |
        +----------+
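
    To make the "a JOIN is really a repeated lookup" idea concrete, here is a minimal Python sketch. It is only a toy model, not how MySQL is implemented: the function names, the dictionary used as a stand-in for an index, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict


def join_by_scan(driving, driven):
    """Nested-loop join: scan the whole driven table for every driving row."""
    comparisons = 0
    result = []
    for d_id, d_type in driving:
        for r_id, r_type in driven:
            comparisons += 1
            if d_type == r_type:
                result.append((d_id, r_id))
    return result, comparisons


def build_index(rows):
    """Toy stand-in for an index on `type`: maps each type value to row ids."""
    index = defaultdict(list)
    for row_id, row_type in rows:
        index[row_type].append(row_id)
    return index


def join_by_index(driving, driven_index):
    """Same join, but each driving row costs a single index lookup."""
    lookups = 0
    result = []
    for d_id, d_type in driving:
        lookups += 1
        for r_id in driven_index.get(d_type, []):
            result.append((d_id, r_id))
    return result, lookups


# Hypothetical sample data: 1,000 driving rows and 100 driven rows.
driving_rows = [(i, i % 150) for i in range(1000)]
driven_rows = [(i, i) for i in range(100)]

scan_result, scan_cost = join_by_scan(driving_rows, driven_rows)
index_result, index_cost = join_by_index(driving_rows, build_index(driven_rows))

print(sorted(scan_result) == sorted(index_result))  # True: same joined rows
print(scan_cost, index_cost)                        # 100000 1000
```

    Both versions read every driving row once. The only thing the index changes is how much work it takes to find the matches in the driven table, which is why the rest of this post keeps asking which table is the driven one.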

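    Left join

    In a left join, the left table is the driving table and the right table is the driven table. Since the goal is to find matching rows in the driven table quickly, we put the index on the right table to improve join performance.

        -- To start, neither table has an index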
        EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.type=t2.type;
        +----+-------+------+------+--------+----------------------------------------------------+
        | id | table | type | key  | rows   | Extra                                              |
        +----+-------+------+------+--------+----------------------------------------------------+
        |  1 | t1    | ALL  | NULL | 110428 | NULL                                               |
        |  1 | t2    | ALL  | NULL |    100 | Using where; Using join buffer (Block Nested Loop) |
        +----+-------+------+------+--------+----------------------------------------------------+
        -- Try an index on the left table: little improvement
        CREATE INDEX idx_type ON t1(type);
        EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.type=t2.type;
        +----+-------+-------+----------+--------+----------------------------------------------------+
        | id | table | type  | key      | rows   | Extra                                              |
        +----+-------+-------+----------+--------+----------------------------------------------------+
        |  1 | t1    | index | idx_type | 110428 | Using index                                        |
        |  1 | t2    | ALL   | NULL     |    100 | Using where; Using join buffer (Block Nested Loop) |
        +----+-------+-------+----------+--------+----------------------------------------------------+

        -- Index the right table instead: a dramatic improvement.
        -- t2 is now read by ref access on idx_type (rows = 1, Using index).
        DROP INDEX idx_type ON t1;
        CREATE INDEX idx_type ON t2(type);
        EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.type=t2.type;
        +----+-------+------+---------------+----------+--------+-------------+
        | id | table | type | possible_keys | key      | rows   | Extra       |
        +----+-------+------+---------------+----------+--------+-------------+
        |  1 | t1    | ALL  | NULL          | NULL     | 110428 | NULL        |
        |  1 | t2    | ref  | idx_type      | idx_type |      1 | Using index |
        +----+-------+------+---------------+----------+--------+-------------+
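
    Why does the index only pay off on the right table? The rough Python sketch below models a left join. It is an illustration under assumptions (the function name, the dictionary standing in for an index on the right table, and the sample rows are all made up), not MySQL's actual executor. Every left row must be read and emitted whether or not it has a match, so nothing on the left side saves any reads. The savings come only from how fast the matches on the right are found.

```python
def left_join(left_rows, right_index):
    """Toy left join: every left row is read; matches come from an index lookup."""
    left_reads = 0
    output = []
    for left_id, left_type in left_rows:
        left_reads += 1
        matches = right_index.get(left_type)
        if matches:
            for right_id in matches:
                output.append((left_id, left_type, right_id))
        else:
            # A left join keeps unmatched left rows, padded with NULL (None here).
            output.append((left_id, left_type, None))
    return output, left_reads


# Hypothetical sample rows as (id, type).
left_rows = [(1, 10), (2, 20), (3, 30)]
right_rows = [(1, 10), (2, 10), (3, 30)]

# Toy stand-in for an index on the right table's type column.
right_index = {}
for right_id, right_type in right_rows:
    right_index.setdefault(right_type, []).append(right_id)

output, left_reads = left_join(left_rows, right_index)
print(output)      # [(1, 10, 1), (1, 10, 2), (2, 20, None), (3, 30, 3)]
print(left_reads)  # 3
```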

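    Right join

    In a right join, the right table is the driving table and the left table is the driven table. Since the goal is to find matching rows in the driven table quickly, we put the index on the left table to improve join performance.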

        DROP INDEX idx_type ON t2;
        -- Neither table has an index
        EXPLAIN SELECT * FROM t1 RIGHT JOIN t2 ON t1.type=t2.type;
        +----+-------+------+------+--------+----------------------------------------------------+
        | id | table | type | key  | rows   | Extra                                              |
        +----+-------+------+------+--------+----------------------------------------------------+
        |  1 | t2    | ALL  | NULL |    100 | NULL                                               |
        |  1 | t1    | ALL  | NULL | 110428 | Using where; Using join buffer (Block Nested Loop) |
        +----+-------+------+------+--------+----------------------------------------------------+

        -- Index the right table: little improvement
        CREATE INDEX idx_type ON t2(type);
        EXPLAIN SELECT * FROM t1 RIGHT JOIN t2 ON t1.type=t2.type;
        +----+-------+-------+---------------+----------+--------+----------------------------------------------------+
        | id | table | type  | possible_keys | key      | rows   | Extra                                              |
        +----+-------+-------+---------------+----------+--------+----------------------------------------------------+
        |  1 | t2    | index | NULL          | idx_type |    100 | Using index                                        |
        |  1 | t1    | ALL   | NULL          | NULL     | 110428 | Using where; Using join buffer (Block Nested Loop) |
        +----+-------+-------+---------------+----------+--------+----------------------------------------------------+

        -- Index the left table instead: a dramatic improvement
        DROP INDEX idx_type ON t2;
        CREATE INDEX idx_type ON t1(type);
        EXPLAIN SELECT * FROM t1 RIGHT JOIN t2 ON t1.type=t2.type;
        +----+-------+------+---------------+--------------+------+-------------+
        | id | table | type | possible_keys | ref          | rows | Extra       |
        +----+-------+------+---------------+--------------+------+-------------+
        |  1 | t2    | ALL  | NULL          | NULL         |  100 | NULL        |
        |  1 | t1    | ref  | idx_type      | test.t2.type |    5 | Using index |
        +----+-------+------+---------------+--------------+------+-------------+

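    Inner join

    As we know, the MySQL optimizer reorders inner joins: whichever way the query is written, it lets the small table drive the large one. So to optimize an inner join, build the index on the large table.

    One more thing to watch: if the index is on the small table instead, the optimizer judges that letting the large table drive the small one is cheaper, and switches to that order.

    If the small-table-drives-large-table strategy for inner joins is unfamiliar, see MySQL JOIN (3): JOIN Optimization in Practice: Number of Inner-Loop Iterations.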

        DROP INDEX idx_type ON t1;
        -- Neither table has an index; t2 drives t1
        EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.type=t2.type;
        +----+-------+------+------+--------+----------------------------------------------------+
        | id | table | type | key  | rows   | Extra                                              |
        +----+-------+------+------+--------+----------------------------------------------------+
        |  1 | t2    | ALL  | NULL |    100 | NULL                                               |
        |  1 | t1    | ALL  | NULL | 110428 | Using where; Using join buffer (Block Nested Loop) |
        +----+-------+------+------+--------+----------------------------------------------------+
        -- Index t2: the optimizer notices and lets the large table drive the small one
        CREATE INDEX idx_type ON t2(type);
        EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.type=t2.type;
        +----+-------+------+----------+--------+-------------+
        | id | table | type | key      | rows   | Extra       |
        +----+-------+------+----------+--------+-------------+
        |  1 | t1    | ALL  | NULL     | 110428 | Using where |
        |  1 | t2    | ref  | idx_type |      1 | Using index |
        +----+-------+------+----------+--------+-------------+

        -- Index t1 instead: t1 is the large table, so the small table still drives
        -- the large one, and this plan performs better than the one above
        DROP INDEX idx_type ON t2;
        CREATE INDEX idx_type ON t1(type);
        EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.type=t2.type;
        +----+-------+------+---------------+----------+------+-------------+
        | id | table | type | possible_keys | key      | rows | Extra       |
        +----+-------+------+---------------+----------+------+-------------+
        |  1 | t2    | ALL  | NULL          | NULL     |  100 | Using where |
        |  1 | t1    | ref  | idx_type      | idx_type |    5 | Using index |
        +----+-------+------+---------------+----------+------+-------------+
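
    The three plans above can be reproduced with a deliberately naive cost model, sketched below in Python. To be clear about the assumptions: this is not MySQL's real cost model, and `estimated_cost` and `plan` are hypothetical helpers. The sketch simply counts rows examined in a plain nested loop, using the table sizes from this post (110,000 and 100) and the per-lookup row estimates taken from the `rows` column of the EXPLAIN output above (1 for t2, 5 for t1).

```python
T1_ROWS = 110_000  # size of t1 in this post
T2_ROWS = 100      # size of t2 in this post


def estimated_cost(driver_rows, driven_rows, driven_has_index, rows_per_key):
    """Rows examined under a naive nested-loop model (an assumption, not MySQL's)."""
    # With an index, each probe touches only the matching rows;
    # without one, each probe scans the whole driven table.
    per_probe = rows_per_key if driven_has_index else driven_rows
    return driver_rows + driver_rows * per_probe


def plan(index_on):
    """Return the cheaper join order and both estimates for a set of indexed tables."""
    costs = {
        "t1 drives t2": estimated_cost(T1_ROWS, T2_ROWS, "t2" in index_on, 1),
        "t2 drives t1": estimated_cost(T2_ROWS, T1_ROWS, "t1" in index_on, 5),
    }
    best = min(costs, key=costs.get)
    return best, costs


for indexed in (set(), {"t2"}, {"t1"}):
    best, costs = plan(indexed)
    print(sorted(indexed), best, costs[best])
# [] t2 drives t1 11000100
# ['t2'] t1 drives t2 220000
# ['t1'] t2 drives t1 600
```

    Even this crude model picks the same driving table as the optimizer did in all three cases, and it shows why the last plan wins: the small table does the driving and the large table is only ever reached through its index.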

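    Three-table join

    Everything above used two tables, but three-table joins work the same way: work out which tables are driving and which are driven, then build indexes on the driven tables to improve join performance.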
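
    As a rough illustration, the Python sketch below chains two indexed lookups. Everything in it is assumed for the example: the tables a, b and c, their rows, and the helper names are hypothetical, with a driving b and b driving c. Both driven tables get a toy index on the join column.

```python
from collections import defaultdict


def build_index(rows, key):
    """Toy stand-in for an index: map each value of `key` to the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def three_table_join(table_a, index_b, index_c):
    """a drives b, b drives c; each driven table is reached by an index lookup."""
    lookups = 0
    output = []
    for a in table_a:
        lookups += 1  # probe b's index for this row of a
        for b in index_b.get(a["type"], []):
            lookups += 1  # probe c's index for this row of b
            for c in index_c.get(b["type"], []):
                output.append((a["id"], b["id"], c["id"]))
    return output, lookups


# Hypothetical sample tables.
table_a = [{"id": 1, "type": 1}, {"id": 2, "type": 2}, {"id": 3, "type": 9}]
table_b = [{"id": 10, "type": 1}, {"id": 11, "type": 2}, {"id": 12, "type": 2}]
table_c = [{"id": 100, "type": 1}, {"id": 101, "type": 2}]

output, lookups = three_table_join(
    table_a, build_index(table_b, "type"), build_index(table_c, "type")
)
print(output)   # [(1, 10, 100), (2, 11, 101), (2, 12, 101)]
print(lookups)  # 6
```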

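    Summary

    To optimize a JOIN from the fast-matching angle, first work out which table is the driving table and which is the driven table, then build the index on the driven table.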

  • Original article: https://www.cnblogs.com/fudashi/p/7521915.html