zoukankan      html  css  js  c++  java
  • MySQL优化 exists/in改写join

    问题语句 

    复制代码
    SELECT * FROM a 
    WHERE `type` = 'appointment' 
    AND `event` = 14 
    AND EXISTS ( 
    SELECT * FROM b 
    WHERE a.`sheet_id` = b.`id` 
    AND `company_id` = 8 
    AND b.`deleted_at` IS NULL ) 
    ORDER BY a.id DESC 
    LIMIT 6;
    复制代码

    解读执行计划

    在exists类型的子查询的执行计划中,select_type一栏分别是PRIMARY和DEPENDENT SUBQUERY

     
     

    DEPENDENT SUBQUERY的意思是:子查询,依赖于外层的查询;DEPEND SUBQUERY是依赖于SQL的主体部分,它的执行次数最大可能和SQL主体部分结果的行数一样多(这里因为有limit6,所以看起来主表扫描行数是6,如果去掉这个limit6的话,这个值是1500W)

    上面这句话解释得通俗一点就是外连接先执行查询,然后把查询的结果集放入子查询内进行匹配;外查询每执行一次查询,就要来子查询匹配一次

    join的执行计划中,select_type一栏都是simple

     
     

    join的第一行的就是外表

    从上面的对比可以看出,无论是 exists类型的子查询 还是join,都基本可以看作遵循了第一行就是驱动表的规则(注意不是所有子查询都遵循这个规则,本篇只针对exists类型的dependent subquery)


    分析

    图一PRIMARY对应的表就是图二中的a表,DEPENDENT SUBQUERY表就是图二中的b表;a表有1500W行数据,b表有2W行数据

    所以图一的SQL执行效率如此低下的原因就是大表驱动小表

    优化

    exists改写为join

    复制代码
    1 SELECT a.*
    2 FROM  a join b on a.`sheet_id` = b.`id`
    3 WHERE a.`type` = 'appointment'
    4 AND a.`event` = 14
    5 AND b.`company_id` = 8
    6 AND b.`deleted_at` IS NULL
    7 ORDER BY a.`id` DESC
    8 LIMIT 6;
    复制代码

    由于a表作为内表,因此在a.`sheet_id`,a.`type`,a.`event`上创建联合索引;语句中出现了b表的本地谓词,所以b表的b.`company_id`,b.`deleted_at`上也要创建联合索引

    优化结果,执行时间:117s→0.36s,性能提升了2000倍

    这个语句有一个更极端的取值,在b.`company_id` = 2的时候,小表不会搜出任何满足条件的结果,在这种情况下,原语句执行时间在350s以上,而新语句仅需要0.03s,性能提升万倍


    优化案例

    今天优化的这批语句中,大多数是exists子查询的问题,可以看出这个研发小哥非常的喜欢用exists这种写法;前面的那个exists语句是泛用型,后面的exists语句加了些新花样

    eg.

    复制代码
     1 SELECT SUM(`xxxx`) AS ag
     2 FROM a
     3 WHERE EXISTS (
     4 SELECT * FROM b
     5 WHERE a.`delivery_sheet_id` = b.`id`
     6 AND (`status` = 4
     7 OR `is_rejected` = '1')
     8 AND `company_id` = 8
     9 AND b.`deleted_at` IS NULL
    10 )
    11 AND `status` IN (0, 4)
    12 AND `collection_type` IN (2, 3)
    13 AND a.`deleted_at` IS NULL;
    复制代码

    or的优化通常改写union,但这里是求sum不能这么改,需要改写成2个语句然后求和;对应的列要建好索引

    复制代码
     1 select c.ag+d.ag as ag from
     2 (SELECT SUM(a.`xxxx`) AS ag
     3 FROM  a join b
     4 on a.`delivery_sheet_id` = b.`id`
     5 where
     6 b.`status` = 4
     7 AND    b.`company_id` = 8
     8 AND b.`deleted_at` IS NULL
     9 AND a.`status` IN (0, 4)
    10 AND a.`collection_type` IN (2, 3)
    11 AND a.`deleted_at` IS NULL) c,
    12 (
    13 SELECT SUM(a.`xxxx`) AS ag
    14 FROM a join b
    15 on a.`delivery_sheet_id` = b.`id`
    16 where
    17 b.`is_rejected` = '1'
    18 AND    b.`company_id` = 8
    19 AND b.`deleted_at` IS NULL
    20 AND a.`status` IN (0, 4)
    21 AND a.`collection_type` IN (2, 3)
    22 AND a.`deleted_at` IS NULL) d;
    复制代码

    优化结果,执行时间:18s→0.2s


    in改写join的思路和exists差不多

    这里没有现成的例子,粘贴一篇郑松华老师公众号的分析过来

    原语句

    复制代码
     1 SELECT
     2 
     3 COUNT( * ) AS totalNum,
     4 
     5 sum( CASE WHEN F.ALARM_LEVEL = 1 THEN 1 ELSE 0 END ) AS LEVELS1,
     6 
     7 sum( CASE WHEN F.ALARM_LEVEL = 2 THEN 1 ELSE 0 END ) AS LEVELS2,
     8 
     9 sum( CASE WHEN F.ALARM_LEVEL = 3 THEN 1 ELSE 0 END ) AS LEVELS3,
    10 
    11 sum( CASE WHEN F.DEAL_STATE = 0 THEN 1 ELSE 0 END ) AS DESTS
    12 
    13 FROM
    14 
    15 F
    16 
    17 LEFT JOIN  DC ON DC.ID = F.CONST_ID
    18 
    19 LEFT JOIN  V ON V.ID = F.VEHICLE_ID
    20 
    21 LEFT JOIN  AREA ON AREA.ID = V.SYS_DIVISION_ID
    22 
    23 WHERE
    24 
    25 DC.ID IS NOT NULL
    26 
    27 AND V.ID IS NOT NULL
    28 
    29 AND F.DEAL_STATE = 0
    30 
    31 AND ALARM_LEVEL IN ( 1, 2, 3 )
    32 
    33 AND F.VEHICLE_ID IN (
    34 
    35 SELECT
    36 
    37 VEHICLE_ID
    38 
    39 FROM
    40 
    41 GVLK
    42 
    43 WHERE
    44 
    45 GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
    46 
    47 UNION
    48 
    49 SELECT
    50 
    51 VEHICLE_ID
    52 
    53 FROM
    54 
    55 UVLK
    56 
    57 WHERE
    58 
    59 USER_ID = 'ff8080816091b09c0161f9b825750a9a'
    60 
    61 )
    62 
    63 AND date( F.ALARM_TIME ) BETWEEN '2000-01-01'
    64 
    65 AND '2018-08-14'
    66 
    67 AND AREA.PATH LIKE CONCAT( ( SELECT ARE.PATH FROM  ARE WHERE ARE.ID = '0' ), '%' )
    复制代码

    执行计划如下

     
     

    改写如下(in改join)

    复制代码
     1 explain extended
     2 
     3 SELECT
     4 
     5 COUNT( * ) AS totalNum,
     6 
     7 sum( CASE WHEN F.ALARM_LEVEL = 1 THEN 1 ELSE 0 END ) AS LEVELS1,
     8 
     9 sum( CASE WHEN F.ALARM_LEVEL = 2 THEN 1 ELSE 0 END ) AS LEVELS2,
    10 
    11 sum( CASE WHEN F.ALARM_LEVEL = 3 THEN 1 ELSE 0 END ) AS LEVELS3,
    12 
    13 sum( CASE WHEN F.DEAL_STATE = 0 THEN 1 ELSE 0 END ) AS DESTS
    14 
    15 FROM
    16 
    17 F
    18 
    19 straight_join (
    20 
    21 SELECT
    22 
    23 VEHICLE_ID
    24 
    25 FROM
    26 
    27 GVLK
    28 
    29 WHERE
    30 
    31 GROUP_ID IN ( SELECT GROUP_ID FROM GULK WHERE USER_ID = 'ff8080816091b09c0161f9b825750a9a' )
    32 
    33 UNION
    34 
    35 SELECT
    36 
    37 VEHICLE_ID
    38 
    39 FROM
    40 
    41 UVLK
    42 
    43 WHERE
    44 
    45 USER_ID = 'ff8080816091b09c0161f9b825750a9a'
    46 
    47 ) s on F.VEHICLE_ID = s.VEHICLE_ID
    48 
    49 straight_join DC ON DC.ID = F.CONST_ID
    50 
    51 straight_join V ON V.ID = F.VEHICLE_ID
    52 
    53 straight_join AREA ON AREA.ID = V.SYS_DIVISION_ID
    54 
    55 WHERE
    56 
    57 DC.ID IS NOT NULL
    58 
    59 AND V.ID IS NOT NULL
    60 
    61 AND F.DEAL_STATE = 0
    62 
    63 AND ALARM_LEVEL IN ( 1, 2, 3 )  
    64 
    65 AND date( F.ALARM_TIME ) BETWEEN '2000-01-01'
    66 
    67 AND '2018-08-14'
    68 
    69 AND AREA.PATH LIKE CONCAT( ( SELECT ARE.PATH FROM ARE WHERE ARE.ID = '0' ), '%' )
    复制代码

    觉得本文有用,请转发、点赞或点击“在看”聚焦技术与人文,分享干货,共同成长更多内容请关注“数据与人”



  • 相关阅读:
    MyBatis执行sql的整个流程
    Ftp传输:向linux服务器上传文件时“550 Permission denied.”错误问题解决
    SpringBoot框架:两个方法同时调用时父方法使内部方法的DataSource注解失效的解决办法
    SpringBoot框架:通过AOP和自定义注解完成druid连接池的动态数据源切换(三)
    SpringBoot框架:配置文件application.properties和application.yml的区别
    SpringBoot框架:'url' attribute is not specified and no embedded datasource could be configured问题处理
    bash脚本打印字符串一个空格的内容
    gethostbyname的线程安全
    算法工程师的职业规划
    理解Deep Link & URI Schemes & Universal Link & App Link
  • 原文地址:https://www.cnblogs.com/shujuyr/p/13080917.html
Copyright © 2011-2022 走看看