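A developer handed me a SQL statement with the following shape: delete from B where ID in (select NID from H where guid='xxx');
The inner query returns only a single row, yet the whole delete took nearly one minute. If the result value is written directly inside the parentheses, or if in is changed to =, the statement finishes in milliseconds.
However, if the inner query can return more than one row, the first approach means changing the application code. I then tried rewriting the statement as a join, and that was also extremely fast.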
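The three alternatives mentioned above could look roughly like this. This is only a sketch against the B and H tables from the original statement; the literal 123 is a made-up placeholder for whatever NID the inner query returns.

```sql
-- 1. Inline the value returned by the inner query
--    (123 is a hypothetical placeholder, not a value from the real data)
delete from B where ID in (123);

-- 2. Use = instead of in; this only works when the subquery
--    returns at most one row, otherwise MySQL raises an error
delete from B where ID = (select NID from H where guid='xxx');

-- 3. Multi-table delete written as a join; works for any number of rows
delete B from B inner join H on B.ID = H.NID where H.guid='xxx';
```

Only the third form stays correct without touching the application when the inner query returns several rows.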
Test tables: t1.id has an index, t2.id does not.
mysql> select * from t1;
+------+------+----------+
| id   | name | class_id |
+------+------+----------+
|    1 | aa   |     NULL |
|    2 | aa   |     NULL |
|    3 | dd   |     NULL |
|    6 | cc   |     NULL |
+------+------+----------+
4 rows in set (0.00 sec)

mysql> select * from t2;
+------+---------+
| id   | name    |
+------+---------+
|    2 | myname2 |
|    6 | myname5 |
+------+---------+
2 rows in set (0.01 sec)
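The post does not show the table definitions, so the following is a minimal sketch for recreating the test tables. The column types are assumptions (nullable INT ids, VARCHAR names); only the column names, the rows, and the index name idx_id are taken from the output in this post.

```sql
-- Assumed definitions: column types are guesses, not from the original post
create table t1 (
  id       int,
  name     varchar(20),
  class_id int,
  key idx_id (id)          -- index name as it appears in the execution plan
);

create table t2 (
  id   int,                -- deliberately left without an index
  name varchar(20)
);

-- Rows as shown in the select output
insert into t1 (id, name, class_id) values
  (1, 'aa', null), (2, 'aa', null), (3, 'dd', null), (6, 'cc', null);
insert into t2 (id, name) values
  (2, 'myname2'), (6, 'myname5');
```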
Execution plans for the subquery version and for the join rewrite:
mysql> explain delete from t1 where id in (select id from t2 where name='aa');
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | DELETE             | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    2 |    50.00 | Using where |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
2 rows in set (0.00 sec)

mysql> explain delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key    | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | t2    | NULL       | ALL  | NULL          | NULL   | NULL    | NULL  |    2 |    50.00 | Using where |
|  1 | DELETE      | t1    | NULL       | ref  | idx_id        | idx_id | 5       | const |    1 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
2 rows in set (0.01 sec)
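The subquery plan shows that t1 is scanned in full first (type ALL). For each t1 row, the server then runs the dependent subquery, effectively select id from t2 where name='aa' and t1.id=t2.id, and deletes that t1 row only if the subquery returns a value.
With the join rewrite, the optimizer is smart enough to pick the smaller table as the driving table and then deletes from t1 through the index idx_id. For the subquery form, the official documentation describes the evaluation order as outside-in.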
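To make the difference easier to see, the two strategies can be written as plain selects that return the rows each delete would remove. This is an illustrative sketch, not the optimizer's literal internal rewrite; it uses the t1 and t2 test tables from this post.

```sql
-- Roughly how the IN subquery is evaluated: the correlated check
-- runs once for every row produced by the full scan of t1
select t1.*
from t1
where exists (select 1 from t2 where t2.name = 'aa' and t2.id = t1.id);

-- Join form: t2 is filtered first, then each surviving t2.id
-- is looked up in t1 through the index on t1.id
select t1.*
from t1 inner join t2 on t1.id = t2.id
where t2.name = 'aa';
```

Running explain on each select is a cheap way to check the access path before executing the corresponding delete.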
To see how the two statements execute more directly, enable session-level profiling:
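A minimal sketch of the profiling session, assuming a MySQL version where the profiling variable and show profile are still available (both are deprecated in recent releases, which is why the output carries a warning). The Query_ID values depend on what has already been run in the session.

```sql
-- Turn on profiling for the current session only
set profiling = 1;

-- Run the two statements to compare
delete from t1 where id in (select id from t2 where name='aa');
delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa';

-- List the profiled statements with their Query_ID and total duration
show profiles;

-- Per-stage breakdown; substitute the Query_ID reported by show profiles
show profile for query 3;
```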
mysql> show profiles;
+----------+------------+------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                        |
+----------+------------+------------------------------------------------------------------------------+
|        3 | 0.00137075 | delete from t1 where id in (select id from t2 where name='aa')               |
|        4 | 0.00211725 | explain delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa' |
|        5 | 0.00132050 | delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa'         |
+----------+------------+------------------------------------------------------------------------------+

mysql> show profile for query 3
    -> ;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000388 |
| checking permissions | 0.000026 |
| checking permissions | 0.000008 |
| Opening tables       | 0.000105 |
| init                 | 0.000152 |
| System lock          | 0.000083 |
| updating             | 0.000084 |
| optimizing           | 0.000031 |
| statistics           | 0.000083 |
| preparing            | 0.000052 |
| executing            | 0.000013 |
| Sending data         | 0.000114 |
| executing            | 0.000009 |
| Sending data         | 0.000017 |
| executing            | 0.000005 |
| Sending data         | 0.000019 |
| executing            | 0.000006 |
| Sending data         | 0.000018 |
| end                  | 0.000019 |
| query end            | 0.000020 |
| closing tables       | 0.000021 |
| freeing items        | 0.000054 |
| cleaning up          | 0.000046 |
+----------------------+----------+
23 rows in set, 1 warning (0.01 sec)

mysql> show profile for query 5
    -> ;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000360 |
| checking permissions           | 0.000013 |
| checking permissions           | 0.000007 |
| checking permissions           | 0.000004 |
| init                           | 0.000005 |
| Opening tables                 | 0.000048 |
| init                           | 0.000048 |
| deleting from main table       | 0.000022 |
| System lock                    | 0.000028 |
| optimizing                     | 0.000043 |
| statistics                     | 0.000144 |
| preparing                      | 0.000144 |
| executing                      | 0.000009 |
| Sending data                   | 0.000246 |
| deleting from reference tables | 0.000073 |
| end                            | 0.000012 |
| end                            | 0.000010 |
| query end                      | 0.000016 |
| closing tables                 | 0.000015 |
| freeing items                  | 0.000037 |
| cleaning up                    | 0.000039 |
+--------------------------------+----------+
21 rows in set, 1 warning (0.00 sec)
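The first thing I looked at was how many times each statement entered the Sending data state. The subquery version shows Sending data 4 times, while the join version shows it once. The subquery version first scans the outer table in full, which yields 4 rows, then loops over those rows and correlates each one with the inner query. The inner query therefore ran 4 times, and each time the server checked whether it returned a value and deleted the row only if it did.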
Just a small note for the record; I am still slowly crawling along the road of exploring the optimizer.