问题重现
比如我有两个表
mysql> select * from user; +----+------+-----+ | id | name | age | +----+------+-----+ | 1 | A | 9 | | 2 | B | 11 | | 3 | C | 15 | | 4 | D | 13 | +----+------+-----+ 4 rows in set (0.03 sec) mysql> select * from blacklist; +----+------+------+ | id | name | type | +----+------+------+ | 1 | B | 1 | | 2 | C | 0 | | 3 | D | 1 | +----+------+------+ 3 rows in set (0.03 sec)
我要从里面选取黑名单并且年龄大于10的。可是我手误,把age条件加在了in查询里,可是黑名单没有age字段。
select * from user where name in (select name from blacklist where type = 1 and age > 10);
单独执行子查询是有很明显的问题的:
mysql> select name from blacklist where type = 1 and age > 10; 1054 - Unknown column 'age' in 'where clause'
可是我如果执行整个查询:
mysql> select * from user where name in (select name from blacklist where type = 1 and age > 10); +----+------+-----+ | id | name | age | +----+------+-----+ | 2 | B | 11 | | 4 | D | 13 | +----+------+-----+ 2 rows in set (0.02 sec)
没问题!并且还是正确的结果!!
为什么?
我们explain一下
mysql> explain select * from user where name in (select name from blacklist where type = 1 and age > 10); +----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------------+ | 1 | SIMPLE | blacklist | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; Start temporary | | 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where; End temporary; Using join buffer (Block Nested Loop) | +----+-------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------------+ 2 rows in set (0.02 sec)
没看出什么端倪,继续执行
mysql> show warnings; +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1276 | Field or reference 'mytest.user.age' of SELECT #2 was resolved in SELECT #1 | | Note | 1003 | /* select#1 */ select `mytest`.`user`.`id` AS `id`,`mytest`.`user`.`name` AS `name`,`mytest`.`user`.`age` AS `age` from `mytest`.`user` semi join (`mytest`.`blacklist`) where ((`mytest`.`user`.`name` = `mytest`.`blacklist`.`name`) and (`mytest`.`blacklist`.`type` = 1) and (`mytest`.`user`.`age` > 10)) | +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 2 rows in set (0.03 sec)
这里能看见数据库优化之后的语句
/* select#1 */ SELECT `mytest`.`user`.`id` AS `id`, `mytest`.`user`.`name` AS `name`, `mytest`.`user`.`age` AS `age` FROM `mytest`.`user` semi JOIN ( `mytest`.`blacklist` ) WHERE ( (`mytest`.`user`.`name` = `mytest`.`blacklist`.`name` ) AND ( `mytest`.`blacklist`.`type` = 1 ) AND ( `mytest`.`user`.`age` > 10 ) )
可以看出,数据库把原来的查询改成了join查询,每个字段都指向了正确的表。
解决
当然是把语句改正呗,这属于手误造成的。
mysql> select * from user where name in (select name from blacklist where type = 1) and age > 10; +----+------+-----+ | id | name | age | +----+------+-----+ | 2 | B | 11 | | 4 | D | 13 | +----+------+-----+ 2 rows in set (0.03 sec)
但是其次,我们看到in里面如果有语法错误,是没有暴露的。我们把in改成exists试一下:
mysql> select * from user where EXISTS (select name from blacklist where type = 1 and age > 10); +----+------+-----+ | id | name | age | +----+------+-----+ | 2 | B | 11 | | 3 | C | 15 | | 4 | D | 13 | +----+------+-----+ 3 rows in set (0.05 sec)
结果不对,并且EXISTS返回的是一个布尔值,只要其中的子查询返回行数,where条件即成立。所以,select name 换成select 1也是成立的。
mysql> select * from user where EXISTS (select 1 from blacklist where type = 1 and age > 10); +----+------+-----+ | id | name | age | +----+------+-----+ | 2 | B | 11 | | 3 | C | 15 | | 4 | D | 13 | +----+------+-----+ 3 rows in set (0.04 sec)
如果使EXISTS不成立,则
mysql> select * from user where EXISTS (select name from blacklist where type = 11 and age > 10); Empty set
explain一下
mysql> explain select * from user where EXISTS (select name from blacklist where type = 1 and age > 10); +----+--------------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | PRIMARY | user | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | blacklist | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where | +----+--------------------+-----------+------------+------+---------------+------+---------+------+------+----------+-------------+ 2 rows in set (0.06 sec)
查看warnings
mysql> show warnings; +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1276 | Field or reference 'mytest.user.age' of SELECT #2 was resolved in SELECT #1 | | Note | 1003 | /* select#1 */ select `mytest`.`user`.`id` AS `id`,`mytest`.`user`.`name` AS `name`,`mytest`.`user`.`age` AS `age` from `mytest`.`user` where exists(/* select#2 */ select `mytest`.`blacklist`.`name` from `mytest`.`blacklist` where ((`mytest`.`blacklist`.`type` = 1) and (`mytest`.`user`.`age` > 10))) | +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 2 rows in set (0.06 sec)
优化之后的语句
/* select#1 */ SELECT `mytest`.`user`.`id` AS `id`, `mytest`.`user`.`name` AS `name`, `mytest`.`user`.`age` AS `age` FROM `mytest`.`user` WHERE EXISTS ( /* select#2 */ SELECT `mytest`.`blacklist`.`name` FROM `mytest`.`blacklist` WHERE (( `mytest`.`blacklist`.`type` = 1 ) AND ( `mytest`.`user`.`age` > 10 )))
可以看到,使用了exists,同样优化掉了低级错误。如果想要得到我们预期的数据,需要在exists子查询添加一个条件
mysql> select * from user a where EXISTS (select 1 from blacklist where type = 1 and name = a.name and age > 10); +----+------+-----+ | id | name | age | +----+------+-----+ | 2 | B | 11 | | 4 | D | 13 | +----+------+-----+ 2 rows in set (0.06 sec)
exists可以看做一个循环
List<User> result = new ArrayList<>(); List<User> user = new ArrayList<>(); for (int i = 0; i < user.size(); i++){ if (exists(black.type == 1 && black.name.equals(user.get(i).name)) && user.get(i).age > 10){ result.add(user.get(i)); } }
不是很严谨,但是大概是这么个意思
IN和Exists的区别
Exists:先执行外部查询语句,然后在执行子查询,子查询中它每次都会去执行数据库的查询,执行次数等于外查询的数据数量。
In:先查询 in()子查询的数据(1次),并且将数据放进内存里(不需要多次查询),然后外部查询的表再根据查询的结果进行查询过滤,最后返回结果。
In 是把外表和内表作hash 连接,而exists是对外表作loop循环,每次loop循环再对内表进行查询。
参考:https://blog.csdn.net/qq_27409289/article/details/85963089