JOIN is a basic MySQL keyword, generally used in multi-table queries. This post summarizes how it behaves.
1. JOIN syntax
table_references:
    table_reference [, table_reference] ...

table_reference:
    table_factor
  | join_table

table_factor:
    tbl_name [[AS] alias] [{USE|IGNORE|FORCE} INDEX (key_list)]
  | ( table_references )
  | { OJ table_reference LEFT OUTER JOIN table_reference
        ON conditional_expr }

join_table:
    table_reference [INNER | CROSS] JOIN table_factor [join_condition]
  | table_reference STRAIGHT_JOIN table_factor
  | table_reference STRAIGHT_JOIN table_factor ON condition
  | table_reference LEFT [OUTER] JOIN table_reference join_condition
  | table_reference NATURAL [LEFT [OUTER]] JOIN table_factor
  | table_reference RIGHT [OUTER] JOIN table_reference join_condition
  | table_reference NATURAL [RIGHT [OUTER]] JOIN table_factor

join_condition:
    ON conditional_expr
  | USING (column_list)
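As a small illustration of the two join_condition forms in the grammar above, here is a sketch that uses the test tables created in section 2 (test_join_a and test_join_b, which share the column name id). The choice of columns is only an example, not something the grammar requires.

```sql
-- ON accepts any conditional expression; both id columns stay
-- addressable through their table aliases.
SELECT a.id, a.name, b.sex
FROM test_join_a a
INNER JOIN test_join_b b ON a.id = b.id;

-- USING is shorthand for equality on columns that have the same name
-- in both tables; the unqualified id is unambiguous here.
SELECT id, a.name, b.sex
FROM test_join_a a
INNER JOIN test_join_b b USING (id);
```

With the sample data, both statements should pair up the rows whose id exists in both tables.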
2. How JOIN is evaluated
First, set up the test data (MySQL version: mysql Ver 14.12 Distrib 5.0.95, for redhat-linux-gnu (i686) using readline 5.1)
CREATE TABLE `test_join_a` (
  `id` bigint(20) NOT NULL DEFAULT '0',
  `name` varchar(200),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `test_join_b` (
  `id` bigint(20) NOT NULL DEFAULT '0',
  `sex` varchar(200),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `test_join_c` (
  `id` bigint(20) NOT NULL DEFAULT '0',
  `class` varchar(200),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into test_join_a (id) values (1);
insert into test_join_a (id, name) values (2, "abc");
insert into test_join_a (id, name) values (3, "abcd");
insert into test_join_a (id, name) values (4, "abcde");

insert into test_join_b (id) values (1);
insert into test_join_b (id, sex) values (2, "abc");
insert into test_join_b (id, sex) values (3, "abc");

insert into test_join_c (id) values (1);
insert into test_join_c (id, class) values (2, "abc");
insert into test_join_c (id, class) values (3, "abc");
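a. LEFT JOIN
Consider A LEFT JOIN (B, C) ON join_condition, where table A has m rows, table B has n rows, and table C has t rows. [Note: I have not found a standard term that distinguishes the roles of A, B, and C, so here A is called the base table and B and C are called the attached tables.]
* A is the base table. Rows that satisfy ON join_condition are picked from B and C; for a row of A that has no match, the attached-table (B, C) columns of that row are set to NULL.
* The result of this join has at most m * n * t rows and at least m rows.
Likewise, for (A, B) LEFT JOIN C ON join_condition:
* A and B are the base tables. Rows that satisfy ON join_condition are picked from C; for a base row that has no match, the attached-table (C) columns of that row are set to NULL.
* The result of this join has at most m * n * t rows and at least m * n rows.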
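The row-count bounds above can be checked directly with COUNT(*). This is a minimal sketch against the test tables from this section (m = 4, n = 3, t = 3); the always-true and always-false ON conditions are chosen here only to force the two extremes.

```sql
-- Upper bound for A LEFT JOIN (B, C): every (b, c) pair matches every
-- row of a, so the count is m * n * t = 4 * 3 * 3 = 36.
SELECT COUNT(*)
FROM test_join_a a
LEFT JOIN (test_join_b b, test_join_c c) ON 1 = 1;

-- Lower bound for A LEFT JOIN (B, C): nothing matches, each row of a is
-- kept once with NULLs for b and c, so the count is m = 4.
SELECT COUNT(*)
FROM test_join_a a
LEFT JOIN (test_join_b b, test_join_c c) ON 1 = 0;

-- Lower bound for (A, B) LEFT JOIN C: each row of the a x b product is
-- kept once, so the count is m * n = 4 * 3 = 12.
SELECT COUNT(*)
FROM (test_join_a a, test_join_b b)
LEFT JOIN test_join_c c ON 1 = 0;
```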
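In a LEFT JOIN or RIGHT JOIN, the ON clause can be followed by a WHERE clause that also filters rows. The differences are:
1. The WHERE clause can be omitted, but the ON clause cannot. If no real join condition is needed, write ON 1=1 or ON 1.
2. ON join_condition is evaluated on top of the Cartesian product of the base tables. When the attached tables have no row that satisfies ON join_condition, the attached-table columns of that row are set to NULL.
3. The WHERE clause filters the virtual table produced by ON join_condition, so after WHERE the result can have as few as 0 rows.
Note: NULL=NULL evaluates to NULL rather than true, so it satisfies neither the ON condition of a join nor the filter of a WHERE clause.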
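The note about NULL can be seen without any tables. The following sketch compares NULL with itself in three ways; the column aliases are made up for this example.

```sql
SELECT NULL = NULL   AS eq,         -- NULL: unknown, so not treated as true
       NULL <=> NULL AS null_safe,  -- 1: MySQL's NULL-safe equality operator
       NULL IS NULL  AS is_null;    -- 1
```

Because ON and WHERE keep a row only when the condition is true, a comparison that yields NULL behaves like a non-match in both places.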
The experiments below use table a with 4 rows, table b with 3 rows, and table c with 3 rows:
1. The result has 4 * 3 * 3 = 36 rows:
mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on 1=1;
+-----+-------+------+------+------+-------+
| aid | name  | bid  | sex  | cid  | class |
+-----+-------+------+------+------+-------+
|   1 | NULL  |    1 | NULL |    1 | NULL  |
|   1 | NULL  |    1 | NULL |    2 | abc   |
|   1 | NULL  |    1 | NULL |    3 | abc   |
|   1 | NULL  |    2 | abc  |    1 | NULL  |
|   1 | NULL  |    2 | abc  |    2 | abc   |
|   1 | NULL  |    2 | abc  |    3 | abc   |
|   1 | NULL  |    3 | abc  |    1 | NULL  |
|   1 | NULL  |    3 | abc  |    2 | abc   |
|   1 | NULL  |    3 | abc  |    3 | abc   |
|   2 | abc   |    1 | NULL |    1 | NULL  |
|   2 | abc   |    1 | NULL |    2 | abc   |
|   2 | abc   |    1 | NULL |    3 | abc   |
|   2 | abc   |    2 | abc  |    1 | NULL  |
|   2 | abc   |    2 | abc  |    2 | abc   |
|   2 | abc   |    2 | abc  |    3 | abc   |
|   2 | abc   |    3 | abc  |    1 | NULL  |
|   2 | abc   |    3 | abc  |    2 | abc   |
|   2 | abc   |    3 | abc  |    3 | abc   |
|   3 | abcd  |    1 | NULL |    1 | NULL  |
|   3 | abcd  |    1 | NULL |    2 | abc   |
|   3 | abcd  |    1 | NULL |    3 | abc   |
|   3 | abcd  |    2 | abc  |    1 | NULL  |
|   3 | abcd  |    2 | abc  |    2 | abc   |
|   3 | abcd  |    2 | abc  |    3 | abc   |
|   3 | abcd  |    3 | abc  |    1 | NULL  |
|   3 | abcd  |    3 | abc  |    2 | abc   |
|   3 | abcd  |    3 | abc  |    3 | abc   |
|   4 | abcde |    1 | NULL |    1 | NULL  |
|   4 | abcde |    1 | NULL |    2 | abc   |
|   4 | abcde |    1 | NULL |    3 | abc   |
|   4 | abcde |    2 | abc  |    1 | NULL  |
|   4 | abcde |    2 | abc  |    2 | abc   |
|   4 | abcde |    2 | abc  |    3 | abc   |
|   4 | abcde |    3 | abc  |    1 | NULL  |
|   4 | abcde |    3 | abc  |    2 | abc   |
|   4 | abcde |    3 | abc  |    3 | abc   |
+-----+-------+------+------+------+-------+
36 rows in set (0.00 sec)
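2. Only the rows that satisfy the condition are picked; where nothing matches, the attached columns are set to NULL. See aid = 1, 3, and 4 in the result: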
mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on a.name=b.sex;
+-----+-------+------+------+------+-------+
| aid | name  | bid  | sex  | cid  | class |
+-----+-------+------+------+------+-------+
|   1 | NULL  | NULL | NULL | NULL | NULL  |
|   2 | abc   |    2 | abc  |    1 | NULL  |
|   2 | abc   |    2 | abc  |    2 | abc   |
|   2 | abc   |    2 | abc  |    3 | abc   |
|   2 | abc   |    3 | abc  |    1 | NULL  |
|   2 | abc   |    3 | abc  |    2 | abc   |
|   2 | abc   |    3 | abc  |    3 | abc   |
|   3 | abcd  | NULL | NULL | NULL | NULL  |
|   4 | abcde | NULL | NULL | NULL | NULL  |
+-----+-------+------+------+------+-------+
9 rows in set (0.00 sec)
3. For (A, B) LEFT JOIN C ON join_condition, the Cartesian product of A and B is the base, and C is searched for rows that satisfy ON join_condition. Where there are none, the C columns of that row are set to NULL.
mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a, test_join_b b) left join test_join_c c on b.id>5;
+-----+-------+-----+------+------+-------+
| aid | name  | bid | sex  | cid  | class |
+-----+-------+-----+------+------+-------+
|   1 | NULL  |   1 | NULL | NULL | NULL  |
|   1 | NULL  |   2 | abc  | NULL | NULL  |
|   1 | NULL  |   3 | abc  | NULL | NULL  |
|   2 | abc   |   1 | NULL | NULL | NULL  |
|   2 | abc   |   2 | abc  | NULL | NULL  |
|   2 | abc   |   3 | abc  | NULL | NULL  |
|   3 | abcd  |   1 | NULL | NULL | NULL  |
|   3 | abcd  |   2 | abc  | NULL | NULL  |
|   3 | abcd  |   3 | abc  | NULL | NULL  |
|   4 | abcde |   1 | NULL | NULL | NULL  |
|   4 | abcde |   2 | abc  | NULL | NULL  |
|   4 | abcde |   3 | abc  | NULL | NULL  |
+-----+-------+-----+------+------+-------+
12 rows in set (0.00 sec)
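4. Logically, the ON condition is applied before WHERE, and the ON condition produces a virtual table that WHERE then filters. The EXPLAIN output below is for a query that uses both. Note that it shows the physical plan rather than the logical order: since WHERE b.id>5 discards every row in which b is NULL, the optimizer can treat the query as an inner join and read b first.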
mysql> explain select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on b.id>5 where b.id>5;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra             |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------+
|  1 | SIMPLE      | b     | range | PRIMARY       | PRIMARY | 8       | NULL |    1 | Using where       |
|  1 | SIMPLE      | c     | ALL   | NULL          | NULL    | NULL    | NULL |    3 | Using join buffer |
|  1 | SIMPLE      | a     | ALL   | NULL          | NULL    | NULL    | NULL |    4 | Using join buffer |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------+
3 rows in set (0.00 sec)
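5. WHERE filtering of the virtual table is not the same as the ON condition. The ON condition keeps every row of the base table and sets the attached-table columns to NULL where no attached row matches; WHERE simply keeps the rows that satisfy the WHERE clause. (The result of the third query below shows that WHERE operates on the virtual table produced by the ON condition.)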
mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on b.id>5;
+-----+-------+------+------+------+-------+
| aid | name  | bid  | sex  | cid  | class |
+-----+-------+------+------+------+-------+
|   1 | NULL  | NULL | NULL | NULL | NULL  |
|   2 | abc   | NULL | NULL | NULL | NULL  |
|   3 | abcd  | NULL | NULL | NULL | NULL  |
|   4 | abcde | NULL | NULL | NULL | NULL  |
+-----+-------+------+------+------+-------+
4 rows in set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) on b.id>5 where b.id>5;
Empty set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) ON a.name=b.sex where b.sex is NULL;
+-----+-------+------+------+------+-------+
| aid | name  | bid  | sex  | cid  | class |
+-----+-------+------+------+------+-------+
|   1 | NULL  | NULL | NULL | NULL | NULL  |
|   3 | abcd  | NULL | NULL | NULL | NULL  |
|   4 | abcde | NULL | NULL | NULL | NULL  |
+-----+-------+------+------+------+-------+
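The third query above is the usual way to find base rows that have no partner in the attached table. A minimal sketch of the same idea on a simpler condition follows; joining on id and testing b.id are choices made for this example. Testing the attached table's primary key is a common habit, because a NOT NULL column can only be NULL in the result when the row was filled in by the LEFT JOIN.

```sql
-- Rows of test_join_a that have no row with the same id in test_join_b.
SELECT a.id, a.name
FROM test_join_a a
LEFT JOIN test_join_b b ON a.id = b.id
WHERE b.id IS NULL;
```

With the sample data, test_join_b holds ids 1 to 3, so only the row with a.id = 4 should come back.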
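6. NULL=NULL satisfies neither the ON condition of a join nor the filter of a WHERE clause. In the first query below, the row with aid = 1 has a NULL name and still matches nothing, even though test_join_b contains a row whose sex is NULL; in the second query, WHERE a.name=b.sex removes every row in which either side is NULL.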
mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) ON a.name=b.sex;
+-----+-------+------+------+------+-------+
| aid | name  | bid  | sex  | cid  | class |
+-----+-------+------+------+------+-------+
|   1 | NULL  | NULL | NULL | NULL | NULL  |
|   2 | abc   |    2 | abc  |    1 | NULL  |
|   2 | abc   |    2 | abc  |    2 | abc   |
|   2 | abc   |    2 | abc  |    3 | abc   |
|   2 | abc   |    3 | abc  |    1 | NULL  |
|   2 | abc   |    3 | abc  |    2 | abc   |
|   2 | abc   |    3 | abc  |    3 | abc   |
|   3 | abcd  | NULL | NULL | NULL | NULL  |
|   4 | abcde | NULL | NULL | NULL | NULL  |
+-----+-------+------+------+------+-------+
9 rows in set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from test_join_a a left join (test_join_b b, test_join_c c) ON a.name=b.sex where a.name=b.sex;
+-----+------+------+------+------+-------+
| aid | name | bid  | sex  | cid  | class |
+-----+------+------+------+------+-------+
|   2 | abc  |    2 | abc  |    1 | NULL  |
|   2 | abc  |    3 | abc  |    1 | NULL  |
|   2 | abc  |    2 | abc  |    2 | abc   |
|   2 | abc  |    3 | abc  |    2 | abc   |
|   2 | abc  |    2 | abc  |    3 | abc   |
|   2 | abc  |    3 | abc  |    3 | abc   |
+-----+------+------+------+------+-------+
6 rows in set (0.00 sec)
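If NULL values are supposed to match each other in a join, MySQL's NULL-safe equality operator <=> can be used in the ON condition in place of =. The following is a sketch of that variation on the first query above; it is not part of the original experiments.

```sql
-- Same join as above, but with <=> so that a NULL name matches a NULL sex.
SELECT a.id AS aid, a.name, b.id AS bid, b.sex, c.id AS cid, c.class
FROM test_join_a a
LEFT JOIN (test_join_b b, test_join_c c) ON a.name <=> b.sex;
```

With the sample data, the row with aid = 1 should now pair with the test_join_b row whose sex is NULL (bid = 1) instead of being padded with NULLs, while aid = 3 and aid = 4 still have no match.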
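b. RIGHT JOIN
RIGHT JOIN is similar to LEFT JOIN. The differences are:
1. For A RIGHT JOIN (B, C) ON join_condition, B and C are the base tables and A is the attached table.
2. For (B, C) RIGHT JOIN A ON join_condition, A is the base table and B and C are the attached tables.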
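Put differently, a RIGHT JOIN is a LEFT JOIN with the operands swapped. The sketch below shows two statements that should return the same set of rows on the test tables; the join on id is chosen only for this example, and the row order is not guaranteed without ORDER BY.

```sql
-- test_join_a is the base table in both statements.
SELECT a.id AS aid, a.name, b.id AS bid, b.sex
FROM test_join_b b
RIGHT JOIN test_join_a a ON a.id = b.id;

SELECT a.id AS aid, a.name, b.id AS bid, b.sex
FROM test_join_a a
LEFT JOIN test_join_b b ON a.id = b.id;
```

In both results the row with aid = 4 is kept with bid and sex set to NULL, because test_join_b has no row with id 4.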
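c. JOIN
JOIN differs from LEFT JOIN and RIGHT JOIN in that it does not distinguish a base side from an attached side. Its ON condition can be thought of as degenerating into a WHERE clause.
For example, the following SQL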
select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON a.name=b.sex;
is equivalent in its result to
select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON 1=1 where a.name=b.sex;
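Likewise, A JOIN (B, C) is equivalent to (A, B) JOIN C.
The evaluation of JOIN can be understood as follows: take the Cartesian product of A, B, and C to form a new virtual table, turn the ON condition into a WHERE condition, combine it with any WHERE clause that follows, and filter to get the result.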
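The grouping equivalence and the "ON becomes WHERE" reading can be checked by counting rows. This is a sketch on the test tables; the third statement, a plain comma join with WHERE, is added here for comparison.

```sql
-- A JOIN (B, C)
SELECT COUNT(*)
FROM test_join_a a
JOIN (test_join_b b, test_join_c c) ON a.name = b.sex;

-- (A, B) JOIN C
SELECT COUNT(*)
FROM (test_join_a a, test_join_b b)
JOIN test_join_c c ON a.name = b.sex;

-- Cartesian product of A, B, C filtered by WHERE
SELECT COUNT(*)
FROM test_join_a a, test_join_b b, test_join_c c
WHERE a.name = b.sex;
```

With the sample data each statement should return 6: only the row of a with name 'abc' matches, it pairs with 2 rows of b, and each pair is combined with all 3 rows of c.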
Here are the experiments:
mysql> explain select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON a.name=b.sex;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | c     | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using join buffer              |
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
3 rows in set (0.00 sec)

mysql> explain select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON 1=1 where a.name=b.sex;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | b     | ALL  | NULL          | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | c     | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using join buffer              |
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
3 rows in set (0.00 sec)
The EXPLAIN output of the two statements is identical: they have the same execution plan.
Now the results of the two statements:
mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON a.name=b.sex;
+-----+------+-----+------+-----+-------+
| aid | name | bid | sex  | cid | class |
+-----+------+-----+------+-----+-------+
|   2 | abc  |   2 | abc  |   1 | NULL  |
|   2 | abc  |   3 | abc  |   1 | NULL  |
|   2 | abc  |   2 | abc  |   2 | abc   |
|   2 | abc  |   3 | abc  |   2 | abc   |
|   2 | abc  |   2 | abc  |   3 | abc   |
|   2 | abc  |   3 | abc  |   3 | abc   |
+-----+------+-----+------+-----+-------+
6 rows in set (0.00 sec)

mysql> select a.id as aid, a.name, b.id as bid, b.sex, c.id as cid, c.class from (test_join_a a ,test_join_b b) join test_join_c c ON 1=1 where a.name=b.sex;
+-----+------+-----+------+-----+-------+
| aid | name | bid | sex  | cid | class |
+-----+------+-----+------+-----+-------+
|   2 | abc  |   2 | abc  |   1 | NULL  |
|   2 | abc  |   3 | abc  |   1 | NULL  |
|   2 | abc  |   2 | abc  |   2 | abc   |
|   2 | abc  |   3 | abc  |   2 | abc   |
|   2 | abc  |   2 | abc  |   3 | abc   |
|   2 | abc  |   3 | abc  |   3 | abc   |
+-----+------+-----+------+-----+-------+
6 rows in set (0.00 sec)