oracle 的自连接(self join)操作
对于oracle数据库来说,目前支持两套语法,一套是oracle自己的sql语法,一套是通行标准的SQL99语法,那么对于oracle的连接操作
来说,也完全可以使用这样的两套语法来分别的实现。当然从效率上来说,两者是没有差别的。只不过从我的角度来讲,oracle的语法更加简洁而已。
比如说我们有一张表emp,表里数据如下
scott@DB01> conn scott/tiger
Connected.
scott@DB01>
set linesize 120
scott@DB01> set pagesize 100
scott@DB01> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- ------------ ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
14 rows selected.
通过观察发现,在表中数据库有从属关系,比如7369 smith的经理是7902,7902也是一个员工,名称是ford,他的经理是7566 jones,以此类推,我们最终可以找到公司的boss 7839 king。
现在我的需求是,找到公司里的上下级对应关系,我们分别用oracle的语法和sql99的语法来实现,我们可以得到同样的结果
oracle语法:我们注意到from后面 表emp出现两次,并且使用了不同的别名,也就是表自己和自己来关联,所以我们称这样的连接为自连接操作。当然了,以下的例子中使用了外连接操作,我以后会单独的说明。
scott@DB01> select w.ename||' report for ' ||m.ename "Relations"
2 from emp w,emp m
3 where w.empno=m.empno(+);
Relations
--------------------------------
SMITH report for SMITH
ALLEN report for ALLEN
WARD report for WARD
JONES report for JONES
MARTIN report for MARTIN
BLAKE report for BLAKE
CLARK report for CLARK
SCOTT report for SCOTT
KING report for KING
TURNER report for TURNER
ADAMS report for ADAMS
JAMES report for JAMES
FORD report for FORD
MILLER report for MILLER
14 rows selected.
sql99语法:join on 语法的实现
scott@DB01> select w.ename||' report for ' ||m.ename "Relations"
2 from emp w left outer join emp m
3 on (w.mgr=m.empno);
Relations
--------------------------------
SMITH report for FORD
ALLEN report for BLAKE
WARD report for BLAKE
JONES report for KING
MARTIN report for BLAKE
BLAKE report for KING
CLARK report for KING
SCOTT report for JONES
KING report for
TURNER report for BLAKE
ADAMS report for SCOTT
JAMES report for BLAKE
FORD report for JONES
MILLER report for CLARK
14 rows selected.
其实,oracle对于处理表中有从属关系的记录,提供了一种查询方法,我们称之为层次查询,来看一个例子
scott@DB01> select ename ||' report for '|| prior ename
2 from emp
3 start with empno=7839 ---规定遍历从属关系树的起点
4 connect by prior empno=mgr; ----规定了遍历的方向
ENAME||'REPORTFOR'||PRIORENAME
--------------------------------
KING report for
JONES report for KING
SCOTT report for JONES
ADAMS report for SCOTT
FORD report for JONES
SMITH report for FORD
BLAKE report for KING
ALLEN report for BLAKE
WARD report for BLAKE
MARTIN report for BLAKE
TURNER report for BLAKE
JAMES report for BLAKE
CLARK report for KING
MILLER report for CLARK
14 rows selected.
为了使查询的结果更具有可读性,我们借助一个函数SYS_CONNECT_BY_PATH来做一下处理,来看结果:
scott@DB01> SELECT LPAD(' ', 2*level-1)||SYS_CONNECT_BY_PATH(ename, '/') "Path"
2 FROM emp
3 START WITH empno = 7839
4 CONNECT BY PRIOR empno=mgr;
Path
-------------------------------------------------------------------------------
/KING
/KING/JONES
/KING/JONES/SCOTT
/KING/JONES/SCOTT/ADAMS
/KING/JONES/FORD
/KING/JONES/FORD/SMITH
/KING/BLAKE
/KING/BLAKE/ALLEN
/KING/BLAKE/WARD
/KING/BLAKE/MARTIN
/KING/BLAKE/TURNER
/KING/BLAKE/JAMES
/KING/CLARK
/KING/CLARK/MILLER
14 rows selected.
通过上面的结果,你是否发现,从属关系更加清晰,也更加直观呢。
当然了对于我们来讲,好看的格式化的输出只是锦上添花,更重要的是语句的执行的速度,也就是性能的考虑
借助于autotrace工具,我们来看语句的执行计划
scott@DB01> set autotrace trace exp
scott@DB01>
scott@DB01> SELECT LPAD(' ', 2*level-1)||SYS_CONNECT_BY_PATH(ename, '/') "Path"
2 FROM emp
3 START WITH empno = 7839
4 CONNECT BY PRIOR empno=mgr;
Execution Plan
----------------------------------------------------------
Plan hash value: 3613731379
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 196 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| EMP | | | | |
|* 3 | INDEX UNIQUE SCAN | PK_EMP | 1 | 4 | 0 (0)| 00:00:01 |
|* 4 | HASH JOIN | | | | | |
| 5 | CONNECT BY PUMP | | | | | |
| 6 | TABLE ACCESS FULL | EMP | 14 | 196 | 3 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | EMP | 14 | 196 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("EMPNO"=7839)
3 - access("EMPNO"=7839)
4 - access("MGR"=NULL)
scott@DB01> select w.ename||' report for ' ||m.ename "Relations"
2 from emp w,emp m
3 where w.empno=m.empno(+);
Execution Plan
----------------------------------------------------------
Plan hash value: 2199491010
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 280 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 14 | 280 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | EMP | 14 | 140 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 10 | 1 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("W"."EMPNO"="M"."EMPNO"(+))
通过上面的例子我们可以发现,层次查询的速度也要快于自连接操作,当然了,我们在这里的讨论,并没有考虑两种语句对其他资源的影响。
scott@DB01> set pagesize 100
scott@DB01> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- ------------ ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
14 rows selected.
通过观察发现,在表中数据库有从属关系,比如7369 smith的经理是7902,7902也是一个员工,名称是ford,他的经理是7566 jones,以此类推,我们最终可以找到公司的boss 7839 king。
现在我的需求是,找到公司里的上下级对应关系,我们分别用oracle的语法和sql99的语法来实现,我们可以得到同样的结果
oracle语法:我们注意到from后面 表emp出现两次,并且使用了不同的别名,也就是表自己和自己来关联,所以我们称这样的连接为自连接操作。当然了,以下的例子中使用了外连接操作,我以后会单独的说明。
scott@DB01> select w.ename||' report for ' ||m.ename "Relations"
2 from emp w,emp m
3 where w.empno=m.empno(+);
Relations
--------------------------------
SMITH report for SMITH
ALLEN report for ALLEN
WARD report for WARD
JONES report for JONES
MARTIN report for MARTIN
BLAKE report for BLAKE
CLARK report for CLARK
SCOTT report for SCOTT
KING report for KING
TURNER report for TURNER
ADAMS report for ADAMS
JAMES report for JAMES
FORD report for FORD
MILLER report for MILLER
14 rows selected.
sql99语法:join on 语法的实现
scott@DB01> select w.ename||' report for ' ||m.ename "Relations"
2 from emp w left outer join emp m
3 on (w.mgr=m.empno);
Relations
--------------------------------
SMITH report for FORD
ALLEN report for BLAKE
WARD report for BLAKE
JONES report for KING
MARTIN report for BLAKE
BLAKE report for KING
CLARK report for KING
SCOTT report for JONES
KING report for
TURNER report for BLAKE
ADAMS report for SCOTT
JAMES report for BLAKE
FORD report for JONES
MILLER report for CLARK
14 rows selected.
其实,oracle对于处理表中有从属关系的记录,提供了一种查询方法,我们称之为层次查询,来看一个例子
scott@DB01> select ename ||' report for '|| prior ename
2 from emp
3 start with empno=7839 ---规定遍历从属关系树的起点
4 connect by prior empno=mgr; ----规定了遍历的方向
ENAME||'REPORTFOR'||PRIORENAME
--------------------------------
KING report for
JONES report for KING
SCOTT report for JONES
ADAMS report for SCOTT
FORD report for JONES
SMITH report for FORD
BLAKE report for KING
ALLEN report for BLAKE
WARD report for BLAKE
MARTIN report for BLAKE
TURNER report for BLAKE
JAMES report for BLAKE
CLARK report for KING
MILLER report for CLARK
14 rows selected.
为了使查询的结果更具有可读性,我们借助一个函数SYS_CONNECT_BY_PATH来做一下处理,来看结果:
scott@DB01> SELECT LPAD(' ', 2*level-1)||SYS_CONNECT_BY_PATH(ename, '/') "Path"
2 FROM emp
3 START WITH empno = 7839
4 CONNECT BY PRIOR empno=mgr;
Path
-------------------------------------------------------------------------------
/KING
/KING/JONES
/KING/JONES/SCOTT
/KING/JONES/SCOTT/ADAMS
/KING/JONES/FORD
/KING/JONES/FORD/SMITH
/KING/BLAKE
/KING/BLAKE/ALLEN
/KING/BLAKE/WARD
/KING/BLAKE/MARTIN
/KING/BLAKE/TURNER
/KING/BLAKE/JAMES
/KING/CLARK
/KING/CLARK/MILLER
14 rows selected.
通过上面的结果,你是否发现,从属关系更加清晰,也更加直观呢。
当然了对于我们来讲,好看的格式化的输出只是锦上添花,更重要的是语句的执行的速度,也就是性能的考虑
借助于autotrace工具,我们来看语句的执行计划
scott@DB01> set autotrace trace exp
scott@DB01>
scott@DB01> SELECT LPAD(' ', 2*level-1)||SYS_CONNECT_BY_PATH(ename, '/') "Path"
2 FROM emp
3 START WITH empno = 7839
4 CONNECT BY PRIOR empno=mgr;
Execution Plan
----------------------------------------------------------
Plan hash value: 3613731379
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 196 | 3 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| EMP | | | | |
|* 3 | INDEX UNIQUE SCAN | PK_EMP | 1 | 4 | 0 (0)| 00:00:01 |
|* 4 | HASH JOIN | | | | | |
| 5 | CONNECT BY PUMP | | | | | |
| 6 | TABLE ACCESS FULL | EMP | 14 | 196 | 3 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | EMP | 14 | 196 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("EMPNO"=7839)
3 - access("EMPNO"=7839)
4 - access("MGR"=NULL)
scott@DB01> select w.ename||' report for ' ||m.ename "Relations"
2 from emp w,emp m
3 where w.empno=m.empno(+);
Execution Plan
----------------------------------------------------------
Plan hash value: 2199491010
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 280 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 14 | 280 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | EMP | 14 | 140 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 10 | 1 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("W"."EMPNO"="M"."EMPNO"(+))
通过上面的例子我们可以发现,层次查询的速度也要快于自连接操作,当然了,我们在这里的讨论,并没有考虑两种语句对其他资源的影响。