公用表达式(Common Table Expressions,简称CTE)
Maria DB 版本为10.2.2以上的才支持 WITH 语法
CTE 介绍
WITH关键字表示公用表表达式(CTE)
它使您可以在查询中多次引用子查询表达式,就好像有一个仅在查询期间存在的临时表一样。
语法(Syntax)
WITH [RECURSIVE] table_reference as (SELECT ...)
SELECT ...
您可以将table_reference用作外部SELECT部分中的任何普通表。您也可以在子查询中使用WITH。WITH也可以与EXPLAIN和SELECT一起使用。
以下是在顶级使用WITH的示例:
WITH t AS (SELECT a FROM t1 WHERE b >= 'c')
SELECT * FROM t2, t WHERE t2.c = t.a;
下面的示例在子查询中使用WITH:
SELECT t1.a, t1.b FROM t1, t2
WHERE t1.a > t2.c
AND t2.c IN(WITH t AS (SELECT * FROM t1 WHERE t1.a < 5)
SELECT t2.c FROM t2, t WHERE t2.c = t.a);
以下是递归CTE的示例:
WITH RECURSIVE ancestors AS
( SELECT * FROM folks
WHERE name="Alex"
UNION
SELECT f.*
FROM folks AS f, ancestors AS a
WHERE f.id = a.father OR f.id = a.mother )
SELECT * FROM ancestors;
公用表表达式(CTE)是标准的SQL功能,本质上是临时命名的结果集
CTE有两种类别:
- 非递归CTE
- 递归CTE
非递归CTE
WITH关键字表示CTE,它具有一个名称,后跟一个正文(主要查询的语句),如下所示
CTE与派生表相似。例如:
WITH engineers AS
( SELECT * FROM employees
WHERE dept = 'Engineering' )
SELECT * FROM engineers
WHERE ...
顶级使用CTE
SELECT * FROM
( SELECT * FROM employees
WHERE dept = 'Engineering' ) AS engineers
WHERE ...
子查询中使用CTE
基本上,非递归CTE是本地查询的VIEW。有许多优点和警告。该语法比嵌套的FROM(SELECT ...)更具可读性。一个CTE可以引用另一个CTE,并且可以从多个地方引用它。
引用另一个CTE的CTE
与嵌套的FROM(SELECT ...)子句相比,使用这种格式可使SQL更具可读性。下面是一个示例:
WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') ),
eu_engineers AS ( SELECT * FROM engineers WHERE country IN('NL',...) )
SELECT
...
FROM eu_engineers;
CTE的多种用途
这可以是“anti-self join”,例如:
WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') )
SELECT * FROM engineers E1
WHERE NOT EXISTS
(SELECT 1 FROM engineers E2
WHERE E2.country=E1.country
AND E2.name <> E1.name );
或者,对于逐年比较,例如:
WITH sales_product_year AS (
SELECT product, YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year )
SELECT *
FROM sales_product_year CUR,
sales_product_year PREV,
WHERE CUR.product=PREV.product
AND CUR.year=PREV.year + 1
AND CUR.total_amt > PREV.total_amt
另一个用途是将个人与他们的团体进行比较。以下是如何执行此操作的示例:
WITH sales_product_year AS (
SELECT product,
YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year
)
SELECT *
FROM sales_product_year S1
WHERE
total_amt >
(SELECT 0.1 * SUM(total_amt)
FROM sales_product_year S2
WHERE S2.year = S1.year)
递归CTE表达式
公用表表达式(CTE)是标准的SQL功能,本质上是临时命名的结果集。CTE最初于1999年出现在SQL标准中,而第一个实现则于2007年开始出现。
SQL通常在递归结构方面很差。
CTE允许查询引用自身。递归CTE将重复执行数据的子集,直到获得完整的结果集。这对于处理分层或树状数据特别有用。max_recursive_iterations 避免了无限循环。
语法示例
WITH RECURSIVE 表示递归CTE,它具有一个名称,后跟一个正文(主要查询),如下所示:
计算方式(Computation)
给出以下结构:
首先执行查询的锚点部分:
接下来,执行查询的递归部分:
Summary so far(到目前为止的总结)
with recursive R as (
select anchor_data
union [all]
select recursive_part
from R, ...
)
select ...
- 计算anchor_data
- 计算recursive_part以获取新数据
- 如果(新数据为非空)转到2;
CAST避免截断数据
正如MariaDB和SQL标准当前实现的那样,如果数据转换不正确,数据可能会被截断。如果CTE的递归部分为列生成的值比CTE的非递归部分宽,则必须将列CAST正确的宽度。
Examples
传递闭包-确定总线目的地
样本数据:
CREATE TABLE bus_routes (origin varchar(50), dst varchar(50));
INSERT INTO bus_routes VALUES
('New York', 'Boston'),
('Boston', 'New York'),
('New York', 'Washington'),
('Washington', 'Boston'),
('Washington', 'Raleigh');
现在,我们要返回以纽约(New York)为起点的巴士目的地:
WITH RECURSIVE bus_dst as (
SELECT origin as dst FROM bus_routes WHERE origin='New York'
UNION
SELECT bus_routes.dst FROM bus_routes, bus_dst WHERE bus_dst.dst= bus_routes.origin
)
SELECT * FROM bus_dst;
+------------+
| dst |
+------------+
| New York |
| Boston |
| Washington |
| Raleigh |
+------------+
上面的示例计算如下:
首先,计算anchor 数据:
- 从纽约(New York)出发
- 添加了波士顿(Boston)和华盛顿(Washington)
接下来,递归部分:
- 从波士顿(Boston)出发,然后从华盛顿(Washington)出发
- 罗利(Raleigh)被添加
- UNION排除已经存在的节点。
计算路径-确定总线(Bus)路线
这次,我们尝试获取诸如“纽约->华盛顿->罗利”之类的巴士路线。
New York -> Washington -> Raleigh
使用与上一个示例相同的样本数据:
WITH RECURSIVE paths (cur_path, cur_dest) AS (
SELECT origin, origin FROM bus_routes WHERE origin='New York'
UNION
SELECT CONCAT(paths.cur_path, ',', bus_routes.dst), bus_routes.dst
FROM paths, bus_routes
WHERE paths.cur_dest = bus_routes.origin AND
NOT FIND_IN_SET(bus_routes.dst, paths.cur_path)
)
SELECT * FROM paths;
+-----------------------------+------------+
| cur_path | cur_dest |
+-----------------------------+------------+
| New York | New York |
| New York,Boston | Boston |
| New York,Washington | Washington |
| New York,Washington,Boston | Boston |
| New York,Washington,Raleigh | Raleigh |
+-----------------------------+------------+
CAST避免数据被截断
在下面的示例中,数据被截断,因为结果没有明确地转换为足够宽的类型:
WITH RECURSIVE tbl AS (
SELECT NULL AS col
UNION
SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
+------+
| col |
+------+
| NULL |
| |
+------+
明确使用CAST来克服此问题:
WITH RECURSIVE tbl AS (
SELECT CAST(NULL AS CHAR(50)) AS col
UNION SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT * FROM tbl;
+---------------------+
| col |
+---------------------+
| NULL |
| THIS NEVER SHOWS UP |
+---------------------+