order by
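In MySQL 5.7 and earlier, a GROUP BY col1, col2, ... query is sorted by default, as if the query also contained ORDER BY col1, col2, .... If you add an explicit ORDER BY clause with the same column list, MySQL optimizes it away without any speed penalty, although the sort still happens. If a query includes GROUP BY but you want to avoid the overhead of sorting the result, you can suppress the sort by specifying ORDER BY NULL. (As of MySQL 8.0, GROUP BY no longer sorts implicitly, so ORDER BY NULL is not needed there.) For example: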
INSERT INTO foo
SELECT a, COUNT(*) FROM bar GROUP BY a ORDER BY NULL;
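The optimizer may still choose to sort in order to implement the grouping itself. ORDER BY NULL suppresses sorting of the result, not any sorting that the grouping operation performs to determine the result.
When MySQL executes a sort, it uses one of the following three approaches: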
- sort By index
- file sort
- priority queue
sort By Index
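Because of how a B+tree is laid out, the leaf nodes of a secondary index store the indexed column values together with the corresponding primary key values, in key order, and the leaves are linked as a doubly linked list. In some cases MySQL can therefore use an index to satisfy the ORDER BY clause and avoid the extra sorting that a filesort involves. The examples below use this table: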
mysql> desc orderIndex;
+-------+------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------+------+-----+---------+----------------+
| id | int | NO | PRI | NULL | auto_increment |
| a | int | YES | MUL | NULL | |
| b | int | YES | MUL | NULL | |
| c | int | YES | | NULL | |
+-------+------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
mysql> show index from orderIndex;
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| orderIndex | 0 | PRIMARY | 1 | id | A | 4878 | NULL | NULL | | BTREE | | | YES | NULL |
| orderIndex | 1 | b | 1 | b | A | 4878 | NULL | NULL | YES | BTREE | | | YES | NULL |
| orderIndex | 1 | a_2 | 1 | a | A | 5000 | NULL | NULL | YES | BTREE | | | YES | NULL |
| orderIndex | 1 | a_2 | 2 | b | A | 5000 | NULL | NULL | YES | BTREE | | | YES | NULL |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.00 sec)
For example:
mysql> explain select b from orderIndex order by a limit 10;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | orderIndex | NULL | index | NULL | a_2 | 10 | NULL | 10 | 100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
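Extra shows only Using index, which means the query does not need to go to the primary key index for row data: the secondary index alone satisfies the query (a covering index), and no extra sort is needed. The same holds for a descending sort: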
mysql> explain select b from orderIndex order by a DESC limit 10;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------+
| 1 | SIMPLE | orderIndex | NULL | index | NULL | a_2 | 10 | NULL | 10 | 100.00 | Backward index scan; Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)
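To make the idea concrete, here is a minimal Python sketch, not MySQL's implementation. It models the secondary index on (a, b) as a list of (a, b, id) tuples kept in key order, which is an assumption standing in for the linked B+tree leaves. The names build_index and scan are hypothetical.

```python
from itertools import islice

def build_index(rows):
    """Model a secondary index on (a, b): entries (a, b, id) kept in key order."""
    return sorted((r["a"], r["b"], r["id"]) for r in rows)

def scan(index, limit, descending=False):
    """Walk the 'leaf level' forward or backward; nothing is sorted at query time."""
    entries = reversed(index) if descending else iter(index)
    return list(islice(entries, limit))

rows = [
    {"id": 1, "a": 30, "b": 7, "c": 0},
    {"id": 2, "a": 10, "b": 9, "c": 0},
    {"id": 3, "a": 20, "b": 8, "c": 0},
]
index = build_index(rows)        # the ordering cost is paid when the index is maintained
first_two = scan(index, 2)       # like: SELECT b FROM t ORDER BY a LIMIT 2
last_two = scan(index, 2, True)  # like: SELECT b FROM t ORDER BY a DESC LIMIT 2
```

Every entry already carries a, b and id, so the sketch never touches the original rows. That is the covering-index case, and the backward walk corresponds to the Backward index scan note in Extra.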
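Some queries cannot be served by a covering index, yet can still rely on the index to avoid an extra sort. The index can be used even when ORDER BY does not match it exactly, as long as all unused portions of the index and all extra ORDER BY columns are constants in the WHERE clause. For example: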
mysql> explain select * from orderIndex where a=100 order by b;
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+-------+
| 1 | SIMPLE | orderIndex | NULL | ref | a_2 | a_2 | 5 | const | 1 | 100.00 | NULL |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from orderIndex where a=100 order by b DESC;
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+---------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+---------------------+
| 1 | SIMPLE | orderIndex | NULL | ref | a_2 | a_2 | 5 | const | 1 | 100.00 | Backward index scan |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+---------------------+
1 row in set, 1 warning (0.00 sec)
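The reason a constant prefix is enough can be sketched in a few lines of Python. This is an illustration under assumptions rather than MySQL internals: the index on (a, b) is modeled as a sorted list of (a, b, id) tuples, and rows_for_a is a hypothetical helper.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on (a, b): entries are (a, b, id), kept sorted by (a, b).
index = sorted([(100, 5, 11), (90, 1, 12), (100, 2, 13), (110, 3, 14), (100, 9, 15)])

def rows_for_a(index, a_value):
    """Return the entries with a == a_value; within that range they are ordered by b."""
    lo = bisect_left(index, (a_value,))                # first entry with a >= a_value
    hi = bisect_right(index, (a_value, float("inf")))  # just past the last a == a_value
    return index[lo:hi]

matches = rows_for_a(index, 100)
ids_by_b = [entry[2] for entry in matches]                  # WHERE a=100 ORDER BY b
ids_by_b_desc = [entry[2] for entry in reversed(matches)]   # WHERE a=100 ORDER BY b DESC
```

Once a is fixed, the matching entries form one contiguous range of the index, and inside that range the order is decided by b alone, so reading the range forward or backward yields the rows already sorted.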
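If the index does not contain all the columns the query accesses, it is used only when index access is cheaper than the other access methods.
# The index contains every queried column: no extra sort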
mysql> explain select a,b from orderIndex order by a, b;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | orderIndex | NULL | index | NULL | a_2 | 10 | NULL | 5000 | 100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
# The index does not contain every queried column: an extra sort is used
mysql> explain select * from orderIndex order by a, b;
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | orderIndex | NULL | ALL | NULL | NULL | NULL | NULL | 5000 | 100.00 | Using filesort |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
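Sometimes, though, the index alone cannot satisfy the ORDER BY, and an extra sort is unavoidable:
# The index covers the query, but ORDER BY b does not follow the index order (a, b)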
mysql> explain select a,b from orderIndex order by b;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | orderIndex | NULL | index | NULL | a_2 | 10 | NULL | 5000 | 100.00 | Using index; Using filesort |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)
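Note that the plans above can change with circumstances. For example, when the optimizer estimates that a full table scan followed by a sort is cheaper than walking the index and looking up each row in the table, it uses the extra sort.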
file sort
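As described above, when the index cannot satisfy the ORDER BY, MySQL needs an extra sorting step to do the job, and it implements that step with filesort. Note that filesort does not necessarily mean a temporary file is created. MySQL allocates an in-memory buffer for the filesort, whose size is controlled by the sort_buffer_size system variable. Each session can change its own session value of this variable as needed, either to avoid excessive memory use or to allocate more memory. Only when the data to sort exceeds the sort buffer is it spilled to temporary files. This matches MySQL's general design approach: keep data in memory when it fits, fall back to disk files when it does not, and keep I/O to a minimum.
That leaves two cases:
- the data to sort is smaller than sort_buffer_size
- the data to sort is larger than sort_buffer_size
The two cases use different strategies. In the first, the sort runs entirely in memory, and the algorithm used depends on the specific statement. In the second, the data is split into multiple chunks on disk, each chunk is sorted before it is written out, and the sorted chunks are then combined with a merge sort. A quick check: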
mysql> set optimizer_trace='enabled=on';
mysql> select count(*) from orderIndex;
+----------+
| count(*) |
+----------+
| 5000 |
+----------+
1 row in set (0.02 sec)
mysql> show variables like 'sort_buffer_size';
+------------------+--------+
| Variable_name | Value |
+------------------+--------+
| sort_buffer_size | 262144 |
+------------------+--------+
1 row in set (0.00 sec)
filesort_priority_queue
At this point the orderIndex table holds 5000 rows and sort_buffer_size = 262144. Next, sort the whole table by column c. Since c has no index, this is certain to trigger a filesort:
mysql> select * from orderIndex order by c;
...
5000 rows in set (0.01 sec)
mysql> select * from information_schema.OPTIMIZER_TRACE\G
{
...
"join_execution": {
"select#": 1,
"steps": [
{
"sorting_table": "orderIndex",
"filesort_information": [
{
"direction": "asc",
"expression": "`orderIndex`.`c`"
}
],
"filesort_priority_queue_optimization": {
"usable": false,
"cause": "not applicable (no LIMIT)"
},
"filesort_execution": [
],
"filesort_summary": {
"memory_available": 262144,
"key_size": 9,
"row_size": 26,
"max_rows_per_buffer": 7710,
"num_rows_estimate": 11234,
"num_rows_found": 5000,
"num_initial_chunks_spilled_to_disk": 0,
"peak_memory_used": 221184,
"sort_algorithm": "std::stable_sort",
"unpacked_addon_fields": "skip_heuristic",
"sort_mode": "<fixed_sort_key, additional_fields>"
}
}
]
}
...
}
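The optimizer trace shows row_size=26 and num_initial_chunks_spilled_to_disk=0: the whole table does not fill the sort buffer, so it is sorted directly in memory (the trace reports std::stable_sort as the algorithm). The filesort_priority_queue_optimization entry shows that priority queue sorting was not used, because the query has no LIMIT. Now run the query again with a LIMIT: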
mysql> select * from orderIndex order by c limit 5000;
... ...
5000 rows in set (0.00 sec)
mysql> select * from information_schema.OPTIMIZER_TRACE\G
{
... ...
"join_execution": {
"select#": 1,
"steps": [
{
"sorting_table": "orderIndex",
"filesort_information": [
{
"direction": "asc",
"expression": "`orderIndex`.`c`"
}
],
"filesort_priority_queue_optimization": {
"limit": 5000,
"chosen": true
},
"filesort_execution": [
],
"filesort_summary": {
"memory_available": 262144,
"key_size": 9,
"row_size": 26,
"max_rows_per_buffer": 5001,
"num_rows_estimate": 11234,
"num_rows_found": 5000,
"num_initial_chunks_spilled_to_disk": 0,
"peak_memory_used": 170034,
"sort_algorithm": "std::stable_sort",
"unpacked_addon_fields": "using_priority_queue",
"sort_mode": "<fixed_sort_key, additional_fields>"
}
}
]
}
... ...
}
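This time filesort_priority_queue_optimization was chosen. In other words, the priority queue optimization is only considered when the query has a LIMIT. The priority queue is implemented as a heap, and it works as follows:
- Scan the table, inserting the select list columns from each selected row into the queue in sorted order. If the queue is full, bump out the last row in the sort order.
- Return the first N rows from the queue. (If an offset was specified, skip the first offset rows and return the next N rows.)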
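Conceptually, the heap keeps the row that sorts last among the retained rows at its root, so that it can be evicted cheaply: for an ascending sort that is the largest retained key (a max-heap), and for a descending sort the smallest (a min-heap).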
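The queue steps above can be sketched in Python as a bounded heap. This is a simplified model under assumptions, not MySQL's code: top_n is a hypothetical name, sort keys are assumed to be numeric so they can be negated, and only ascending order is shown.

```python
import heapq

def top_n(rows, key, n, offset=0):
    """Model of ORDER BY key LIMIT offset, n with a queue of at most offset + n rows."""
    capacity = offset + n
    if capacity <= 0:
        return []
    heap = []  # heap[0] is always the retained row that sorts last (the eviction candidate)
    for seq, row in enumerate(rows):
        # Negated key puts the largest key at the root; -seq breaks ties in favour
        # of earlier rows and guarantees the row objects themselves are never compared.
        item = (-key(row), -seq, row)
        if len(heap) < capacity:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # The new row sorts before the current last row: bump that one out.
            heapq.heapreplace(heap, item)
    ordered = sorted(heap, reverse=True)  # ascending by key, then by arrival order
    return [row for _, _, row in ordered][offset:]

rows = [{"id": i, "c": c} for i, c in enumerate([5, 3, 8, 1, 9, 2], start=1)]
page = top_n(rows, key=lambda r: r["c"], n=2, offset=1)
```

The queue never holds more than offset + n rows, which is why this strategy only pays off when that many rows fit in the sort buffer.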
Optimization Using filesort
Now lower sort_buffer_size to simulate the case where the data to sort is larger than the sort buffer:
mysql> set sort_buffer_size=10240;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show variables like 'sort_buffer_size';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| sort_buffer_size | 32768 |
+------------------+-------+
1 row in set (0.00 sec)
We asked for 10240, but that is below the minimum sort_buffer_size of 32768, so the value was set to the minimum instead.
mysql> select * from orderIndex order by c;
... ...
5000 rows in set (0.00 sec)
mysql> select * from information_schema.OPTIMIZER_TRACE\G
{
...
"join_execution": {
"select#": 1,
"steps": [
{
"sorting_table": "orderIndex",
"filesort_information": [
{
"direction": "asc",
"expression": "`orderIndex`.`c`"
}
],
"filesort_priority_queue_optimization": {
"usable": false,
"cause": "not applicable (no LIMIT)"
},
"filesort_execution": [
],
"filesort_summary": {
"memory_available": 32768,
"key_size": 9,
"row_size": 26,
"max_rows_per_buffer": 1260,
"num_rows_estimate": 11234,
"num_rows_found": 5000,
"num_initial_chunks_spilled_to_disk": 6,
"peak_memory_used": 33256,
"sort_algorithm": "std::stable_sort",
"unpacked_addon_fields": "skip_heuristic",
"sort_mode": "<fixed_sort_key, additional_fields>"
}
}
]
}
...
}
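Here num_initial_chunks_spilled_to_disk = 6 (the number of chunks that exist before the merge starts) clearly shows that temporary files were used for a sort outside of memory. The flow is as follows:
- Read the rows that match the WHERE clause.
- For each row, store in the sort buffer a tuple consisting of the sort key value and the columns referenced by the query.
- When the sort buffer becomes full, sort the tuples by sort key value in memory and write them to a temporary file.
- After merge-sorting the temporary file, retrieve the rows in sorted order, reading the required columns directly from the sorted tuples rather than accessing the table a second time.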
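A minimal Python sketch of this flow follows. It is a model under assumptions, not MySQL's implementation: in-memory lists stand in for the temporary files, max_rows_per_buffer stands in for the sort buffer capacity, values are assumed to be comparable (no NULLs), and filesort_with_addon_fields is a hypothetical name.

```python
import heapq

def filesort_with_addon_fields(rows, sort_key, columns, max_rows_per_buffer):
    """Sort in bounded chunks, 'spill' each sorted chunk, then merge the chunks."""
    chunks = []   # stands in for the sorted chunks written to temporary files
    buffer = []   # stands in for the sort buffer
    for row in rows:
        # Each tuple carries the sort key plus the columns the query needs,
        # so the table does not have to be read a second time.
        buffer.append((row[sort_key], tuple(row[c] for c in columns)))
        if len(buffer) == max_rows_per_buffer:
            chunks.append(sorted(buffer))
            buffer = []
    if buffer:
        chunks.append(sorted(buffer))
    merged = heapq.merge(*chunks)  # k-way merge of the already sorted chunks
    return [dict(zip(columns, addon)) for _, addon in merged], len(chunks)

rows = [{"id": i, "c": c} for i, c in enumerate([42, 7, 19, 3, 88, 61, 5], start=1)]
result, num_chunks = filesort_with_addon_fields(rows, "c", ["id", "c"], 3)
```

The trade-off is tuple size: carrying the extra columns makes each tuple larger, so fewer rows fit in the buffer and more chunks are produced, in exchange for skipping the second pass over the table.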
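Note that the flow above is the optimized algorithm (the MySQL manual calls it the modified filesort algorithm, and already documents it for releases older than 5.6). The original algorithm works as follows:
- Read all rows according to key or by table scanning. Skip rows that do not match the WHERE clause.
- For each row, store in the sort buffer a tuple consisting of a pair of values (the sort key value and the row ID).
- If all pairs fit into the sort buffer, no temporary file is created. Otherwise, when the sort buffer becomes full, run a quicksort on it in memory and write it to a temporary file. Save a pointer to the sorted block.
- Repeat the preceding steps until all rows have been read.
- Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
- Repeat the merge until fewer than MERGEBUFF2 (15) blocks remain.
- On the last multi-merge, only the row ID (the last part of the value pair) is written to a result file.
- Read the rows in sorted order using the row IDs in the result file. To optimize this, read in a large block of row IDs, sort them, and use them to read the rows in sorted order into a row buffer. The row buffer size is the read_rnd_buffer_size system variable value. The code for this step is in the sql/records.cc source file.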
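The original algorithm can be modeled in Python as follows. This is a sketch under assumptions: the table is a dict from row ID to row, lists stand in for temporary files, the batching by read_rnd_buffer_size in the last step is not modeled, and original_filesort is a hypothetical name. MERGEBUFF and MERGEBUFF2 take the values quoted in the steps above.

```python
import heapq

MERGEBUFF = 7    # maximum number of runs merged together in one intermediate merge
MERGEBUFF2 = 15  # intermediate merging stops once fewer runs than this remain

def original_filesort(table, sort_key, max_pairs_per_buffer):
    """table maps row ID -> row dict. Returns the rows ordered by sort_key."""
    runs, buffer = [], []
    for row_id, row in table.items():
        buffer.append((row[sort_key], row_id))  # only the sort key and the row ID
        if len(buffer) == max_pairs_per_buffer:
            runs.append(sorted(buffer))
            buffer = []
    if buffer:
        runs.append(sorted(buffer))
    # Intermediate passes: merge groups of up to MERGEBUFF runs into longer runs.
    while len(runs) >= MERGEBUFF2:
        runs = [list(heapq.merge(*runs[i:i + MERGEBUFF]))
                for i in range(0, len(runs), MERGEBUFF)]
    # Final merge: keep only the row IDs, in sorted order.
    row_ids = [row_id for _, row_id in heapq.merge(*runs)]
    # Second access to the table, one lookup per row ID.
    return [table[row_id] for row_id in row_ids]

table = {i: {"id": i, "c": (i * 37) % 101} for i in range(1, 41)}
ordered_rows = original_filesort(table, "c", max_pairs_per_buffer=2)
```

Compared with the tuple-carrying variant, the pairs are small, so more of them fit in the buffer, but every row is read twice: once to build the pairs and once more through its row ID.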
The max_length_for_sort_data system variable decides whether the original or the optimized algorithm is used. Let's try to simulate the original algorithm:
mysql> show variables like 'max_length_for_sort_data';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| max_length_for_sort_data | 4096 |
+--------------------------+-------+
1 row in set (0.01 sec)
mysql> set max_length_for_sort_data=16;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show variables like 'max_length_for_sort_data';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| max_length_for_sort_data | 16 |
+--------------------------+-------+
1 row in set (0.00 sec)
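With max_length_for_sort_data set to 16, row_size=26 exceeds max_length_for_sort_data, which should trigger the original algorithm. Unfortunately this cannot be demonstrated here, because the variable is deprecated as of 8.0.20 and no longer has any effect: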
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.20 |
+-----------+
1 row in set (0.00 sec)
If we now query with LIMIT [M,] N and M + N is too large, the rows to keep can again exceed sort_buffer_size, and the sort has to go outside of memory:
mysql> select * from orderIndex order by c limit 2000, 20;
mysql> select * from information_schema.OPTIMIZER_TRACE\G
{
... ...
"join_execution": {
"select#": 1,
"steps": [
{
"sorting_table": "orderIndex",
"filesort_information": [
{
"direction": "asc",
"expression": "`orderIndex`.`c`"
}
],
"filesort_priority_queue_optimization": {
"limit": 2020
},
"filesort_execution": [
],
"filesort_summary": {
"memory_available": 32768,
"key_size": 9,
"row_size": 26,
"max_rows_per_buffer": 1260,
"num_rows_estimate": 11234,
"num_rows_found": 5000,
"num_initial_chunks_spilled_to_disk": 6,
"peak_memory_used": 33256,
"sort_algorithm": "std::stable_sort",
"unpacked_addon_fields": "skip_heuristic",
"sort_mode": "<fixed_sort_key, additional_fields>"
}
}
]
}
... ...
}
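The flow is roughly as follows (unverified):
- Scan the table, repeating these steps through the end of the table: (1) select rows until the sort buffer is filled; (2) write the first N rows in the buffer (M + N rows if M was specified) to a merge file.
- Sort the merge file and return the first N rows. (If M was specified, skip the first M rows and return the next N rows.)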
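Taking the two steps above at face value (the text itself marks them as unverified), they can be sketched in Python like this. Lists stand in for the merge file, max_rows_per_buffer stands in for the sort buffer capacity, and limited_filesort is a hypothetical name. Whether MySQL 8.0 behaves exactly this way is an assumption.

```python
import heapq
from itertools import islice

def limited_filesort(rows, key, offset, n, max_rows_per_buffer):
    """Model of ORDER BY key LIMIT offset, n when the rows do not fit in the buffer."""
    keep = offset + n
    runs, buffer = [], []
    for row in rows:
        buffer.append(row)
        if len(buffer) == max_rows_per_buffer:
            # Sort the full buffer, but write out only its first offset + n rows.
            runs.append(sorted(buffer, key=key)[:keep])
            buffer = []
    if buffer:
        runs.append(sorted(buffer, key=key)[:keep])
    merged = heapq.merge(*runs, key=key)  # merge the truncated, sorted runs
    return list(islice(merged, offset, offset + n))

rows = list(range(20, 0, -1))
page = limited_filesort(rows, key=lambda r: r, offset=3, n=4, max_rows_per_buffer=6)
```

Truncating each run is safe because any row in the overall first offset + n must also be among the first offset + n rows of its own run.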