zoukankan      html  css  js  c++  java
  • 1226关于count(*)不走主键索引反而走二级索引

    转自 http://www.2cto.com/database/201508/433975.html

    mysqlcount(*)会选哪个索引?
    2015-08-19      0 个评论    来源:Database、Code  
    收藏    我要投稿
     

    今天在查询一个表行数的时候,发现count(1)和count(*)执行效率居然是一样的。这跟Oracle还是有区别的。遂查看两种方式的执行计划:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    mysql> select count(1) from customer;
    +----------+
    | count(1) |
    +----------+
    |   150000 |
    +----------+
    1 row in set (0.03 sec)
     
    mysql> flush tables;
    Query OK, 0 rows affected (0.00 sec)
     
    mysql> select count(*) from customer;
    +----------+
    | count(*) |
    +----------+
    |   150000 |
    +----------+
    1 row in set (0.03 sec)

    查看执行计划:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    mysql> explain select count(1) from customer;
    +----+-------------+----------+-------+---------------+---------------+---------+------+--------+-------------+
    | id | select_type | table    | type  | possible_keys | key           | key_len | ref  | rows   | Extra       |
    +----+-------------+----------+-------+---------------+---------------+---------+------+--------+-------------+
    |  1 | SIMPLE      | customer | index | NULL          | i_c_nationkey | 5       | NULL | 151191 | Using index |
    +----+-------------+----------+-------+---------------+---------------+---------+------+--------+-------------+
    1 row in set (0.00 sec)
     
    mysql> explain select count(*) from customer;
    +----+-------------+----------+-------+---------------+---------------+---------+------+--------+-------------+
    | id | select_type | table    | type  | possible_keys | key           | key_len | ref  | rows   | Extra       |
    +----+-------------+----------+-------+---------------+---------------+---------+------+--------+-------------+
    |  1 | SIMPLE      | customer | index | NULL          | i_c_nationkey | 5       | NULL | 151191 | Using index |
    +----+-------------+----------+-------+---------------+---------------+---------+------+--------+-------------+
    1 row in set (0.00 sec)
     
    mysql> show index from customer;
    +----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | Table    | Non_unique | Key_name      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
    +----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | customer |          0 | PRIMARY       |            1 | c_custkey   | A         |      150525 |     NULL | NULL   |      | BTREE      |         |               |
    | customer |          1 | i_c_nationkey |            1 | c_nationkey | A         |          47 |     NULL | NULL   | YES  | BTREE      |         |               |
    +----------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    2 rows in set (0.08 sec)

    发现不管是count(1)或count(*)都是走的i_c_nationkey这个索引。平时我们检索数据的时候肯定是主键索引效率高,那么我们强制主键索引来看看:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    mysql> select count(*) from customer force index(PRIMARY);
    +----------+
    | count(*) |
    +----------+
    |   150000 |
    +----------+
    1 row in set (0.68 sec)
    mysql> explain select count(*) from customer force index(PRIMARY);
    +----+-------------+----------+-------+---------------+---------+---------+------+--------+-------------+
    | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows   | Extra       |
    +----+-------------+----------+-------+---------------+---------+---------+------+--------+-------------+
    |  1 | SIMPLE      | customer | index | NULL          | PRIMARY | 4       | NULL | 150525 | Using index |
    +----+-------------+----------+-------+---------------+---------+---------+------+--------+-------------+
    1 row in set (0.00 sec)

    可以看到走主键索引的时候效率比较差。那么是为什么呢。
    平时我们检索一列的时候,基本上等值或范围查询,那么索引基数大的索引必然效率很高。但是在做count(*)的时候并没有检索具体的一行或者一个范围。那么选择基数小的索引对
    count操作效率会更高。在做count操作的时候,mysql会遍历每个叶子节点,所以基数越小,效率越高。mysql非聚簇索引叶子节点保存的主键ID,所以需要检索两遍索引。但是这里相对于遍历主键索引。及时检索两遍索引效率也比单纯的检索主键索引快。
    那么再以一个表作为证明:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    mysql> explain select count(*) from lineitem;
    +----+-------------+----------+-------+---------------+--------------+---------+------+---------+-------------+
    | id | select_type | table    | type  | possible_keys | key          | key_len | ref  | rows    | Extra       |
    +----+-------------+----------+-------+---------------+--------------+---------+------+---------+-------------+
    |  1 | SIMPLE      | lineitem | index | NULL          | i_l_shipdate | 4       | NULL | 6008735 | Using index |
    +----+-------------+----------+-------+---------------+--------------+---------+------+---------+-------------+
    1 row in set (0.00 sec)
     
    mysql> show index from lineitem;
    +----------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | Table    | Non_unique | Key_name              | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
    +----------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    | lineitem |          0 | PRIMARY               |            1 | l_orderkey    | A         |     2997339 |     NULL | NULL   |      | BTREE      |         |               |
    | lineitem |          0 | PRIMARY               |            2 | l_linenumber  | A         |     5994679 |     NULL | NULL   |      | BTREE      |         |               |
    | lineitem |          1 | i_l_shipdate          |            1 | l_shipDATE    | A         |        5208 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_suppkey_partkey   |            1 | l_partkey     | A         |      428191 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_suppkey_partkey   |            2 | l_suppkey     | A         |     1998226 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_partkey           |            1 | l_partkey     | A         |      461129 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_suppkey           |            1 | l_suppkey     | A         |       19213 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_receiptdate       |            1 | l_receiptDATE | A         |          17 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_orderkey          |            1 | l_orderkey    | A         |     2997339 |     NULL | NULL   |      | BTREE      |         |               |
    | lineitem |          1 | i_l_orderkey_quantity |            1 | l_orderkey    | A         |     1998226 |     NULL | NULL   |      | BTREE      |         |               |
    | lineitem |          1 | i_l_orderkey_quantity |            2 | l_quantity    | A         |     5994679 |     NULL | NULL   | YES  | BTREE      |         |               |
    | lineitem |          1 | i_l_commitdate        |            1 | l_commitDATE  | A         |        7836 |     NULL | NULL   | YES  | BTREE      |         |               |
    +----------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
    12 rows in set (0.96 sec)
    这里一看l_shipDATE并不是基数最小的呀,殊不知这个统计信息是不准确的。我们用sql看一下。
    1
    2
    3
    4
    5
    6
    7
    mysql> select count(distinct(l_shipDATE)) from lineitem;
    +-----------------------------+
    | count(distinct(l_shipDATE)) |
    +-----------------------------+
    |                        2526 |
    +-----------------------------+
    1 row in set (0.01 sec)
    那么比他小的那些列呢?
    1
    2
    3
    4
    5
    6
    7
    mysql> select count(distinct(l_receiptDATE)) from lineitem;
    +--------------------------------+
    | count(distinct(l_receiptDATE)) |
    +--------------------------------+
    |                           2554 |
    +--------------------------------+
    1 row in set (0.01 sec)

    其他就不看了,这里再次说明mysql选择了基数小的索引。
  • 相关阅读:
    JMeter基础篇--录制web脚本(四)
    jmeter的基础使用(二)
    jmeter安装教程(一)
    delete用法(删除表内容)
    update用法(修改表内容)
    IPy对网段的处理
    netmiko
    读写conf文件
    读写json文件
    excel及数据处理
  • 原文地址:https://www.cnblogs.com/qcfeng/p/6222590.html
Copyright © 2011-2022 走看看