  • Finding bad indexes in a MySQL database

    For the demonstration, first create two tables that contain bad indexes and load some data into them.

    mysql> show create table test1\G
    *************************** 1. row ***************************
           Table: test1
    Create Table: CREATE TABLE `test1` (
      `id` int(11) NOT NULL,
      `f1` int(11) DEFAULT NULL,
      `f2` int(11) DEFAULT NULL,
      `f3` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `k1` (`f1`,`id`),
      KEY `k2` (`id`,`f1`),
      KEY `k3` (`f1`),
      KEY `k4` (`f1`,`f3`),
      KEY `k5` (`f1`,`f3`,`f2`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.00 sec)

    mysql> show create table test2\G
    *************************** 1. row ***************************
           Table: test2
    Create Table: CREATE TABLE `test2` (
      `id1` int(11) NOT NULL DEFAULT '0',
      `id2` int(11) NOT NULL DEFAULT '0',
      `b` int(11) DEFAULT NULL,
      PRIMARY KEY (`id1`,`id2`),
      KEY `k1` (`b`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.00 sec)

    mysql> select count(*) from test2 group by b;
    +----------+
    | count(*) |
    +----------+
    |       32 |
    |       17 |
    +----------+
    2 rows in set (0.00 sec)

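    1. Indexes that contain the primary key

    InnoDB tables are clustered, so every secondary index already carries the primary key. An index such as (f1, id) does no real harm, but it suggests that its author does not understand how MySQL indexes work. An index such as (id, f1), on the other hand, is redundant, because lookups by id are already served by the primary key; it wastes storage and slows down data updates. A single SQL statement finds every index that contains the primary key.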

    mysql> select c.*, pk from
        -> (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where index_name != 'PRIMARY' and table_schema != 'mysql'
        -> group by table_schema, table_name, index_name) c,
        -> (select table_schema, table_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') pk
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where index_name = 'PRIMARY' and table_schema != 'mysql'
        -> group by table_schema, table_name) p
        -> where c.table_name = p.table_name and c.table_schema = p.table_schema and c.cols like concat('%', pk, '%');
    +--------------+------------+------------+---------+------+
    | table_schema | table_name | index_name | cols    | pk   |
    +--------------+------------+------------+---------+------+
    | test         | test1      | k1         | |f1|id| | |id| |
    | test         | test1      | k2         | |id|f1| | |id| |
    +--------------+------------+------------+---------+------+
    2 rows in set (0.04 sec)
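
    The matching rule in the query above can be tried without a database. The following is a minimal Python sketch, not part of the original article: the helper names `encode` and `contains_primary_key` are made up for illustration, and the index definitions are copied by hand from `test1`. It rebuilds the same `|col|col|` strings and tests whether the primary key string occurs inside the index string.

```python
def encode(cols):
    """Join column names as '|a|b|', the same shape the SQL builds
    with concat() and group_concat()."""
    return "|" + "|".join(cols) + "|"


def contains_primary_key(index_cols, pk_cols):
    """True when the primary key columns appear, in order and adjacent,
    inside the index column list."""
    return encode(pk_cols) in encode(index_cols)


# Secondary indexes of test1, copied by hand from the CREATE TABLE output.
test1_indexes = {
    "k1": ["f1", "id"],
    "k2": ["id", "f1"],
    "k3": ["f1"],
    "k4": ["f1", "f3"],
    "k5": ["f1", "f3", "f2"],
}
test1_pk = ["id"]

flagged = [name for name, cols in test1_indexes.items()
           if contains_primary_key(cols, test1_pk)]
print(flagged)  # ['k1', 'k2']
```

    The `|` delimiters are what keep a column named `uid` from matching a primary key named `id`. One difference from the SQL version: `LIKE` treats `_` in a column name as a single-character wildcard, while the plain substring test here does not.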

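    2. Duplicate index prefixes

    An index whose column list is a leading prefix of another index can be replaced completely by that longer index, so it is redundant. Redundant indexes waste storage and slow down data updates. These can also be found with a single SQL statement; in each result row, the second index is a prefix of the first.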

    mysql> select c1.table_schema, c1.table_name, c1.index_name, c1.cols, c2.index_name, c2.cols from
        -> (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where table_schema != 'mysql' and index_name != 'PRIMARY'
        -> group by table_schema, table_name, index_name) c1,
        -> (select table_schema, table_name, index_name, concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where table_schema != 'mysql' and index_name != 'PRIMARY'
        -> group by table_schema, table_name, index_name) c2
        -> where c1.table_name = c2.table_name and c1.table_schema = c2.table_schema and c1.cols like concat(c2.cols, '%') and c1.index_name != c2.index_name;
    +--------------+------------+------------+------------+------------+---------+
    | table_schema | table_name | index_name | cols       | index_name | cols    |
    +--------------+------------+------------+------------+------------+---------+
    | test         | test1      | k1         | |f1|id|    | k3         | |f1|    |
    | test         | test1      | k4         | |f1|f3|    | k3         | |f1|    |
    | test         | test1      | k5         | |f1|f3|f2| | k3         | |f1|    |
    | test         | test1      | k5         | |f1|f3|f2| | k4         | |f1|f3| |
    +--------------+------------+------------+------------+------------+---------+
    4 rows in set (0.02 sec)
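
    The prefix test can be illustrated on plain column lists. This is a rough Python sketch under the assumption that the index definitions are already available as lists; `is_prefix` and `redundant_prefix_pairs` are hypothetical helper names, not anything MySQL provides. Like the SQL, it also reports two indexes with identical columns as prefixes of each other.

```python
def is_prefix(short_cols, long_cols):
    """True when short_cols equals the leading columns of long_cols."""
    return long_cols[:len(short_cols)] == short_cols


def redundant_prefix_pairs(indexes):
    """Return (covering_index, redundant_index) pairs for one table,
    where the redundant index is a leading prefix of the covering one."""
    pairs = []
    for long_name, long_cols in indexes.items():
        for short_name, short_cols in indexes.items():
            if long_name != short_name and is_prefix(short_cols, long_cols):
                pairs.append((long_name, short_name))
    return pairs


# Secondary indexes of test1, copied by hand from the CREATE TABLE output.
test1_indexes = {
    "k1": ["f1", "id"],
    "k2": ["id", "f1"],
    "k3": ["f1"],
    "k4": ["f1", "f3"],
    "k5": ["f1", "f3", "f2"],
}

pairs = redundant_prefix_pairs(test1_indexes)
for covering, redundant in pairs:
    print(covering, "covers", redundant)
```

    For `test1` this yields the same four pairs as the query result: k3 is covered by k1, k4 and k5, and k4 is covered by k5.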

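    3. Low-selectivity indexes

    An index like this still forces a scan over a large share of the rows, so it is usually ignored at query time. In some situations it is still useful, so each hit needs further analysis. The query below lists indexes whose selectivity (index cardinality divided by primary key cardinality) is below 10%; adjust the threshold as needed. Note that InnoDB reports cardinality as an estimate, which is presumably why k1 shows 4 below although test2 holds only two distinct values of b.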

    mysql> select p.table_schema, p.table_name, c.index_name, c.car, p.car total from
        -> (select table_schema, table_name, index_name, max(cardinality) car
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where index_name != 'PRIMARY'
        -> group by table_schema, table_name, index_name) c,
        -> (select table_schema, table_name, max(cardinality) car
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where index_name = 'PRIMARY' and table_schema != 'mysql'
        -> group by table_schema, table_name) p
        -> where c.table_name = p.table_name and c.table_schema = p.table_schema and p.car > 0 and c.car / p.car < 0.1;
    +--------------+------------+------------+------+-------+
    | table_schema | table_name | index_name | car  | total |
    +--------------+------------+------------+------+-------+
    | test         | test2      | k1         |    4 |    49 |
    +--------------+------------+------------+------+-------+
    1 row in set (0.04 sec)
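
    The filter condition is simple arithmetic, shown here as a small Python sketch. The function name `is_low_selectivity` and its `threshold` parameter are illustrative assumptions; the numbers 4 and 49 are the `car` and `total` values from the result above.

```python
def is_low_selectivity(index_cardinality, pk_cardinality, threshold=0.1):
    """Mirror the SQL condition: p.car > 0 and c.car / p.car < threshold."""
    if pk_cardinality <= 0:
        return False  # empty table or no statistics: nothing to judge
    return index_cardinality / pk_cardinality < threshold


# Values reported for test2.k1 in the query result.
ratio = 4 / 49
print(round(ratio, 3))              # 0.082
print(is_low_selectivity(4, 49))    # True
print(is_low_selectivity(49, 49))   # False
```

    Raising `threshold` widens the net; for example 0.3 would also flag an index whose cardinality is a quarter of the primary key's.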

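    4. Composite primary keys

    Because InnoDB tables are clustered, every secondary index includes the primary key value. A composite primary key therefore makes secondary indexes large, which hurts secondary index query performance as well as update performance. As before, each case needs further analysis.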

    mysql> select table_schema, table_name, group_concat(column_name order by seq_in_index separator ',') cols, max(seq_in_index) len
        -> from INFORMATION_SCHEMA.STATISTICS
        -> where index_name = 'PRIMARY' and table_schema != 'mysql'
        -> group by table_schema, table_name having len > 1;
    +--------------+------------+-----------------------------------+------+
    | table_schema | table_name | cols                              | len  |
    +--------------+------------+-----------------------------------+------+
    | test         | test2      | id1,id2                           |    2 |
    +--------------+------------+-----------------------------------+------+
    1 row in set (0.01 sec)
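
    To get a feel for the size effect, here is a back-of-the-envelope Python sketch. It counts only column data, assuming 4 bytes per INT column, and ignores record headers, NULL handling and page overhead, so treat the numbers as a rough comparison rather than actual on-disk sizes. The helper `entry_bytes` and the single-column key `id` in the second call are hypothetical.

```python
INT_BYTES = 4  # storage size of a MySQL INT column


def entry_bytes(index_cols, pk_cols, col_bytes):
    """Column bytes in one secondary index entry: the indexed columns
    plus any primary key columns that are not already in the index."""
    cols = list(index_cols) + [c for c in pk_cols if c not in index_cols]
    return sum(col_bytes[c] for c in cols)


# test2 as defined above: k1 (b) with composite primary key (id1, id2).
composite = entry_bytes(
    ["b"], ["id1", "id2"],
    {"b": INT_BYTES, "id1": INT_BYTES, "id2": INT_BYTES},
)

# Hypothetical alternative: the same index with a single INT key named id.
single = entry_bytes(["b"], ["id"], {"b": INT_BYTES, "id": INT_BYTES})

print(composite, single)  # 12 8
```

    Under these assumptions each k1 entry carries 12 bytes of column data instead of 8, and the gap grows with every extra or wider primary key column.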

    (Cover image from: webfish.se)

  • Original article: https://www.cnblogs.com/starliu/p/4746826.html