MySQL: Index Principles and Slow Query Optimization

I. Index Principles and Slow Query Optimization

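P.S. All data lives on disk, so querying it inevitably involves I/O operations.

1. Index: a data structure, similar to a book's table of contents. To find data you consult the table of contents first and then go to the data, instead of flipping through the book page by page. This speeds up queries and reduces I/O.

2. In MySQL an index is also called a "key": a data structure that the storage engine uses to locate records quickly.

  * primary key
  * unique key
  * index key

  Note: a foreign key is not meant to speed up queries, so it is outside the scope of these notes.

Of the three keys above, the first two also enforce a constraint in addition to speeding up queries. The last one, index key, carries no constraint and exists only to help you find data quickly.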

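3. The essence

An index keeps narrowing the range of candidate data until only the final result is left, and it turns a random event (flipping page by page) into a sequential one (consult the table of contents, then go to the data).
In other words, with an index we can always locate data in the same fixed way.
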
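To make "narrowing the range" concrete, here is a minimal sketch in Python. It is only a model: a sorted sequence of keys stands in for the index, and the function names `scan_lookup` and `index_lookup` are made up for this illustration. A real B+ tree branches much wider than a binary search, but the idea of discarding most candidates at every step is the same.

```python
def scan_lookup(rows, target):
    """Page-by-page search: return (position, comparisons) with no index."""
    for steps, value in enumerate(rows, start=1):
        if value == target:
            return steps - 1, steps
    return -1, len(rows)


def index_lookup(sorted_keys, target):
    """Halve the candidate range on every step, like walking down an index."""
    lo, hi, steps = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid       # target can only be in the left half (or at mid)
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return (lo if found else -1), steps


keys = range(1, 1_000_001)           # one million sorted keys
print(scan_lookup(keys, 900_000))    # (899999, 900000)
print(index_lookup(keys, 900_000))   # same position, found in at most 20 steps
```

The scan needs 900,000 comparisons, while the range-halving lookup needs no more than 20 for a million keys.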
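4. A table can have several indexes (several tables of contents).

5. Indexes speed up queries, but they also have costs

  1. When a table already holds a large amount of data, creating an index is slow.
  2. Once an index exists, reads get much faster, but writes get slower, because every insert, update, and delete must also maintain the index.

P.S. Do not create indexes casually!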

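6. B+ tree

Why is a B+ tree better suited than a B-tree for database indexes and file indexes in an operating system?

(1) A B+ tree costs less disk I/O

Internal nodes of a B+ tree hold no pointers to the full record behind each key, so they are smaller than B-tree nodes. One disk block therefore holds more keys, and a lookup reads fewer blocks.

(2) B+ tree lookups are more stable

Non-leaf nodes never point at the file content itself. They only index the keys stored in the leaf nodes. Every lookup therefore walks a path from the root to a leaf, all lookups have the same path length, and every key costs about the same to find.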
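A rough back-of-the-envelope sketch of why smaller internal entries give a shorter tree. The numbers are assumptions for illustration only: a 16 KB page (the InnoDB default page size), a 6-byte child pointer per internal entry, and rows of about 1 KB. The function name `bplus_tree_height` is made up for this example.

```python
def bplus_tree_height(total_rows, key_bytes, row_bytes=1024,
                      page_bytes=16 * 1024, pointer_bytes=6):
    """Estimate how many levels a B+ tree needs to address total_rows.

    Assumed layout: each internal entry is one key plus one child pointer,
    and each leaf page stores whole rows of row_bytes each.
    """
    fanout = page_bytes // (key_bytes + pointer_bytes)   # children per internal page
    rows_per_leaf = max(1, page_bytes // row_bytes)      # rows per leaf page
    height, capacity = 1, rows_per_leaf                  # a single leaf page
    while capacity < total_rows:
        capacity *= fanout                               # add one internal level
        height += 1
    return height


print(bplus_tree_height(3_000_000, key_bytes=4))    # 4-byte int key
print(bplus_tree_height(3_000_000, key_bytes=64))   # 64-byte string key
```

Under these assumptions, three million rows fit in 3 levels with a 4-byte integer key but need 4 levels with a 64-byte key, which means one more page read per lookup.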

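### B+ tree

  """
  Only leaf nodes store the real data. The other nodes store only keys that act as signposts.
  The taller the tree, the more steps a lookup takes (one step per level), so the fewer levels the tree has, the faster the query.
  
  A disk block can hold only a limited amount of data.
  Why is the id column recommended as the index?
      It takes little space, so one disk block can hold more entries,
      which lowers the height of the tree and reduces the number of lookups.
  """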
 

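  ### Clustered index (primary key)

  """
  The clustered index is the primary key.
  InnoDB uses two files per table (.frm and .ibd in MySQL 5.7); the primary key index is stored directly in the .ibd table file.
  MyISAM uses three files (.frm, .MYD, .MYI); the index is kept in a file of its own.
  Leaf nodes hold complete rows.
  """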

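  ### Secondary index (unique, index)

  Queries cannot always filter on the primary key. They may filter on other columns such as name or password.

  In that case the clustered index cannot help, so you can add a secondary index (also a B+ tree) on those columns where appropriate.


  """
  Leaf nodes store the primary key value of the matching row.
      First use the secondary index to get the primary key value,
      then look the row up in the clustered index of the primary key.
  """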
  

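  ### Covering index

  The data the query needs is already available in the leaf nodes of the secondary index.

  
  # With a secondary index on name
  # Covering index: name is read straight from the index
  select name from user where name='jason';
  # Not a covering index: age is not stored in the index
  select age from user where name='jason';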
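The difference between the two queries can be modelled with two dictionaries. This is a toy sketch, not how InnoDB is implemented: `clustered` stands in for the primary key tree, `secondary_name` for the index on name, and the sample rows and the helper `select_by_name` are invented for the illustration.

```python
# Clustered index: primary key -> complete row (the leaf holds the whole row).
clustered = {
    1: {"id": 1, "name": "jason", "age": 18},
    2: {"id": 2, "name": "tom", "age": 25},
    3: {"id": 3, "name": "jason", "age": 30},
}

# Secondary index on name: name -> primary key values only, no other columns.
secondary_name = {}
for pk, row in clustered.items():
    secondary_name.setdefault(row["name"], []).append(pk)


def select_by_name(columns, name):
    """Model of: select <columns> from user where name = <name>.

    Returns (rows, clustered_lookups) so the extra trips are visible.
    """
    primary_keys = secondary_name.get(name, [])
    if set(columns) <= {"name", "id"}:
        # Covering index: the secondary leaf already has name and the primary key.
        rows = [{c: (name if c == "name" else pk) for c in columns}
                for pk in primary_keys]
        return rows, 0
    # Not covered: go back to the clustered index once per matching row.
    rows = [{c: clustered[pk][c] for c in columns} for pk in primary_keys]
    return rows, len(primary_keys)


print(select_by_name(["name"], "jason"))  # answered from the index, 0 extra lookups
print(select_by_name(["age"], "jason"))   # 2 extra lookups in the clustered index
```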

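7. Code for testing whether an index is effective

``` mysql
#1. Create the table
create table s1(
id int,
name varchar(20),
gender char(6),
email varchar(50)
);

#2. Create a stored procedure that inserts records in bulk
# use $$ as the statement terminator while defining the procedure
delimiter $$
create procedure auto_insert1()
BEGIN
    declare i int default 1;
    while(i<3000000)do
        insert into s1 values(i,'jason','male',concat('jason',i,'@oldboy'));
        set i=i+1;
    end while;
END$$
# restore the semicolon as the statement terminator
delimiter ;

#3. Inspect the stored procedure
show create procedure auto_insert1\G

#4. Call the stored procedure
call auto_insert1();
```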
  
``` mysql
# With no index on the table at all
select * from s1 where id=30000;
# Use count() to avoid the time spent printing rows
select count(id) from s1 where id = 30000;
select count(id) from s1 where id = 1;

# Make id the primary key
alter table s1 add primary key(id);  # very slow

select count(id) from s1 where id = 1;  # orders of magnitude faster than before the index existed
select count(id) from s1 where name = 'jason';  # still slow


# Range queries
# An index on a column does not guarantee that every query on that column is fast
select count(id) from s1 where id > 1;  # much slower than id = 1
select count(id) from s1 where id >1 and id < 3;
select count(id) from s1 where id > 1 and id < 10000;
select count(id) from s1 where id != 3;

alter table s1 drop primary key;  # drop the primary key to study the name column on its own
select count(id) from s1 where name = 'jason';  # slow again
```
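Why the range queries above differ so much can be seen by counting how many index entries each condition matches. A minimal sketch, assuming the ids 1 to 2,999,999 produced by the stored procedure; the sorted `range` stands in for the index on id, and `entries_in_range` is a name invented for this example.

```python
from bisect import bisect_left, bisect_right

ids = range(1, 3_000_000)  # the id values inserted by auto_insert1, in index order


def entries_in_range(low_exclusive, high_exclusive=None):
    """Count index entries with low_exclusive < id (and id < high_exclusive)."""
    start = bisect_right(ids, low_exclusive)   # first entry greater than the lower bound
    if high_exclusive is None:
        end = len(ids)
    else:
        end = bisect_left(ids, high_exclusive)  # first entry not below the upper bound
    return max(0, end - start)


print(entries_in_range(1))          # id > 1
print(entries_in_range(1, 3))       # id > 1 and id < 3
print(entries_in_range(1, 10000))   # id > 1 and id < 10000
```

Finding the start of the range is cheap in every case. What costs time is the number of entries that still have to be read afterwards: almost the whole table for `id > 1`, a single entry for `id > 1 and id < 3`.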
  
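``` mysql
create index idx_name on s1(name);  # create an index on the name column of s1
select count(id) from s1 where name = 'jason';  # still very slow!
# Recall how a B+ tree works: the data needs high selectivity, but every row
# in this table is 'jason', so the values cannot be told apart at all.
# The tree effectively degenerates into "a single stick".
select count(id) from s1 where name = 'xxx';
# This one is fast: the tree is a single stick, the first comparison does not match, and the search stops right there
select count(id) from s1 where name like 'xxx';
select count(id) from s1 where name like 'xxx%';
select count(id) from s1 where name like '%xxx';  # slow: leftmost-prefix matching cannot use a leading wildcard

# Do not index columns with low selectivity
drop index idx_name on s1;
```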
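The leftmost-prefix behaviour of `like` can be sketched with a sorted list of strings, which is roughly what the leaf level of an index on name looks like. The sample names and the helpers `prefix_range` and `suffix_scan` are invented for this illustration.

```python
from bisect import bisect_left


def prefix_range(sorted_names, prefix):
    """like 'prefix%': jump to the first candidate, then read neighbours."""
    start = bisect_left(sorted_names, prefix)
    end = start
    while end < len(sorted_names) and sorted_names[end].startswith(prefix):
        end += 1          # matches sit next to each other in sorted order
    return sorted_names[start:end]


def suffix_scan(sorted_names, suffix):
    """like '%suffix': sort order is useless, so every entry is checked."""
    return [name for name in sorted_names if name.endswith(suffix)]


names = sorted(["alex", "jack", "jason", "jasper", "tom", "wilson"])
print(prefix_range(names, "jas"))   # ['jason', 'jasper']
print(suffix_scan(names, "son"))    # ['jason', 'wilson']
```

Strings that share a prefix are adjacent in sorted order, so the index can seek to them. Strings that share a suffix are scattered, so the only option left is to look at everything.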
  
``` mysql
# Create an ordinary index on id
create index idx_id on s1(id);
select count(id) from s1 where id = 3;  # fast
select count(id) from s1 where id*12 = 3;  # slow: never put an indexed column inside a calculation

drop index idx_id on s1;
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';
# For a chain of AND conditions like this, MySQL looks for an indexed column with high selectivity (the order in which the conditions are written does not matter), uses it to shrink the candidate set, then checks the remaining conditions
create index idx_name on s1(name);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # no speedup

drop index idx_name on s1;
# Indexing low-selectivity columns such as name and gender does not speed up the query

create index idx_id on s1(id);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # fast: id alone narrows the data down to a single row
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # slow: id > 3 still matches a huge number of rows, and the other columns must then be compared

drop index idx_id on s1;

create index idx_email on s1(email);
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # fast: the email condition alone pins down the result
```
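On the rule that an indexed column must not take part in a calculation: the index is sorted by id, not by id*12, so the expression has to be evaluated row by row. Moving the arithmetic to the constant side keeps the column bare. A small model of that, using a shorter id range than the real table to keep it quick; `scan_with_expression` and `seek_with_constant` are hypothetical names.

```python
from bisect import bisect_left

ids = range(1, 300_000)  # smaller stand-in for the index on s1.id


def scan_with_expression(factor, value):
    """where id * factor = value: the expression is computed for every row."""
    rows_checked, hits = 0, []
    for row_id in ids:
        rows_checked += 1
        if row_id * factor == value:
            hits.append(row_id)
    return hits, rows_checked


def seek_with_constant(factor, value):
    """Rewritten as id = value / factor, so the sorted ids can be searched."""
    if value % factor:
        return [], 0                    # no integer id can satisfy the equation
    target = value // factor
    pos = bisect_left(ids, target)      # one seek into the sorted ids
    found = pos < len(ids) and ids[pos] == target
    return ([target] if found else []), 1


print(scan_with_expression(12, 36))  # ([3], 299999)
print(seek_with_constant(12, 36))    # ([3], 1)
```

Both functions return the same row, but the first one touches every id while the second performs a single seek.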
#### Composite index

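``` mysql
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';
# If all four columns had high selectivity, an index on any one of them would speed the query up
# Index on email, but the query does not filter on email
select count(id) from s1 where name='jason' and gender = 'male' and id > 3;
# Index on name, but the query does not filter on name
select count(id) from s1 where gender = 'male' and id > 3;
# Index on gender, but the query does not filter on gender
select count(id) from s1 where id > 3;

# The result: every column ends up with its own index, none of them is used by the query at hand, and the build time is paid four times
create index idx_all on s1(email,name,gender,id);  # leftmost-prefix rule: put the most selective column on the left
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # faster
```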
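The leftmost-prefix rule for `idx_all` can be written down as a few lines of Python. This is a simplified model of the rule only, and a real optimizer has additional strategies; `usable_prefix` is a name invented for this sketch.

```python
def usable_prefix(index_columns, equality_columns, range_columns=()):
    """Return the leading index columns a query can use.

    Columns are consumed left to right. The first column with no condition
    stops the match, and a range condition is the last column that counts.
    """
    used = []
    for column in index_columns:
        if column in equality_columns:
            used.append(column)
        elif column in range_columns:
            used.append(column)
            break           # nothing to the right of a range column is usable
        else:
            break           # gap in the prefix: stop here
    return used


idx_all = ["email", "name", "gender", "id"]

# where name=... and gender=... and id > 3 and email=...
print(usable_prefix(idx_all, {"name", "gender", "email"}, {"id"}))
# where name=... and gender=... and id > 3   (no condition on email)
print(usable_prefix(idx_all, {"name", "gender"}, {"id"}))
```

The first query can use all four columns of the index. The second one uses none of them because the leftmost column, email, is missing from its conditions.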

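8. The query optimization tool: explain

Execution plan: have MySQL estimate how it will run a statement (the estimate is usually right)
    Access types from slowest to fastest: all < index < range < index_merge < ref_or_null < ref < eq_ref < system/const
    Indexed columns in this example: id, email
    
    Slow:
        select * from userinfo3 where name='alex'
        
        explain select * from userinfo3 where name='alex'
        type: ALL (full table scan)
            select * from userinfo3 limit 1;
    Fast:
        select * from userinfo3 where email='alex'
        type: const (uses the index)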
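When reading many explain results, the ordering above can be turned into a small lookup so the worst access type stands out. A minimal sketch: the ranking is copied from the list above, while `rank` and `worst_type` are helper names made up for this example.

```python
# Access types from slowest to fastest, as listed above.
ORDER = ["all", "index", "range", "index_merge",
         "ref_or_null", "ref", "eq_ref", "const"]
RANK = {name: position for position, name in enumerate(ORDER)}
RANK["system"] = RANK["const"]   # system and const share the top rank


def rank(access_type):
    """Higher rank means a cheaper access type."""
    return RANK[access_type.lower()]


def worst_type(access_types):
    """Pick the most expensive access type among several explain rows."""
    return min(access_types, key=rank)


print(worst_type(["ref", "ALL", "const"]))  # ALL
```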

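9. Basic steps for optimizing a slow query

0. Run the query first to confirm that it really is slow, and remember to set SQL_NO_CACHE (this applies up to MySQL 5.7; the query cache was removed in MySQL 8.0).
1. Query each table on its own with its where conditions to find the table that returns the fewest rows. That is, apply the where clauses of the statement starting from the table with the smallest result set, and test each column of a table separately to see which one has the highest selectivity.
2. Run explain and check whether the execution plan matches the expectation from step 1 (the query should start from the table that yields fewer rows).
3. For statements of the form order by ... limit, let the sorted table be queried first.
4. Understand how the business side uses the query.
5. When adding indexes, follow the main index design principles.
6. Check the result. If it does not meet expectations, start the analysis again from step 0.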
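For step 1, "selectivity" can be given a number: distinct values divided by total rows. A value near 1 means the column separates rows well, a value near 0 means it barely separates them. A minimal sketch on a 1,000-row sample shaped like the s1 test table; the function name `selectivity` is invented for this example.

```python
def selectivity(values):
    """Distinct values divided by total values (0.0 for an empty column)."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0


# A 1,000-row sample in the shape of the s1 test table.
names = ["jason"] * 1000
emails = [f"jason{i}@oldboy" for i in range(1, 1001)]

print(selectivity(names))    # 0.001
print(selectivity(emails))   # 1.0
```

On the real table the same figure can be obtained by comparing `count(distinct column)` with `count(*)`.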

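10. Managing the slow query log

Slow query log
            - execution time > 10 (seconds; long_query_time)
            - queries that do not use an index (log_queries_not_using_indexes)
            - log file path (slow_query_log_file)
            
        Configuration:
            - In memory
                show variables like '%query%';
                show variables like '%queries%';
                set global variable_name = value
            - In the configuration file
                mysqld --defaults-file='E:\wupeiqi\mysql-5.7.16-winx64\mysql-5.7.16-winx64\my-default.ini'
                
                Contents of my.conf:
                    slow_query_log = ON
                    slow_query_log_file = D:/....
                    
                Note: after changing the configuration file, the service must be restarted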

Original article: https://www.cnblogs.com/guojieying/p/13644449.html