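  • Index Principles and Slow Query Optimization

      I. MySQL Index Management

         1. Purpose

         (1). The purpose of an index is to speed up lookups.
    (2). In MySQL, primary key, unique, and composite unique constraints are also indexes. Besides speeding up lookups, they enforce constraints.
    Ordinary index INDEX: speeds up lookups
    Unique indexes:
        - Primary key index PRIMARY KEY: speeds up lookups + constraint (not null, no duplicates)
        - Unique index UNIQUE: speeds up lookups + constraint (no duplicates)
    Composite indexes:
        - PRIMARY KEY(id,name): composite primary key index
        - UNIQUE(id,name): composite unique index
        - INDEX(id,name): composite ordinary index

    II. Index Data Structure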
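    1. Keep indexed columns as small as possible. The number of disk I/Os depends on the height h of the B+ tree.
    If the table holds N rows and each disk block holds m data items, then h = log_{m+1}(N).
    For a fixed N, the larger m is, the smaller h is. Since m = block size / data item size,
    and the block size (the size of one data page) is fixed, smaller data items
    mean more items per block and a shorter tree. This is why each data item, i.e. the indexed column, should be as small as possible:
    an int takes 4 bytes, half the 8 bytes of a bigint. It is also why a B+ tree keeps the actual row data in the leaf nodes rather than the inner nodes:
    storing it in inner nodes would sharply reduce the number of items per block and make the tree taller. When each block holds only one data item, the tree degenerates into a linear list.
    2. The leftmost-match property of indexes. When the data items of a B+ tree are composite, such as (name,age,sex),
    the tree is built by comparing the fields from left to right. For a lookup such as (Zhang San,20,F),
    the B+ tree compares name first to decide the search direction; if name is equal, it then compares age and sex in turn
    until it finds the data. For a lookup such as (20,F), which has no name, the B+ tree does not know which node to visit next,
    because name is the first comparison key used when the tree was built, so the search must start from name.
    For a lookup such as (Zhang San,F), the B+ tree can use name to direct the search, but because the next field, age, is missing,
    it can only find all entries whose name is Zhang San and then filter them for sex F. This very important property is the leftmost-match property of indexes.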
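To make the link between item size and tree height in point 1 concrete, here is a rough Python sketch of h = log_{m+1}(N). The 16 KB page size and the 8-byte per-entry pointer overhead are assumptions chosen for illustration, not figures from this article, and real storage engines add further per-page and per-record overhead.

```python
PAGE_SIZE = 16 * 1024   # assumed size of one disk block / data page, in bytes
POINTER_SIZE = 8        # assumed per-entry pointer overhead, in bytes


def items_per_page(key_size: int) -> int:
    """m = block size / data item size (key plus assumed pointer)."""
    return PAGE_SIZE // (key_size + POINTER_SIZE)


def tree_height(n_rows: int, m: int) -> int:
    """Smallest h with (m + 1) ** h >= n_rows, i.e. ceil(log_{m+1}(N))."""
    height = 0
    capacity = 1
    while capacity < n_rows:
        capacity *= m + 1
        height += 1
    return height


# Compare a 4-byte int key, an 8-byte bigint key, and a 255-byte string key.
for label, key_size in (("int", 4), ("bigint", 8), ("char(255)", 255)):
    m = items_per_page(key_size)
    print(label, "items per page:", m, "height for 10**9 rows:", tree_height(10**9, m))
```

Under these assumptions the wide string key needs a noticeably taller tree than the integer keys, which is the effect point 1 describes.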


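    III. Syntax for Creating and Dropping Indexes
    #Method 1: when creating the table
          CREATE TABLE table_name (
                    column1  data_type [constraints…],
                    column2  data_type [constraints…],
                    [UNIQUE | FULLTEXT | SPATIAL ]   INDEX | KEY
                    [index_name]  (column[(length)]  [ASC |DESC]) 
                    );
    
    
    #Method 2: CREATE INDEX on an existing table
            CREATE  [UNIQUE | FULLTEXT | SPATIAL ]  INDEX  index_name 
                         ON table_name (column[(length)]  [ASC |DESC]) ;
    
    
    #Method 3: ALTER TABLE on an existing table
            ALTER TABLE table_name ADD  [UNIQUE | FULLTEXT | SPATIAL ] INDEX
                                 index_name (column[(length)]  [ASC |DESC]) ;
                                 
    #Drop an index: DROP INDEX index_name ON table_name;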

    1 Creating indexes
    - At table creation time
    create table s1(
    id int,
    name char(6),
    age int,
    email varchar(30),
    index(id)
    );
    - After the table exists
    create index name on s1(name);#add an ordinary index
    create unique index age on s1(age);#add a unique index
    alter table s1 add primary key(id);#add a primary key index
    create index id_name on s1(id,name);#add a composite ordinary index

    2 Dropping indexes
    drop index id on s1;
    drop index name on s1;
    alter table s1 drop primary key;#drop the primary key index

    IV. Testing Indexes

    1. Preparation

    #1. Prepare the table
    create table s1(
    id int,
    name varchar(20),
    gender char(6),
    email varchar(50)
    );
    
    #2. Create a stored procedure to bulk-insert records
    delimiter $$   #use $$ as the statement terminator while defining the procedure
    create procedure auto_insert1()
    BEGIN
        declare i int default 1;
        while(i<300000)do
            insert into s1 values(i,concat('egon',i),'male',concat('egon',i,'@oldboy'));
            set i=i+1;
        end while;
    END$$ 
    delimiter ; #restore the semicolon as the terminator
    
    #3. View the stored procedure
    show create procedure auto_insert1\G 
    
    #4. Call the stored procedure
    call auto_insert1();

    2. Test query speed without an index

    #No index: every row is scanned from start to end, so the query is slow


        With an index added
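As a rough illustration of why the unindexed query is slow, the following Python sketch counts comparisons for a full scan versus a binary search over sorted ids. The sorted list is only a stand-in for an index; it is a simplification, not how MySQL stores a B+ tree.

```python
def full_scan(rows, target):
    """Check every row, as an unindexed query must; return (matches, comparisons)."""
    matches = 0
    comparisons = 0
    for value in rows:
        comparisons += 1
        if value == target:
            matches += 1
    return matches, comparisons


def indexed_lookup(sorted_ids, target):
    """Binary search over sorted ids; return (found, comparisons)."""
    lo, hi = 0, len(sorted_ids)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_ids[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_ids) and sorted_ids[lo] == target
    return found, comparisons


ids = list(range(1, 300000))  # the same id range the stored procedure inserts
print("full scan:", full_scan(ids, 1000))
print("indexed lookup:", indexed_lookup(ids, 1000))
```

The scan touches every one of the 299999 ids, while the binary search needs at most 19 comparisons for a list of this size.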

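    V. Using Indexes Correctly

    1. For an index to deliver the expected speedup, follow these principles when adding it:

    #1. Leftmost prefix matching, a very important principle.
    create index ix_name_email on s1(name,email);
    - Leftmost prefix matching: conditions must match the index columns from left to right
    select * from s1 where name='egon'; #can use the index
    select * from s1 where name='egon' and email='asdf'; #can use the index
    select * from s1 where email='alex@oldboy.com'; #cannot use the index
    MySQL keeps matching to the right until it meets a range condition (>, <, between, like) and then stops. For example, with a = 1 and b = 2 and c > 3 and d = 4, an index on (a,b,c,d) cannot use d, whereas an index on (a,b,d,c) can use all four columns; the order of a, b, and d can be rearranged freely.
    
    #2. = and in conditions can appear in any order. For example, with a = 1 and b = 2 and c = 3, an index on (a,b,c) works whatever order the conditions are written in; the MySQL query optimizer rewrites them into a form the index can use.
    
    #3. Prefer columns with high selectivity for indexes. Selectivity is count(distinct col)/count(*), the proportion of distinct values in the column. The higher the ratio, the fewer rows are scanned. A unique key has a selectivity of 1, while status or gender columns in a large table may have a selectivity close to 0. Is there a rule of thumb for this ratio? It depends on the use case and is hard to pin down, but columns used in joins are generally expected to be 0.1 or higher, i.e. about 10 rows scanned per lookup on average.
    
    #4. Indexed columns must not take part in computations; keep the column "clean". For example, from_unixtime(create_time) = '2014-05-29' cannot use the index. The reason is simple: the B+ tree stores the raw column values, so the function would have to be applied to every entry before comparing, which is clearly too expensive. Write the condition as create_time = unix_timestamp('2014-05-29') instead.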
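The selectivity formula in principle #3 can be tried out in a few lines of Python. The sample rows below are made up for illustration (they only mimic the shape of s1), and the 0.1 threshold is the rule of thumb quoted above, not a hard limit.

```python
def selectivity(values):
    """count(distinct col) / count(*): the proportion of distinct values."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical sample: 1000 rows with unique ids and a single shared gender value.
rows = [
    {"id": i, "gender": "male", "email": f"egon{i}@oldboy"}
    for i in range(1, 1001)
]

for col in ("id", "gender", "email"):
    ratio = selectivity(row[col] for row in rows)
    verdict = "good index candidate" if ratio >= 0.1 else "poor index candidate"
    print(f"{col}: {ratio:.3f} ({verdict})")
```

In SQL the same number comes from select count(distinct col)/count(*) from s1, which is worth running on real data before deciding on an index.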

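    2. Leftmost prefix demonstration

    #The transcripts below were recorded on a version of s1 with columns (id, name, age, email) in which every row has name='egon' and age=10

    1 Speeding up with an index: range queries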
    mysql> select count(*) from s1 where id=1000;
    +----------+
    | count(*) |
    +----------+
    |        1 |
    +----------+
    1 row in set (0.12 sec)
    
    mysql> select count(*) from s1 where id>1000;
    +----------+
    | count(*) |
    +----------+
    |   298999 |
    +----------+
    1 row in set (0.12 sec)
    
    mysql> create index a on s1(id)
        -> ;
    Query OK, 0 rows affected (3.21 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> select count(*) from s1 where id=1000;
    +----------+
    | count(*) |
    +----------+
    |        1 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql> select count(*) from s1 where id>1000;
    +----------+
    | count(*) |
    +----------+
    |   298999 |
    +----------+
    1 row in set (0.12 sec)
    
    mysql> select count(*) from s1 where id>1000 and id < 2000;
    +----------+
    | count(*) |
    +----------+
    |      999 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql> select count(*) from s1 where id>1000 and id < 300000;
    +----------+
    | count(*) |
    +----------+
    |   298999 |
    +----------+
    1 row in set (0.13 sec)
    
    
    
    3 Do not index low-selectivity columns
    mysql> select count(*) from s1 where name='xxx';
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql> select count(*) from s1 where name='egon';
    +----------+
    | count(*) |
    +----------+
    |   299999 |
    +----------+
    1 row in set (0.19 sec)
    
    
    mysql> select count(*) from s1 where name='egon' and age=123123123123123;
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.45 sec)
    
    mysql> create index c on s1(age);
    Query OK, 0 rows affected (3.03 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> select count(*) from s1 where name='egon' and age=123123123123123;
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql> select count(*) from s1 where name='egon' and age=10;
    +----------+
    | count(*) |
    +----------+
    |   299999 |
    +----------+
    1 row in set (0.35 sec)
    
    
    mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and id < 4000;
    +----------+
    | count(*) |
    +----------+
    |      999 |
    +----------+
    1 row in set (0.00 sec)
    
    
    mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.47 sec)
    
    mysql> create index d on s1(email);
    Query OK, 0 rows affected (4.83 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql> drop index a on s1;
    Query OK, 0 rows affected (0.10 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> drop index b on s1;
    Query OK, 0 rows affected (0.09 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> drop index c on s1;
    Query OK, 0 rows affected (0.09 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> desc s1;
    +-------+-------------+------+-----+---------+-------+
    | Field | Type        | Null | Key | Default | Extra |
    +-------+-------------+------+-----+---------+-------+
    | id    | int(11)     | NO   |     | NULL    |       |
    | name  | char(20)    | YES  |     | NULL    |       |
    | age   | int(11)     | YES  |     | NULL    |       |
    | email | varchar(30) | YES  | MUL | NULL    |       |
    +-------+-------------+------+-----+---------+-------+
    4 rows in set (0.00 sec)
    
    mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
    
    5 Add a composite index, placing the range-query column last
     select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
    index(name,email,age,id)
    
     select count(*) from s1 where name='egon' and age> 10 and id=3000 and email='xxxx';
    index(name,email,id,age)
    
     select count(*) from s1 where name like 'egon' and age= 10 and id=3000 and email='xxxx';
    index(email,id,age,name)
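The rule behind these three choices (equality columns first, the range column last) can be sketched as a small Python helper. It models only the simplified matching rule described in this article; the function names are hypothetical, and MySQL's real optimizer is considerably more sophisticated.

```python
def usable_columns(index_cols, predicates):
    """Return the index columns a query can use for seeking.

    predicates maps a column name to "eq" or "range".
    Matching walks the index from left to right, stops at the first column
    that has no predicate, and stops after the first range predicate.
    """
    used = []
    for col in index_cols:
        kind = predicates.get(col)
        if kind is None:
            break
        used.append(col)
        if kind == "range":
            break
    return used


def suggest_order(predicates):
    """Place equality columns first and range columns last."""
    equality = [col for col, kind in predicates.items() if kind == "eq"]
    ranges = [col for col, kind in predicates.items() if kind == "range"]
    return equality + ranges


# where name='egon' and age=10 and id>3000 and email='xxxx'
preds = {"name": "eq", "age": "eq", "id": "range", "email": "eq"}
print(usable_columns(["id", "name", "age", "email"], preds))
print(usable_columns(suggest_order(preds), preds))
```

With id first, matching stops right after the range condition on id; with the suggested order, all four columns take part in the seek.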
    
    
    mysql> desc s1;
    +-------+-------------+------+-----+---------+-------+
    | Field | Type        | Null | Key | Default | Extra |
    +-------+-------------+------+-----+---------+-------+
    | id    | int(11)     | NO   |     | NULL    |       |
    | name  | char(20)    | YES  |     | NULL    |       |
    | age   | int(11)     | YES  |     | NULL    |       |
    | email | varchar(30) | YES  |     | NULL    |       |
    +-------+-------------+------+-----+---------+-------+
    4 rows in set (0.00 sec)
    
    mysql> create index xxx on s1(age,email,name,id);
    Query OK, 0 rows affected (6.89 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
    
    6. Leftmost prefix matching
    index(id,age,email,name)
    #id must appear in the condition
    id
    id age
    id email
    id name
    
    email #cannot use the index
    mysql> select count(*) from s1 where id=3000;
    +----------+
    | count(*) |
    +----------+
    |        1 |
    +----------+
    1 row in set (0.11 sec)
    
    mysql> create index xxx on s1(id,name,age,email);
    Query OK, 0 rows affected (6.44 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql>  select count(*) from s1 where id=3000;
    +----------+
    | count(*) |
    +----------+
    |        1 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql>  select count(*) from s1 where name='egon';
    +----------+
    | count(*) |
    +----------+
    |   299999 |
    +----------+
    1 row in set (0.16 sec)
    
    mysql>  select count(*) from s1 where email='egon3333@oldboy.com';
    +----------+
    | count(*) |
    +----------+
    |        1 |
    +----------+
    1 row in set (0.15 sec)
    
    mysql>  select count(*) from s1 where id=1000 and email='egon3333@oldboy.com';
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
    
    mysql>  select count(*) from s1 where email='egon3333@oldboy.com' and id=3000;
    +----------+
    | count(*) |
    +----------+
    |        0 |
    +----------+
    1 row in set (0.00 sec)
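Why a condition on id can use a composite index that starts with id, while a condition on email alone cannot, can be sketched in Python with a sorted list of tuples standing in for the index. The two-column index and the sample rows are invented for illustration.

```python
import bisect

# Entries of a hypothetical composite index on (id, email), kept in sorted order.
index = sorted((i, f"egon{i}@oldboy.com") for i in range(1, 10001))


def seek_by_id(target_id):
    """Leading column known: binary-search straight to the matching entries."""
    lo = bisect.bisect_left(index, (target_id,))
    hi = bisect.bisect_left(index, (target_id + 1,))
    return index[lo:hi]


def scan_by_email(target_email):
    """Leading column unknown: entries are not ordered by email, so check them all."""
    return [entry for entry in index if entry[1] == target_email]


print(seek_by_id(3000))
print(scan_by_email("egon3333@oldboy.com"))
```

Both calls return the right rows, but only the first can jump to its answer; the second has to visit every entry, which matches the 0.00 sec versus 0.15 sec timings in the transcript.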
    
    
    
    
    
    
    
    7. Indexed columns must not take part in computations; keep the column "clean"
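The reasoning behind this rule can be sketched in Python. With the column left bare, the constant is converted once and the sorted index can be binary-searched; with a function wrapped around the column, every stored value has to be converted before it can be compared. The sorted list stands in for an index on create_time, UTC is assumed for simplicity (MySQL's from_unixtime and unix_timestamp use the session time zone), and the sample timestamps are invented.

```python
import bisect
from datetime import datetime, timezone

# Hypothetical index on create_time: sorted Unix timestamps, one per hour for 3 days.
start = int(datetime(2014, 5, 28, tzinfo=timezone.utc).timestamp())
create_times = [start + 3600 * k for k in range(72)]


def function_on_column(date_text):
    """from_unixtime(create_time) = date: convert every stored value, then compare."""
    conversions = 0
    hits = []
    for ts in create_times:
        conversions += 1
        as_text = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        if as_text == date_text + " 00:00:00":
            hits.append(ts)
    return hits, conversions


def function_on_constant(date_text):
    """create_time = unix_timestamp(date): convert the constant once, then seek."""
    parsed = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    target = int(parsed.timestamp())
    pos = bisect.bisect_left(create_times, target)
    found = pos < len(create_times) and create_times[pos] == target
    hits = [create_times[pos]] if found else []
    return hits, 1


print(function_on_column("2014-05-29"))
print(function_on_constant("2014-05-29"))
```

Both forms find the same row, but the first performs one conversion per stored value while the second performs exactly one.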
    

    Other considerations

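    - Avoid select *
    - Use count(1) or count(column) in place of count(*) (in InnoDB, count(1) and count(*) are handled the same way, and count(column) skips NULLs, so its result can differ)
    - When creating tables, prefer char over varchar where possible
    - Put fixed-length columns first in the table's column order
    - Use a composite index instead of several single-column indexes (when queries often filter on multiple conditions)
    - Keep indexes as short as possible
    - Use joins (JOIN) instead of subqueries (Sub-Queries)
    - When joining tables, make sure the join condition types match
    - Columns with low selectivity (many repeated values) are poor candidates for an index, e.g. gender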
    
    
    
  • Original article: https://www.cnblogs.com/mengqingjian/p/7511894.html