zoukankan      html  css  js  c++  java
  • 使用mysql5.7新特性解决前通配符查询性能问题

      众所周知,在mysql里的后通配符可以使用索引查找,前通配查询却无法使用到索引,即使是使用到了索引,也是使用了索引全扫描,效率依然不高,再MySQL5.7之前,一直都没有好的办法解决,但是到了MySQL5.7,自从有了虚拟列,这个问题就好办多了,能够已空间换时间。

      创建测试表

    root@localhost [zeno]>show create table test_userG ;
    *************************** 1. row ***************************
           Table: test_user
    Create Table: CREATE TABLE `test_user` (
      `uid` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(32) DEFAULT NULL,
      `add_time` datetime DEFAULT NULL,
      PRIMARY KEY (`uid`),
      KEY `ix_name` (`name`)
    ) ENGINE=InnoDB AUTO_INCREMENT=6037060 DEFAULT CHARSET=utf8
    1 row in set (0.00 sec)
    
    ERROR: 
    No query specified

      使用python插入测试数据

    #!/usr/bin/python
    import string
    import random
    import MySQLdb
    import time
    
    conn = MySQLdb.connect(host='IPAddr',
                           port=3306,
                           user='zeno',
                           passwd='zeno',
                           db='zeno')
    
    def insert(para):
        i = 11
        while True:
            r_name = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(10, 30)))
            print r_name
           
            cursor = conn.cursor()
            cursor.execute("INSERT INTO test_user (name,add_time) VALUES ('%s', now())" % str(r_name))
            i = i + 1
            conn.commit()
            #time.sleep(0.1)
            print i  
            
    insert(conn)

      查看插入的数据量

    root@localhost [zeno]>show table status like 'test_user'G ;
    *************************** 1. row ***************************
               Name: test_user
             Engine: InnoDB
            Version: 10
         Row_format: Dynamic
               Rows: 6002441
     Avg_row_length: 51
        Data_length: 310165504
    Max_data_length: 0
       Index_length: 0
          Data_free: 5242880
     Auto_increment: 6037060
        Create_time: 2017-11-23 16:25:15
        Update_time: 2017-11-23 16:23:29
         Check_time: NULL
          Collation: utf8_general_ci
           Checksum: NULL
     Create_options: 
            Comment: 
    1 row in set (0.00 sec)
    
    ERROR: 
    No query specified
    
    root@localhost [zeno]>select * from test_user limit 10 ; 
    +-----+-------------------------------+---------------------+
    | uid | name                          | add_time            |
    +-----+-------------------------------+---------------------+
    |   1 | U0WUJ3JJ81IRP27BSA4471        | 2017-11-23 15:37:49 |
    |   2 | SOLYNM9Q9A5Y94YG              | 2017-11-23 15:37:49 |
    |   3 | ONNU5PPKXC3GBR                | 2017-11-23 15:37:49 |
    |   4 | WVC6GOJ29C                    | 2017-11-23 15:37:49 |
    |   5 | Z653X99ZZI                    | 2017-11-23 15:37:49 |
    |   6 | YP92P02DIKQ8O66K              | 2017-11-23 15:37:49 |
    |   7 | 2X3G6H8849SDP                 | 2017-11-23 15:37:49 |
    |   8 | 9N9F668XQMTRQSCNE0FWJBMMJEFC0 | 2017-11-23 15:37:50 |
    |   9 | 15XAHWZ1IJBP6P4EKCH           | 2017-11-23 15:37:50 |
    |  10 | VHQJQGQC7U                    | 2017-11-23 15:37:50 |
    +-----+-------------------------------+---------------------+
    10 rows in set (0.00 sec)

      开始测试

      一、验证查询条件中使用后通配符的情况

    root@localhost [zeno]>select * from test_user where name like '9N9F668XQ%' ;
    +-----+-------------------------------+---------------------+
    | uid | name                          | add_time            |
    +-----+-------------------------------+---------------------+
    |   8 | 9N9F668XQMTRQSCNE0FWJBMMJEFC0 | 2017-11-23 15:37:50 |
    +-----+-------------------------------+---------------------+
    1 row in set (0.00 sec)
    
    root@localhost [zeno]>explain select * from test_user where name like '9N9F668XQ%' ;
    +----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
    | id | select_type | table     | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                 |
    +----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
    |  1 | SIMPLE      | test_user | NULL       | range | ix_name       | ix_name | 99      | NULL |    1 |   100.00 | Using index condition |
    +----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
    1 row in set, 1 warning (0.00 sec)

      600W的数据,执行时间0.00sec,已经是毫秒级查询了

      从执行计划中可以看出,type=range, key = 'ix_name',证明是对索引ix_name进行了范围查找,所以,能很快地得到结果

      二、验证查询条件中使用前通配符的情况

    root@localhost [zeno]>select * from test_user where name like '%WJBMMJEFC0' ;
    +-----+-------------------------------+---------------------+
    | uid | name                          | add_time            |
    +-----+-------------------------------+---------------------+
    |   8 | 9N9F668XQMTRQSCNE0FWJBMMJEFC0 | 2017-11-23 15:37:50 |
    +-----+-------------------------------+---------------------+
    1 row in set (3.84 sec)
    
    root@localhost [zeno]>explain select * from test_user where name like '%WJBMMJEFC0' ;
    +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
    | id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
    +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
    |  1 | SIMPLE      | test_user | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 6002441 |    11.11 | Using where |
    +----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
    1 row in set, 1 warning (0.00 sec)

      600万的数据,运行了3.84sec,速度非常慢

      从执行计划中type=‘ALL’可以看出是进行了全表扫描,扫描完之后,再根据where条件找出合适的数据

      在MySQL5.7之前,对于这种条件中使用了前通配符的查询,几乎就是束手无策,但是,MySQL5.7中增加了一项新功能,可以用较小的代价实现快速查询

      创建虚拟列

    root@localhost [zeno]>alter table test_user add r_name varchar(32) generated always as (reverse(`name`));
    Query OK, 0 rows affected (0.44 sec)
    Records: 0  Duplicates: 0  Warnings: 0

      在虚拟列上创建索引(跟一般创建索引无异)

    root@localhost [zeno]>create index ix_r_name on test_user(r_name) ; 
    Query OK, 0 rows affected (41.90 sec)
    Records: 0  Duplicates: 0  Warnings: 0

      问题来了,已经创建了虚拟列,也创建了所以,怎么实现对前通配符的快速查询呢? 

      先用一个简短的数字来说明一下思路:假设要查询的列的最终值为‘0123456789’,前通配查询的时候,条件是 name like '%6789',但是已经创建了虚拟列,虚拟列的效果是把原来的数据反转,也就是变成了‘9876543210’,那么,查询的条件变成了name like '9876%',但是,不可能是每次都要自己计算一下,把'6789'换成‘9876’

      因此,在查询的时候,还要取巧的一步,条件中再次把输入的值反转,结果如下

    root@localhost [zeno]>select * from test_user where r_name like concat(reverse('WJBMMJEFC0'),'%');
    +-----+-------------------------------+---------------------+-------------------------------+
    | uid | name                          | add_time            | r_name                        |
    +-----+-------------------------------+---------------------+-------------------------------+
    |   8 | 9N9F668XQMTRQSCNE0FWJBMMJEFC0 | 2017-11-23 15:37:50 | 0CFEJMMBJWF0ENCSQRTMQX866F9N9 |
    +-----+-------------------------------+---------------------+-------------------------------+
    1 row in set (0.00 sec)
    
    root@localhost [zeno]>explain select * from test_user where r_name like concat(reverse('WJBMMJEFC0'),'%');
    +----+-------------+-----------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
    | id | select_type | table     | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra       |
    +----+-------------+-----------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
    |  1 | SIMPLE      | test_user | NULL       | range | ix_r_name     | ix_r_name | 99      | NULL |    1 |   100.00 | Using where |
    +----+-------------+-----------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
    1 row in set, 1 warning (0.00 sec)

      从执行结果来看,效果已经达到了,600W的数据也只是执行了0.00sec

      三、在条件中同时使用了前通配符和后通配符的情况,暂时没有好的解决办法

    参考文档: 

      MySQL官方介绍虚拟列:https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html

      以上,如有错谬,请不吝指正。 

      原创作品,如需转载,请标明出处,谢谢~

  • 相关阅读:
    dom4j解析带命名空间的xml文件
    Spring使用facotry-method创建单例Bean总结<转>
    代码审查工具StyleCop
    ReSharper 配置及用法(二)
    ReSharper 配置及用法(一)
    linqPad快速学习LINQ(含视频)
    评估期已过。有关如何升级的测试版软件的信息
    SQL批量更新数据库中所有用户数据表中字段类型为tinyint为int
    SQL SERVER获取数据库中所有表名 XTYPE类型
    sqlserver中创建链接服务器
  • 原文地址:https://www.cnblogs.com/cnzeno/p/7885894.html
Copyright © 2011-2022 走看看