说起mysql查询效率问题,就无法绕开索引问题,而innodb索引是mysql存储引擎中的重中之重,innodb索引包括“主键索引(聚集索引)”也就是实际数据和主键数据存储在一起的索引。innodb除了主键索引以外就是二级索引,二级索引叶子结点的数据区存储的是主键,mysql每个数据页面有16k大小,innodb在执行插入数据后就已经根据唯一键将数据页排好顺序,索引结构暂时不做过多解释。
那我们新建二级索引的时可以根据“近1”原则创建,创建是参考sql语句:select COUNT(DISTINCT(CONCAT(date,name)))/count(*) ;
例如,想要建立两个字段的联合索引,而第二个字段你可以选择使用前缀的创建方式,也就是第二个字段只使用部分数据。
比如查询优化中我们要遵循最佳做前缀原则,这个时候我们很可能想根据后缀去查询,我们可以将数据倒置后,存储冗余一列在按照左匹配模糊查询。
如网址列查询我们可以存储为:moc.udiab.www 这种形式。
按照创建索引涉及的列去重后的总数和数据总数做对比,结果越趋近于1,则索引查询效果越好。因为越趋近于0说明数据差距不大,所扫描的数据代价越高。
=========JOIN=========
先说下join吧,两个表的join操作,实质是将一张表的记录放入另一张表进行匹配过滤,mysql有join_buffer_size设置,这个设置会决定每次从驱动表中拿出多少条数据去另一张表做过滤,如果内存允许的话,尽量将其调大(默认256KB)这样会减少IO开销,每次拿驱动表中多的数据去和另一张表的数据作匹配,sql优化器会试图将两个表的操作先优化为内连接也就是inner join,这样做可以使用驱动表优化策略,会将小表设置为驱动表。
========子查询========
子查询的性能也是关键,子查询可以分为相关子查询和独立子查询;每种查询方式又分为标量子查询(返回单条记录)和结果集子查询。
(1)如:select * from td1 where name=(select a from td2 where td2.b=td1.c limit 1) 相关子查询中的标量子查询;
其实相关标量子查询,如果子查询内能够使用索引速度也是基本没什么问题的,因为它会在外层查询中先取出一条数据,然后去子查询中做唯一匹配,虽然稍复杂些,但是性能损耗可以接受
(2)如:select * from td1 where name in (select a from td2) 独立子查询;
ps:如果是上面这种则相对复杂了很多,如果子查询的返回结果数量没有超过mysql服务的预设值(tmp_table_size),则会以memory引擎创建一个td2 临时表并创建Hash索引,因为td1中是以等值方式去td2中匹配的;超过了就会创建磁盘物理表并建立B+Tree索引。
《2.1》针对于in子查询,查询优化器会将in优化为join,那么转换为join的情况则不同,独立子查询如果子查询返回数据库唯一键,则直接可转成join。
《2.2》如果子查询返回非唯一键,则会使用物化表(按照ps的规则),将结果集先去重,然后在进行join。
(3)如:select * from td1 where name in (select a from td2 where td1.a=td2.b) 相关子查询中使用了主键;
相关子查询去重后在进行semi join,使用以下命令选择开启:
semijoin={on|off} materialization={on|off} loosescan={on|off} subquery_materialization_cost_based={on|off}
4种情况:
《3.1》如果子查询中select返回是数据库唯一键 ,情况也是直接转换为semi join。
《3.2》mysql会创建一个临时表,每次将主表数据取出后在子查询表中查找数据,查找到以后插入临时表,该临时表中不会有重复的数据,当数据插入完毕以后,在与临时表做semi join。
《3.3》使用FirstMatch方法在子查询表中返回一条语句后,即返回,不在向后继续寻找。
《3.4》使用LooseScan,依赖本身的索引,进行group by进行去重后,在semi join
==================
mysql查询部分包括,(1)sql执行器,(2)sql优化器。
sql优化的执行过程可以先开启sql优化日志后(SET optimizer_trace='enabled=on';),到information_schema.OPTIMIZER_TRACE这个表中进行优化过程的查询,结合查询计划就能够看到mysql究竟是怎样执行最终的sql的。
查询优化日志中,我们可以主要观察where条件优化,join优化,和analyzing_range_alternatives查询成本优化。
这里我们可以拿analysisi_query_cost查询成本优化为例,row查询默认开销因子为0.2,主键索引数据页查询因子为1,按照公式=主键索引数据页*1+行数*0.2,最终获得查询成本cost的值,优化器会根据cost值,取最小的执行查询。
{ "steps": [ { "join_preparation": { "select#": 1, "steps": [ { "expanded_query": "/* select#1 */ select `t_day_stock`.`STOCK` AS `STOCK`,`t_day_stock`.`DATE` AS `DATE`,`t_day_stock`.`OPEN` AS `OPEN`,`t_day_stock`.`CLOSE` AS `CLOSE`,`t_day_stock`.`HIGH` AS `HIGH`,`t_day_stock`.`LOW` AS `LOW`,`t_day_stock`.`VOLUME` AS `VOLUME`,`t_day_stock`.`VALUE` AS `VALUE`,`t_day_stock`.`NUMBER_OF_TRADES` AS `NUMBER_OF_TRADES`,`t_day_stock`.`PRECLOSE` AS `PRECLOSE` from `t_day_stock` where ((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))" } ] } }, { "join_optimization": { "select#": 1, "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))" }, { "transformation": "constant_propagation", "resulting_condition": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))" }, { "transformation": "trivial_condition_removal", "resulting_condition": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))" } ] } }, { "table_dependencies": [ { "table": "`t_day_stock`", "row_may_be_null": false, "map_bit": 0, "depends_on_map_bits": [ ] } ] }, { "ref_optimizer_key_uses": [ { "table": "`t_day_stock`", "field": "STOCK", "equals": "'600228.SH'", "null_rejecting": false }, { "table": "`t_day_stock`", "field": "STOCK", "equals": "'600228.SH'", "null_rejecting": false }, { "table": "`t_day_stock`", "field": "STOCK", "equals": "'600228.SH'", "null_rejecting": false } ] }, { "rows_estimation": [ { "table": "`t_day_stock`", "range_analysis": { "table_scan": { "rows": 425496, "cost": 98608 }, "potential_range_indices": [ { "index": "PRIMARY", "usable": true, "key_parts": [ "STOCK", "DATE" ] }, { "index": "stock_date", "usable": true, "key_parts": [ "STOCK", "DATE" ] }, { "index": "stock", "usable": true, "key_parts": [ "STOCK" ] }, { "index": "date", "usable": true, "key_parts": [ "DATE" ] } ], "setup_range_conditions": [ ], "group_index_range": { "chosen": false, "cause": "not_group_by_or_distinct" }, "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "PRIMARY", "ranges": [ "600228.SH <= STOCK <= 600228.SH AND DATE < 2004-01-12 00:00:00" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 3, "cost": 4.61, "chosen": true }, { "index": "stock_date", "ranges": [ "600228.SH <= STOCK <= 600228.SH AND DATE < 2004-01-12 00:00:00" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 5, "cost": 7.01, "chosen": false, "cause": "cost" }, { "index": "stock", "ranges": [ "600228.SH <= STOCK <= 600228.SH" ], "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": false, "rows": 468, "cost": 562.61, "chosen": false, "cause": "cost" }, { "index": "date", "ranges": [ "DATE < 2004-01-12 00:00:00" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 7450, "cost": 8941, "chosen": false, "cause": "cost" } ], "analyzing_roworder_intersect": { "usable": false, "cause": "too_few_roworder_scans" } }, "chosen_range_access_summary": { "range_access_plan": { "type": "range_scan", "index": "PRIMARY", "rows": 3, "ranges": [ "600228.SH <= STOCK <= 600228.SH AND DATE < 2004-01-12 00:00:00" ] }, "rows_for_plan": 3, "cost_for_plan": 4.61, "chosen": true } } } ] }, { "considered_execution_plans": [ { "plan_prefix": [ ], "table": "`t_day_stock`", "best_access_path": { "considered_access_paths": [ { "access_type": "ref", "index": "PRIMARY", "rows": 4254, "cost": 5105.8, "chosen": true }, { "access_type": "ref", "index": "stock_date", "rows": 310, "cost": 372, "chosen": true }, { "access_type": "ref", "index": "stock", "rows": 468, "cost": 561.6, "chosen": false }, { "access_type": "range", "rows": 3, "cost": 5.21, "chosen": true } ] }, "cost_for_plan": 5.21, "rows_for_plan": 3, "chosen": true } ] }, { "attaching_conditions_to_tables": { "original_condition": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))", "attached_conditions_computation": [ ], "attached_conditions_summary": [ { "table": "`t_day_stock`", "attached": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))" } ] } }, { "refine_plan": [ { "table": "`t_day_stock`", "pushed_index_condition": "((`t_day_stock`.`STOCK` = '600228.SH') and (`t_day_stock`.`DATE` < '2004-01-12'))", "table_condition_attached": null, "access_type": "range" } ] } ] } }, { "join_explain": { "select#": 1, "steps": [ ] } } ] }