zoukankan html css js c++ java

有索引，为什么Oracle还是选择全表扫描

一、概述

sql语句执行慢，大部分情况下建个索引就快了，但有些时候索引好像不起作用，这是什么原因导致的呢？结合日常经验，我总结了以下索引用不到的情况。

返回的行数比例大
不等于，not in，is null
列上有运算
列上有函数
隐式转换
列在组合索引的中间或右边
统计信息不正确
索引失效

二、环境准备

oracle 11g数据库

create table test as select * from dba_objects;
update test set generated = null where object_id < 20;
alter table test add str_id varchar2(10);
update test set str_id = object_id;
create index idx_1 on test(object_id);
create index idx_2 on test(temporary);
create index idx_3 on test(generated);
create index idx_4 on test(created);
create index idx_5 on test(str_id);
create index idx_6 on test(object_type,data_object_id);
create index idx_7 on test(namespace);

begin
dbms_stats.gather_table_stats(ownname => 'SCOTT',
tabname => 'TEST',
estimate_percent => dbms_stats.auto_sample_size,
cascade => true,  
method_opt => 'for all columns size auto',  
no_invalidate => false);
end;
/

begin
dbms_stats.gather_table_stats(ownname => 'SCOTT',
tabname => 'TEST',
estimate_percent => dbms_stats.auto_sample_size,
cascade => true,  
method_opt => 'for columns temporary size 2, namespace size 1',  
no_invalidate => false);
end;
/

三、开始实验

1) 返回的行数比例大

返回行数多的不走索引

a. select * from test where object_id > 20000;  // 索引
b. select * from test where object_id > 20;     // 全表

select count(*) from test; // 13544行
select count(*) from test where object_id > 20; // 13525行
select count(*) from test where object_id > 20000;  // 0行

返回的行数除以表的总行数的值越小，表示列的选择率越高，使用索引的概率就越大

2) 不等于，not in，is null

test表的temporary列只有两个值（'Y','N‘），以下两条sql返回的结果一样，<>不走索引

a. select * from test where temporary = 'Y';   // 索引
b. select * from test where temporary <> 'N';  // 全表

以下两条sql返回的结果一样，not in不走索引

c. select * from test where temporary in ('Y');      // 索引
d. select * from test where temporary not in ('N');  // 全表

generated上面有索引，而且返回出来的行也很少，但是is null不走索引

select count(*) from test where generated is null;  // 18行
e. select * from test where generated is null;  // 全表

3) 列上有运算

以下两条sql返回的结果一样，但列上有运算的不走索引

a. select * from test where object_id + 1 > 20000;   // 全表 
b. select * from test where object_id  > 20000 - 1;  // 索引

4) 列上有函数

以下两条sql执行结果一样，但列上有函数的不走索引

a. select * from test where to_char(CREATED, 'yyyy-mm-dd hh24:mi:ss') = '2021-04-01 22:24:03';  // 全表
b. select * from test where CREATED = to_date('2021-04-01 22:24:03', 'yyyy-mm-dd hh24:mi:ss');  // 索引

5) 隐式转换

以下两条sql结果一样，发生隐式转换的不走索引

a. select * from test where str_id = '3';   // 索引
b. select * from test where str_id = 3;     // 全表

当比较一个字符型和数值型的值时，ORACLE会把字符型的值隐式转换为数值型，执行计划里面显示filter(TO_NUMBER("STR_ID")=3)，对应就是这个列上自动加了to_number函数

如果列是数字类型呢，以下两个sql都会走索引，这是因为'3'会变成3，没有发生列的隐式转换

c. select * from test where object_id = '3';  // 索引
d. select * from test where object_id = 3;    // 索引

平常在编写sql时需要留意字段的类型，避免隐式转换。

6) 列在组合索引的中间或右边

data_object_id上有索引，而且返回的行数也很少，但是由于该列值不在索引的左边，所以无法走索引

a. select count(*) from test where data_object_id = 3;  // 1行
b. select * from test where data_object_id = 3;  // 全表

7) 统计信息不正确

select count(*), namespace from test group by namespace;

  COUNT(*)  NAMESPACE
---------- ----------
     10983          1
        11         51
       683          2
         3         21
      1575          4
        10          5
        35         24
       190          8
         9         38
        13         23
         1         64
         6          3
         1         36
        21         10
         3          9

当查询条件namspace=1时，返回10983行，以下sql明显应该走全表扫描，但是却误走了索引

a. select * from test where namespace = 1;  // 索引

重新收集列的统计信息

begin
dbms_stats.gather_table_stats(ownname => 'SCOTT',
tabname => 'TEST',
estimate_percent => dbms_stats.auto_sample_size,
cascade => true,  
method_opt => 'for columns namespace size auto',  
no_invalidate => false);
end;
/

现在执行计划就正确了，走全表，因为返回的行数多

b. select * from test where namespace = 1;  // 全表

以下是查统计信息sql

表 select table_name,num_rows,blocks,partitioned,tablespace_name,last_analyzed from user_tables where table_name=trim(upper('&table_name'));
列 select table_name,column_name,num_distinct,num_nulls,num_buckets,histogram,last_analyzed from user_tab_col_statistics where table_name=trim(upper('&table_name')) order by 1,2,3;

8) 索引无效

索引有时候会因为数据库的操作而失效，例如移动表

alter table test move;

以下sql应走索引，却走了全表

a. select * from test where object_id = 3;  // 全表

查看索引状态都是UNUSABLE

select a.*,b.status,b.last_analyzed from
(select table_name,index_name,listagg(column_name,',') within group(order by column_position) as columns from user_ind_columns group by table_name,index_name) a,
(select table_name,index_name,status,last_analyzed from user_indexes) b
where a.table_name=b.table_name and a.index_name=b.index_name and a.table_name=trim(upper('&table_name')) order by 1,2,3;

重建索引

alter index idx_1 rebuild online;

再次执行sql，执行计划就正确了，走索引

b. select * from test where object_id = 3;  // 索引

四、总结

我们在编写sql的时候，首要考虑sql执行的准确性，其次也需要考虑sql执行的效率。
并不是索引越多越好，使用索引也并不一定会加快sql的查询效率，这些都需要我们去不断学习。

查看全文

相关阅读:
Tensorflow2.0基础
 Tensorflow2.0多层感知机实现mnist手写数字识别
 numpy数组的维度操作和axis的对应关系
 jupyter notebook使用
 darknet批量测试并保存图片
 darknet训练自身数据集的小问题
 PIL批量更改图片格式及bat/cmd文件批量修改文件后缀名
 cv::Mat用法
 VS配置opencv、cuda及调用yolo动态链接库
 VS之 32 or 64

原文地址：https://www.cnblogs.com/ddzj01/p/14901047.html