zoukankan      html  css  js  c++  java
  • 组合索引怎么应该怎么选取引导列?

     
    有这样一个SQL
    select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
    id列选择性很高,owner选择性很低
    要优化它很简单,只需要在t1表上建立一个组合索引(owner,id),在t2表上建立一个索引(id)
    现在要讨论的是我们应该怎么建立组合索引,哪一列(owner,id)应该放在最前面?
    
    现在来做个实验
    
    SQL> desc t1
    Name        Type          Nullable Default Comments 
    ----------- ------------- -------- ------- -------- 
    ID          NUMBER        Y                         
    OBJECT_NAME VARCHAR2(128) Y                         
    OWNER       VARCHAR2(30)  Y                         
    
    SQL> desc t2
    Name      Type        Nullable Default Comments 
    --------- ----------- -------- ------- -------- 
    ID        NUMBER      Y                         
    STATUS    VARCHAR2(7) Y                         
    TEMPORARY VARCHAR2(1) Y    
    
    
    SQL> create index inx_id on t2(id);
    
    Index created.
    
    SQL> create index inx_id_owner on t1(id,owner);
    
    Index created.
    
    SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
    
    Elapsed: 00:00:00.02
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2432674005
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |              |     1 |    16 |    88   (2)| 00:00:02 |
    |   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          |
    |*  2 |   HASH JOIN            |              |  2416 | 38656 |    88   (2)| 00:00:02 |
    |*  3 |    INDEX FAST FULL SCAN| INX_ID_OWNER |  2416 | 26576 |    50   (0)| 00:00:01 |
    |   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T1"."ID"="T2"."ID")
       3 - filter("T1"."OWNER"='SCOTT')
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
            392  consistent gets
              0  physical reads
              0  redo size
            422  bytes sent via SQL*Net to client
            420  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
                          
    SQL> create index inx_owner_id on t1(owner,id);
    
    Index created.
    
    SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
    
    Elapsed: 00:00:00.03
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 277464349
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |              |     1 |    16 |    47   (3)| 00:00:01 |
    |   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          |
    |*  2 |   HASH JOIN            |              |  2416 | 38656 |    47   (3)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN    | INX_OWNER_ID |  2416 | 26576 |     9   (0)| 00:00:01 |
    |   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T1"."ID"="T2"."ID")
       3 - access("T1"."OWNER"='SCOTT')
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
            169  consistent gets
              0  physical reads
              0  redo size
            422  bytes sent via SQL*Net to client
            420  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
    
    如果OWNER作为引导列,逻辑读由以前的392变成现在的169,并且由以前的 index fast full scan 变成index range scan
    
    如果强制指定走索引 inx_id_owner 
              
    SQL>  select /*+ index(t1 inx_id_owner) */ count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
    
    Elapsed: 00:00:00.03
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3161475902
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |              |     1 |    16 |   259   (1)| 00:00:04 |
    |   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          |
    |*  2 |   HASH JOIN            |              |  2416 | 38656 |   259   (1)| 00:00:04 |
    |*  3 |    INDEX FULL SCAN     | INX_ID_OWNER |  2416 | 26576 |   221   (1)| 00:00:03 |
    |   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T1"."ID"="T2"."ID")
       3 - access("T1"."OWNER"='SCOTT')
           filter("T1"."OWNER"='SCOTT')
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
            387  consistent gets
              0  physical reads
              0  redo size
            422  bytes sent via SQL*Net to client
            420  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
    
    依然要387个逻辑读。
    
    为什么要以owner为引导列?因为ID是join列,并且ID列上面没过滤条件,如果以ID列作为引导列,由于没过滤条件
    那么CBO只能走 index full scan,或者index fast full scan,因为引导列没过滤条件,走不了index range scan,
    最多走index skip scan,不过index skip scan代价过高,因为index skip scan要求 引导列选择性很低,但是ID这里选择性很高
    
    SQL> select /*+ index_ss(t1 inx_id_owner) */ count(*) from t1,t2 where t1.id=t2.id and t1.owner='SYS';
    
    Elapsed: 00:00:00.10
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3493079762
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |              |     1 |    16 | 72529   (1)| 00:14:31 |
    |   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          |
    |*  2 |   HASH JOIN            |              |  2416 | 38656 | 72529   (1)| 00:14:31 |
    |*  3 |    INDEX SKIP SCAN     | INX_ID_OWNER |  2416 | 26576 | 72491   (1)| 00:14:30 |
    |   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T1"."ID"="T2"."ID")
       3 - access("T1"."OWNER"='SYS')
           filter("T1"."OWNER"='SYS')
    
    
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
            387  consistent gets
              0  physical reads
              0  redo size
            424  bytes sent via SQL*Net to client
            420  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
    
    如果owner列作为引导列,那么优化器就可以选择index range scan,这样相比index full scan, index fast full scan
    肯定要少扫描很多leaf block,逻辑读就会相对较少。
    
    其实到这里,是否可以总结一下建立组合索引的原则呢?
    引导列要选择过滤条件的列作为引导列,比如 where a.xxx='xxx' 或者 a.xxx> 或者 a.xxx<
    引导列的选择性越高越好,因为选择性越高,扫描的leaf block就越少,效率就越高
    尽量把join列放到组合索引最后面
    
    这里选择以owner列作为引导列,由于owner选择性很低,所以测试索引压缩对于性能的提升
    
    SQL> analyze  index inx_owner_id validate structure;
    
    Index analyzed.
    
    SQL> select height,
      2         blocks,
      3         lf_blks,
      4         br_blks,
      5         OPT_CMPR_COUNT,
      6         OPT_CMPR_PCTSAVE
      7    from index_stats
      8   where name = 'INX_OWNER_ID';
    
        HEIGHT     BLOCKS    LF_BLKS    BR_BLKS OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
    ---------- ---------- ---------- ---------- -------------- ----------------
             2        256        219          1              1               26
     
    SQL> drop index inx_owner_id;
    
    Index dropped
    
    SQL> create index inx_owner_id on t1(owner,id) compress 1;
    
    Index created
    
    SQL> analyze  index inx_owner_id validate structure;
    
    Index analyzed
    
    SQL> 
    SQL> select height,
      2         blocks,
      3         lf_blks,
      4         br_blks,
      5         OPT_CMPR_COUNT,
      6         OPT_CMPR_PCTSAVE
      7    from index_stats
      8   where name = 'INX_OWNER_ID';
    
        HEIGHT     BLOCKS    LF_BLKS    BR_BLKS OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
    ---------- ---------- ---------- ---------- -------------- ----------------
             2        256        161          1              1                0
             
    索引压缩之后,Leaf block 由原来的219降低到161个,节约了58个block 现在再来看一看执行计划+统计信息
    
    SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
    
    Elapsed: 00:00:00.03
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 277464349
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |              |     1 |    16 |    45   (3)| 00:00:01 |
    |   1 |  SORT AGGREGATE        |              |     1 |    16 |            |          |
    |*  2 |   HASH JOIN            |              |  2416 | 38656 |    45   (3)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN    | INX_OWNER_ID |  2416 | 26576 |     7   (0)| 00:00:01 |
    |   4 |    INDEX FAST FULL SCAN| INX_ID       | 72475 |   353K|    37   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T1"."ID"="T2"."ID")
       3 - access("T1"."OWNER"='SCOTT')
    
    
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
            169  consistent gets
              0  physical reads
              0  redo size
            422  bytes sent via SQL*Net to client
            420  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
    
    由此可见,索引压缩之后,逻辑读并没有下降,还是169,但是索引的leaf blcok显著减少了,这样减少了存储空间,能降低物理IO      

  • 相关阅读:
    [Jenkins] 批量删除构建历史
    [Android Tips] 32. 解决 Android Device Monitor 在 Mac OS X 卡住
    [Gradle] 查看项目依赖
    [Gradle] 获取 gradle 命令行参数
    [Android Tips] 31.如何将第三库引入的 Permission 删除掉
    [Gradle] 发布构件到本地仓库
    [Android Tips] 30.如何在 Android Studio 中一次性格式化所有代码
    下载安卓应用的历史版本
    设置int、float型数据的输出格式
    头文件limits—各个类型的数据的范围
  • 原文地址:https://www.cnblogs.com/hehe520/p/6330568.html
Copyright © 2011-2022 走看看