Understanding Access Paths for the Query Optimizer


Reference: Oracle Database Performance Tuning Guide, 10g Release 2 (10.2), PDF page 265

Reference URLs: http://www.dbaroad.me/?p=4862

  http://www.dbaroad.me/archives/2009/06/index_full_scan.html (A brief look at common index scans, part 3: INDEX FULL SCAN)

  http://wenku.baidu.com/view/4bce2240be1e650e52ea99c9.html (Oracle optimization: official English documentation)

  http://blog.csdn.net/tianlesoftware/article/details/7816502 (Oracle TABLE ACCESS BY INDEX ROWID explained)

  http://blog.csdn.net/tianlesoftware/article/details/5852106 (The five types of Oracle index scan)

  http://blog.csdn.net/tianlesoftware/article/details/5826546 (The three multi-table join methods in detail: HASH JOIN, MERGE JOIN, NESTED LOOP)

  http://blog.csdn.net/tianlesoftware/article/details/4707900 (Oracle High Water Mark (HWM) explained)

Note: the URLs above are references I used for study. If anything here infringes copyright, please notify me and I will remove it immediately.

The execution plan example referenced in this article:

SQL statement:
EXPLAIN PLAN FOR
SELECT E.EMPLOYEE_ID,
       J.JOB_TITLE,
       E.SALARY,
       D.DEPARTMENT_NAME
  FROM EMPLOYEES E,
       JOBS J,
       DEPARTMENTS D
 WHERE E.EMPLOYEE_ID < 103
   AND E.JOB_ID = J.JOB_ID
   AND E.DEPARTMENT_ID = D.DEPARTMENT_ID;
Execution plan: (output not included in this copy)

Understanding Access Paths for the Query Optimizer

Access paths are ways in which data is retrieved from the database. In general, index access paths should be used for statements that retrieve a small subset of table rows, while full scans are more efficient when accessing a large portion of the table. Online transaction processing (OLTP) applications, which consist of short-running SQL statements with high selectivity, often are characterized by the use of index access paths. Decision support systems, on the other hand, tend to use partitioned tables and perform full scans of the relevant partitions.

In short: use index access when a query retrieves a small subset of a table's rows; when it retrieves a large proportion of the table, a full scan becomes more efficient.

This section describes the data access paths that can be used to locate and retrieve any row in any table.

  ■ Full Table Scans

  ■ Rowid Scans

  ■ Index Scans

  ■ Cluster Access (reading from hash clusters; see http://blog.csdn.net/edwingu/article/details/6633346)

  ■ Hash Access

  ■ Sample Table Scans

  ■ How the Query Optimizer Chooses an Access Path

Full Table Scans

This type of scan reads all rows from a table and filters out those that do not meet the selection criteria. During a full table scan, all blocks in the table that are under the high water mark are scanned. The high water mark indicates the amount of used space, or space that had been formatted to receive data. Each row is examined to determine whether it satisfies the statement's WHERE clause.

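Summary: a full table scan reads every row below the table's high water mark and discards the rows that fail the selection criteria. The high water mark shows how much space the table has used, or how much has been formatted to receive data (for background on the high water mark, see http://wenku.baidu.com/view/dbc9f30b4a7302768e993941.html).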

When Oracle performs a full table scan, the blocks are read sequentially. Because the blocks are adjacent, I/O calls larger than a single block can be used to speed up the process. The size of the read calls ranges from one block to the number of blocks indicated by the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT. Using multiblock reads means a full table scan can be performed very efficiently. Each block is read only once.

In other words, the blocks are read in order, and because they are adjacent a single I/O call can fetch anywhere from one block up to DB_FILE_MULTIBLOCK_READ_COUNT blocks. Each block is read only once, which is what makes the scan efficient.
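The effect of multiblock reads can be illustrated with a small Python sketch. It is a simplified model, not Oracle's implementation: the function name and the example figures (100 blocks, a read count of 16) are made up for illustration, and it ignores details such as extent boundaries and blocks that are already cached.

```python
def multiblock_read_calls(blocks_below_hwm, multiblock_read_count):
    """Split a sequential scan of blocks 0..blocks_below_hwm-1 into read calls.

    Returns a list of (first_block, block_count) tuples, one per I/O call.
    Simplified model: every call reads as many adjacent blocks as allowed.
    """
    if multiblock_read_count < 1:
        raise ValueError("multiblock_read_count must be at least 1")
    calls = []
    first_block = 0
    while first_block < blocks_below_hwm:
        # The last call may be shorter than the configured maximum.
        count = min(multiblock_read_count, blocks_below_hwm - first_block)
        calls.append((first_block, count))
        first_block += count
    return calls


# Hypothetical table: 100 blocks below the high water mark, read count 16.
calls = multiblock_read_calls(blocks_below_hwm=100, multiblock_read_count=16)
print(len(calls))   # number of I/O calls needed for the whole scan
print(calls[-1])    # the final, shorter call
```

With a read count of 1 the same scan would need one call per block, which is why larger multiblock reads make the scan cheaper in this model.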

The "EXPLAIN PLAN Output" example (Example 13–2 in the guide) contains an example of a full table scan on the employees table.

Why a Full Table Scan Is Faster for Accessing Large Amounts of Data


Full table scans are cheaper than index range scans when accessing a large fraction of the blocks in a table. This is because full table scans can use larger I/O calls, and making fewer large I/O calls is cheaper than making many smaller calls.

Put simply: when a large proportion of a table's blocks must be read, a full table scan costs less than an index scan, because a few large I/O calls are cheaper than many small ones.
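A rough way to see the trade-off is to count I/O calls under a toy model. The sketch below is an illustration only, not the optimizer's cost formula: it assumes a hypothetical table of 10,000 blocks, a multiblock read count of 16, and the pessimistic case where every row fetched through the index costs one single-block read. All function names are invented for the example.

```python
import math


def full_scan_calls(table_blocks, multiblock_read_count):
    """I/O calls for a full scan: the table is read in multiblock chunks."""
    return math.ceil(table_blocks / multiblock_read_count)


def index_scan_calls(rows_fetched, index_blocks_read=0):
    """I/O calls for an index-driven fetch in the worst case.

    Assumption: each fetched row needs its own single-block table read,
    plus whatever index blocks were read to find the rowids.
    """
    return index_blocks_read + rows_fetched


TABLE_BLOCKS = 10_000   # hypothetical table size
READ_COUNT = 16         # hypothetical DB_FILE_MULTIBLOCK_READ_COUNT

full = full_scan_calls(TABLE_BLOCKS, READ_COUNT)
for rows in (100, 1_000, 100_000):
    via_index = index_scan_calls(rows)
    cheaper = "index" if via_index < full else "full scan"
    print(f"rows={rows}: full scan {full} calls, index {via_index} calls -> {cheaper}")
```

Under these assumptions the index wins only while the number of rows fetched stays below the number of multiblock calls the full scan needs; real costs also depend on caching and on how the rows are clustered.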

When the Optimizer Uses Full Table Scans

The optimizer uses a full table scan in any of the following cases:

Lack of Index: If the query is unable to use any existing indexes, then it uses a full table scan. For example, if there is a function used on the indexed column in the query, the optimizer is unable to use the index and instead uses a full table scan.

So once a function is applied to an indexed column, the optimizer stops using an index scan on that column and falls back to a full table scan.

(For guidance on writing efficient SQL, see Developing Efficient SQL Statements, page 223.)

If you need to use the index for case-independent searches, then either do not permit mixed-case data in the search columns or create a function-based index, such as UPPER(last_name), on the search column.

See: Using Function-based Indexes for Performance, page 309
Oracle Database Concepts, 10g Release 2 (10.2), Function-Based Indexes, page 136

To restate: for case-insensitive searches to use an index, either forbid mixed-case values in the search column or create a function-based index on it, such as UPPER(column_name).
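The idea behind a function-based index can be sketched in Python with a sorted list standing in for the index. This is only an analogy under stated assumptions: the rows, rowids and helper names are hypothetical, and a real B-tree index is far more elaborate than a sorted list searched with bisect.

```python
import bisect

# Hypothetical rows: (rowid, last_name) with mixed-case data.
rows = [(0, "King"), (1, "kochhar"), (2, "De Haan"), (3, "KING"), (4, "Hunold")]


def build_index(rows, key_func):
    """Return sorted (key, rowid) entries, keyed by key_func(last_name)."""
    return sorted((key_func(name), rowid) for rowid, name in rows)


def index_lookup(index, key):
    """Binary-search the index and return the rowids whose key equals `key`."""
    pos = bisect.bisect_left(index, (key,))
    rowids = []
    while pos < len(index) and index[pos][0] == key:
        rowids.append(index[pos][1])
        pos += 1
    return rowids


# An index on the raw column only finds exact-case matches.
plain_index = build_index(rows, lambda name: name)
# An index on UPPER(last_name) stores the function result as the key.
upper_index = build_index(rows, str.upper)

print(index_lookup(plain_index, "KING"))   # exact-case matches only
print(index_lookup(upper_index, "KING"))   # case-insensitive matches
```

Because the second index stores the already-uppercased value, a search on the uppercased name can use the index directly instead of applying the function to every row.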

Large Amount of Data: If the optimizer thinks that the query will access most of the blocks in the table, then it uses a full table scan, even though indexes might be available.

That is, if the optimizer estimates that the query will touch most of the table's blocks, it chooses a full table scan even when a usable index exists.

Small Table: If a table contains fewer than DB_FILE_MULTIBLOCK_READ_COUNT blocks under the high water mark (the high water mark is the boundary between used and unused space in a segment), which can be read in a single I/O call, then a full table scan might be cheaper than an index range scan, regardless of the fraction of the table being accessed or indexes present.

-------

The number of data blocks in a table, and so on ...

Segments in Oracle

Reference URL: http://hi.baidu.com/longredhao/item/b07034e0da9cd91d595dd8ea

Tablespace ---> segment ---> extent ---> block

Views used:

SQL> desc dba_extents;
Name            Type         Nullable Default Comments
--------------- ------------ -------- ------- ---------------------------------------------------------
OWNER           VARCHAR2(30) Y                Owner of the segment associated with the extent
SEGMENT_NAME    VARCHAR2(81) Y                Name of the segment associated with the extent
PARTITION_NAME  VARCHAR2(30) Y                Partition/Subpartition Name, if any, of the segment
SEGMENT_TYPE    VARCHAR2(18) Y                Type of the segment
TABLESPACE_NAME VARCHAR2(30) Y                Name of the tablespace containing the extent
EXTENT_ID       NUMBER       Y                Extent number in the segment
FILE_ID         NUMBER       Y                Name of the file containing the extent
BLOCK_ID        NUMBER       Y                Starting block number of the extent
BYTES           NUMBER       Y                Size of the extent in bytes
BLOCKS          NUMBER       Y                Size of the extent in ORACLE blocks
RELATIVE_FNO    NUMBER       Y                Relative number of the file containing the segment header

SQL> DESC dba_segments;
Name            Type         Nullable Default Comments
--------------- ------------ -------- ------- ---------------------------------------------------------
OWNER           VARCHAR2(30) Y                Username of the segment owner
SEGMENT_NAME    VARCHAR2(81) Y                Name, if any, of the segment
PARTITION_NAME  VARCHAR2(30) Y                Partition/Subpartition Name, if any, of the segment
SEGMENT_TYPE    VARCHAR2(18) Y                Type of segment: "TABLE", "CLUSTER", "INDEX", "ROLLBACK", "DEFERRED ROLLBACK", "TEMPORARY","SPACE HEADER", "TYPE2 UNDO" or "CACHE"
TABLESPACE_NAME VARCHAR2(30) Y                Name of the tablespace containing the segment
HEADER_FILE     NUMBER       Y                ID of the file containing the segment header
HEADER_BLOCK    NUMBER       Y                ID of the block containing the segment header
BYTES           NUMBER       Y                Size, in bytes, of the segment
BLOCKS          NUMBER       Y                Size, in Oracle blocks, of the segment
EXTENTS         NUMBER       Y                Number of extents allocated to the segment
INITIAL_EXTENT  NUMBER       Y                Size, in bytes, of the initial extent of the segment
NEXT_EXTENT     NUMBER       Y                Size, in bytes, of the next extent to be allocated to the segment
MIN_EXTENTS     NUMBER       Y                Minimum number of extents allowed in the segment
MAX_EXTENTS     NUMBER       Y                Maximum number of extents allowed in the segment
PCT_INCREASE    NUMBER       Y                Percent by which to increase the size of the next extent to be allocated
FREELISTS       NUMBER       Y                Number of process freelists allocated in this segment
FREELIST_GROUPS NUMBER       Y                Number of freelist groups allocated in this segment
RELATIVE_FNO    NUMBER       Y                Relative number of the file containing the segment header
BUFFER_POOL     VARCHAR2(7)  Y                The default buffer pool to be used for segments blocks
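The DBA_SEGMENTS view describes every segment in the database. There are many segment types, which can be listed with:
SELECT DISTINCT a.segment_type FROM dba_segments a;
Of these, TABLE and INDEX are the most commonly used.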

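Query information about the extents of a schema's segments:
SELECT t.tablespace_name "Tablespace",
       t.segment_name    "Segment name",
       t.segment_type    "Segment type",
       t.extent_id       "Extent ID",
       t.file_id         "File ID",
       t.block_id        "First block ID",
       t.blocks          "Blocks",
       t.bytes           "Extent size (bytes)"
  FROM dba_extents t
 WHERE t.owner = 'SCOTT';
Result: (output not included in this copy)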

The following statement queries information about the segments themselves:
SELECT t.tablespace_name      "Tablespace",
       t.header_file          "Header file",
       t.header_block         "Header block",
       t.segment_name         "Segment name",
       t.segment_type         "Segment type",
       t.extents              "Extents",
       t.blocks               "Blocks",
       t.bytes / 1024 || 'KB' "Segment size"
  FROM dba_segments t
 WHERE t.owner LIKE '%SCOTT%';
Segment totals per owner:
SELECT owner                     "Owner",
       SUM(bytes) / 1024 / 1024  AS "Size (MB)",
       SUM(blocks)               AS "Total blocks",
       SUM(extents)              AS "Total extents"
  FROM dba_segments
 GROUP BY owner
HAVING owner = 'SCOTT';
Check the database block size, which is reported in bytes. By default an extent here consists of eight data blocks, that is, 64 KB:
SQL> show parameter db_block_size;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     8192
+++++++++++++++++++++++++++++++++
The following content is based on:

http://blog.csdn.net/tianlesoftware/article/details/4707900 (Oracle High Water Mark (HWM) explained)

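ALL_TABLES or USER_TABLES can be used to check whether a table has been analyzed and how many data blocks it contains:
SELECT table_name, num_rows, blocks, empty_blocks FROM user_tables WHERE table_name = 'TABLE_NAME';
Alternatively:
SELECT segment_name, segment_type, blocks FROM dba_segments WHERE segment_name = 'TABLE_NAME';
Note: if the second query returns a value but num_rows, blocks and empty_blocks are null in the first, the table has never been analyzed. In other words, null values in USER_TABLES mean that no statistics have been gathered for the table.

Statistics can be gathered with:
exec DBMS_STATS.GATHER_TABLE_STATS('schema_name', 'table_name');

After a table has been analyzed with the statement above, USER_TABLES contains values for it, but empty_blocks is still null. This is worth noting: that column is populated only when statistics are collected with ANALYZE.

Use the following statement to compute the number of empty data blocks:
ANALYZE TABLE table_name COMPUTE STATISTICS;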
-------

Rowid Scans

The rowid of a row specifies the datafile and data block containing the row and the location of the row in that block. Locating a row by specifying its rowid is the fastest way to retrieve a single row, because the exact location of the row in the database is specified.

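In short: a rowid gives the datafile and data block that hold the row, plus the row's position within that block. Because it pinpoints the row's exact location in the database, it is the fastest way to fetch a single row.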
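To make the "datafile, block, position in block" description concrete, here is a Python sketch that splits an extended rowid into its parts. It assumes the 18-character extended format (six characters for the data object number, three for the relative file number, six for the block number, three for the row number, each in base 64 over A-Z, a-z, 0-9, + and /). The sample rowid and the function names are for illustration only; inside the database, the DBMS_ROWID package is the supported way to do this.

```python
# Base-64 alphabet used by the extended rowid display format.
ROWID_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def _decode(chars):
    """Convert a base-64 rowid fragment into an integer."""
    value = 0
    for ch in chars:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 64 + ROWID_ALPHABET.index(ch)
    return value


def split_rowid(rowid):
    """Split an 18-character extended rowid into its four components."""
    if len(rowid) != 18:
        raise ValueError("expected an 18-character extended rowid")
    return {
        "data_object_number": _decode(rowid[0:6]),
        "relative_file_number": _decode(rowid[6:9]),
        "block_number": _decode(rowid[9:15]),
        "row_number": _decode(rowid[15:18]),
    }


# Sample value for illustration, not taken from the article's database.
parts = split_rowid("AAAAaoAATAAABrXAAA")
for name, value in parts.items():
    print(name, value)
```

The file and block numbers tell Oracle which block to read, and the row number selects the row inside it, so no search is needed.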

To access a table by rowid, Oracle first obtains the rowids of the selected rows, either from the statement's WHERE clause or through an index scan of one or more of the table's indexes. Oracle then locates each selected row in the table based on its rowid. In Example 13–2, "EXPLAIN PLAN Output" on page 13-12, an index scan is performed on the jobs and departments tables. The rowids retrieved are used to return the row data.

When the Optimizer Uses Rowids

This is generally the second step after retrieving the rowid from an index. The table access might be required for any columns in the statement not present in the index.

That is, rowids are normally used right after they have been obtained from an index, to fetch the columns of the statement that the index does not contain.

Access by rowid does not need to follow every index scan. If the index contains all the columns needed for the statement, then table access by rowid might not occur.

Put differently, when the index already contains every column the statement needs, the table is not visited by rowid at all (an index entry stores both the indexed column values and the rowid).

Note:

Rowids are an internal Oracle representation of where data is stored. They can change between versions. Accessing data based on position is not recommended, because rows can move around due to row migration and chaining and also after export and import.

In short: a rowid is Oracle's internal address for a row and can differ between database versions. Fetching rows by stored rowid values is not recommended, because rows can move through row migration and chaining, and their rowids can also change after an export and import.

Index Scans

In this method, a row is retrieved by traversing the index, using the indexed column values specified by the statement. An index scan retrieves data from an index based on the value of one or more columns in the index. To perform an index scan, Oracle searches the index for the indexed column values accessed by the statement. If the statement accesses only columns of the index, then Oracle reads the indexed column values directly from the index, rather than from the table.

That is, when a query references only columns in the index, Oracle returns the values straight from the index without visiting the table.

The index contains not only the indexed value, but also the rowids of rows in the table having that value. Therefore, if the statement accesses other columns in addition to the indexed columns, then Oracle can find the rows in the table by using either a table access by rowid or a cluster scan.

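So an index entry holds the indexed column values together with the rowids (physical addresses) of the rows that have those values. When the statement needs columns beyond the indexed ones, Oracle can fetch the rows through a table access by rowid or a cluster scan.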
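The two cases (answering from the index alone versus following rowids into the table) can be sketched in Python. This is an analogy, not Oracle's storage format: the table is a dictionary keyed by made-up rowids, the index is a sorted list of (key, rowid) pairs, and the sample rows and function names are hypothetical. The predicate mirrors the EMPLOYEE_ID < 103 condition of the example query.

```python
import bisect

# Hypothetical table: rowid -> (employee_id, last_name, salary).
table = {
    "r1": (100, "King", 24000),
    "r2": (101, "Kochhar", 17000),
    "r3": (102, "De Haan", 17000),
    "r4": (103, "Hunold", 9000),
}

# Index on employee_id: sorted (indexed value, rowid) entries.
index = sorted((row[0], rowid) for rowid, row in table.items())


def range_scan(upper_bound):
    """Return the index entries whose key is below upper_bound."""
    end = bisect.bisect_left(index, (upper_bound,))
    return index[:end]


def select_ids(upper_bound):
    """Only the indexed column is needed: answer from the index alone."""
    return [key for key, _rowid in range_scan(upper_bound)]


def select_ids_and_salaries(upper_bound):
    """A non-indexed column is needed: follow each rowid into the table."""
    result = []
    for key, rowid in range_scan(upper_bound):
        row = table[rowid]              # table access by rowid
        result.append((key, row[2]))
    return result


print(select_ids(103))
print(select_ids_and_salaries(103))
```

The first function never touches `table`, which is the situation where table access by rowid does not appear in the plan.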

An index scan can be one of the following types:
  ■ Assessing I/O for Blocks, not Rows
  ■ Index Unique Scans
  ■ Index Range Scans
  ■ Index Range Scans Descending
  ■ Index Skip Scans
  ■ Full Scans
  ■ Fast Full Index Scans
  ■ Index Joins
  ■ Bitmap Indexes

Assessing I/O for Blocks, not Rows

Oracle does I/O by blocks. Therefore, the optimizer's decision to use full table scans is influenced by the percentage of blocks accessed, not rows. This is called the index clustering factor. If blocks contain single rows, then rows accessed and blocks accessed are the same.

That is, because Oracle performs I/O in blocks, the choice of a full table scan depends on the proportion of blocks to be read, not the proportion of rows; this is what the index clustering factor captures. Only when each block holds a single row are the two proportions the same.

However, most tables have multiple rows in each block. Consequently, the desired number of rows could be clustered together in a few blocks, or they could be spread out over a larger number of blocks.

In other words, most tables hold several rows per block, so the rows a query wants may be packed together in a few blocks or scattered over many.

Although the clustering factor is a property of the index, the clustering factor actually relates to the spread of similar indexed column values within data blocks in the table. A lower clustering factor indicates that the individual rows are concentrated within fewer blocks in the table. Conversely, a high clustering factor indicates that the individual rows are scattered more randomly across blocks in the table. Therefore, a high clustering factor means that it costs more to use a range scan to fetch rows by rowid, because more blocks in the table need to be visited to return the data. Example 13–3 shows how the clustering factor can affect cost.

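In short: although the clustering factor is a property of the index, it describes how rows with similar indexed values are spread across the table's data blocks. A low clustering factor means those rows are concentrated in a few blocks; a high one means they are scattered randomly across the table. A high clustering factor therefore makes it more expensive to fetch rows by rowid through a range scan, because more table blocks must be read to return the data.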
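A small Python sketch can show why the clustering factor matters. It uses a simplified definition, assumed here for illustration: walk the index entries in key order and count how often the table block changes from one entry to the next. The data (12 rows in 4 blocks) and the function names are hypothetical.

```python
def clustering_factor(index_entries):
    """Count block changes while walking (key, block_number) pairs in key order."""
    factor = 0
    previous_block = None
    for _key, block in sorted(index_entries):
        if block != previous_block:
            factor += 1
            previous_block = block
    return factor


def blocks_visited(index_entries, low, high):
    """Distinct table blocks touched by a range scan for low <= key <= high."""
    return len({block for key, block in index_entries if low <= key <= high})


# 12 rows in 4 blocks, 3 rows per block.
# Well clustered: rows with neighbouring keys share a block.
clustered = [(key, (key - 1) // 3) for key in range(1, 13)]
# Scattered: neighbouring keys always sit in different blocks.
scattered = [(key, (key - 1) % 4) for key in range(1, 13)]

print(clustering_factor(clustered), clustering_factor(scattered))
print(blocks_visited(clustered, 1, 3), blocks_visited(scattered, 1, 3))
```

In the clustered layout the factor equals the number of blocks and a three-row range scan touches one block; in the scattered layout the factor equals the number of rows and the same scan touches three blocks.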

Original article: https://www.cnblogs.com/caroline/p/2661292.html
