zoukankan      html  css  js  c++  java
  • dbms_stats.gather_table_stats与analyze table 的区别[转贴]

    Analyze Statement
    The ANALYZE statement can be used to gather statistics for a specific table, index or cluster. The statistics can be computed exactly, or estimated based on a specific number of rows, or a percentage of rows:

    ANALYZE TABLE employees COMPUTE STATISTICS;
    ANALYZE INDEX employees_pk COMPUTE STATISTICS;

    ANALYZE TABLE employees ESTIMATE STATISTICS SAMPLE 100 ROWS;
    ANALYZE TABLE employees ESTIMATE STATISTICS SAMPLE 15 PERCENT;
    DBMS_UTILITY
    The DBMS_UTILITY package can be used to gather statistics for a whole schema or database. Both methods follow the same format as the analyze statement:

    EXEC DBMS_UTILITY.analyze_schema('SCOTT','COMPUTE');
    EXEC DBMS_UTILITY.analyze_schema('SCOTT','ESTIMATE', estimate_rows => 100);
    EXEC DBMS_UTILITY.analyze_schema('SCOTT','ESTIMATE', estimate_percent => 15);

    EXEC DBMS_UTILITY.analyze_database('COMPUTE');
    EXEC DBMS_UTILITY.analyze_database('ESTIMATE', estimate_rows => 100);
    EXEC DBMS_UTILITY.analyze_database('ESTIMATE', estimate_percent => 15);
    DBMS_STATS
    The DBMS_STATS package was introduced in Oracle 8i and is Oracles preferred method of gathering object statistics. Oracle list a number of benefits to using it including parallel execution, long term storage of statistics and transfer of statistics between servers. Once again, it follows a similar format to the other methods:

    EXEC DBMS_STATS.gather_database_stats;
    EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15);

    EXEC DBMS_STATS.gather_schema_stats('SCOTT');
    EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15);

    EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES');
    EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15);

    EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK');
    EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK', estimate_percent => 15);
    This package also gives you the ability to delete statistics:

    EXEC DBMS_STATS.delete_database_stats;
    EXEC DBMS_STATS.delete_schema_stats('SCOTT');
    EXEC DBMS_STATS.delete_table_stats('SCOTT', 'EMPLOYEES');
    EXEC DBMS_STATS.delete_index_stats('SCOTT', 'EMPLOYEES_PK');

    --------------------------------------------------------------------------------------

    自从Oracle8.1.5引入dbms_stats包,Experts们便推荐使用dbms_stats取代analyze。 理由如下

    dbms_stats可以并行分析
    dbms_stats有自动分析的功能(alter table monitor )
    analyze 分析统计信息的不准确some times


    1,2好理解,且第2点实际上在VLDB中是最吸引人的;3以前比较模糊,看了metalink236935.1 解释,analyze在分析Partition表的时候,有时候会计算出不准确的Global statistics .

    原因是,dbms_stats会实在的去分析表全局统计信息(当指定参数);而analyze是将表分区(局部)的statistics 汇总计算成表全局statistics ,可能导致误差。

    如果想分析整个用户或数据库,还可以采用工具包,可以并行分析
    Dbms_utility(8i以前的工具包)
    Dbms_stats(8i以后提供的工具包)

    dbms_stats.gather_schema_stats(User,estimate_percent=>100,cascade=> TRUE);
    dbms_stats.gather_table_stats(User,TableName,degree => 4,cascade => true);

    这是对命令与工具包的一些总结

    1、对于分区表,建议使用DBMS_STATS,而不是使用Analyze语句。
    a) 可以并行进行,对多个用户,多个Table
    b) 可以得到整个分区表的数据和单个分区的数据。
    c) 可以在不同级别上Compute Statistics:单个分区,子分区,全表,所有分区
    d) 可以倒出统计信息
    e) 可以用户自动收集统计信息

    2、DBMS_STATS的缺点
    a) 不能Validate Structure
    b) 不能收集CHAINED ROWS, 不能收集CLUSTER TABLE的信息,这两个仍旧需要使用Analyze语句。
    c) DBMS_STATS 默认不对索引进行Analyze,因为默认Cascade是False,需要手工指定为True

    3、对于oracle 9里面的External Table,Analyze不能使用,只能使用DBMS_STATS来收集信息。

    -----------------------------------------------------------------
    10G的文档是这么说的:
    Do not use the COMPUTE and ESTIMATE clauses of ANALYZE to collect optimizer statistics. These clauses are supported for backward compatibility. Instead, use the DBMS_STATS package, which lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine tune your statistics collection in other ways. The cost-based optimizer, which depends upon statistics, will eventually use only statistics that have been collected by DBMS_STATS

    analyze的功能已经明确:
    Use the ANALYZE statement (rather than DBMS_STATS) forstatistics collection not related to the cost-based optimizer:

    To use the VALIDATE or LIST CHAINED ROWS clauses

    To collect information on freelist blocks

    在收集与CBO优化器不相关的统计信息的时候ANALYZE语句要优于DBMS_STATS包

    -----------------------------------
    EX:
    begin
    for owner in (select username from dba_users where username not in ('SYS','SYSTEM'))
    loop
    dbms_output.disable;
    dbms_output.enable(1000000);
    dbms_output.put_line('Schema: '||owner.username);
    select sysdate into start_time from dual;
    dbms_output.put_line('Analyze start from : '||start_time);
    dbms_stats.gather_schema_stats(ownname => owner.username, estimate_percent => 20, block_sample=> true, cascade=>true);
    select sysdate into end_time from dual;
    dbms_output.put_line('Analyze complete at : '||end_time);
    dbms_output.put_line('---------------------------');
    end loop;


    dbms_stats.gather_table_stats(ownname =>,
    tabname =>,
    partname =>,
    estimate_percent =>,
    block_sample =>,
    method_opt =>,
    degree =>,parallel degree(并行收集维度) 看CPU个数
    granularity =>,
    cascade =>,true is also gather columns and index’s statistics;
    no_invalidate =>);

    ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

    当索引整个块成为空块的时候,会被放置于freelist上,可以被重用。但是重用前结构上还是临时放置于B-Tree中。

    这时analyze & dbms_stats在分析时会有不同,analyze会将这些空的block计算为leaf block,而dbms_stats不将其计算在内。

    请看例子

    1  create table test
    2  as
    3  select rownum x
    4* from dba_objects
    SQL> /

    Table created.

    SQL> select count(*) from test;

    COUNT(*)
    ———-
    6114

    1  create index test_idx on test(x)
    2* pctfree 95
    SQL> /

    Index created.

    SQL> analyze table test compute statistics;

    Table analyzed.

    SQL> select index_name,blevel,leaf_blocks,distinct_keys,num_rows,last_analyzed  from dba_indexes where index_name=’TEST_IDX’ and wner=’SYS’;

    INDEX_NAME                         BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS LAST_ANALYZED
    —————————— ———- ———– ————- ———- —————–
    TEST_IDX                                1         408          6114       6114 20060513 01:00:25

    SQL> delete from test where x<3000;

    2999 rows deleted.

    SQL> commit;

    Commit complete.

    SQL> analyze index test_idx compute statistics;

    Index analyzed.

    SQL> select index_name,blevel,leaf_blocks,distinct_keys,num_rows,last_analyzed
    2  from dba_indexes where index_name=’TEST_IDX’ and wner=’SYS’;

    INDEX_NAME                         BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS LAST_ANALYZED
    —————————— ———- ———– ————- ———- —————–
    TEST_IDX                                1         408          3115       3115 20060513 01:03:31

    SQL> exec DBMS_STATS.GATHER_INDEX_STATS(’SYS’,'TEST_IDX’);

    PL/SQL procedure successfully completed.

    SQL>  select index_name,blevel,leaf_blocks,distinct_keys,num_rows,last_analyzed
    2   from dba_indexes where index_name=’TEST_IDX’ and wner=’SYS’;

    INDEX_NAME                         BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS LAST_ANALYZED
    —————————— ———- ———– ————- ———- —————–
    TEST_IDX                                1         209          3115       3115 20060513 01:04:28

    FROM:http://www.dbafan.com/blog/?p=21

    ////////////////////////////////////////////////////////////////////////////////////////////////////////////

    我们知道从oracle8i开始﹐analyze语句和dbms_stats包都可以收集相关对象(TablesIndexesClusters and Materialized Views)statistics。那哪些statistics的收集应该使用analyze语句﹐哪些statistics的收集应该使用dbms_stats包呢﹖

    对于使用哪个去收集statistics﹐应该把一个原则﹐凡是与cost-based optimizer相关的statistics﹐都应通过dbms_stats包收集。与cost-based optimizer无关的statistics(empty blocksaverage space)都应通过analyze语句去收集。

    之所以要用dbms_stats包去替代analyze收集优化器statistics﹐是因为dbms_stats包能收集并行statistics和分区对象的全局statistics

    当然analyze语句在其它方面的statistics收集﹐是dbms_stats所无法取代的﹐如﹕

    1. 收集在freelist上的blocks信息

    2. 检验存储格式的合法性

    analyze table bk_test_t validate structure cascade online;

    3. 识别表或cluster的行迁移与行链接

    为了能使用analyze….list chained rows语句识别行迁移与行链接﹐必须先在执行analyze语句所在的schema内执行$ORACLE_HOME/rdbms/admin/utlchain.sql(utlchn1.sql)脚本建立chained_rows表。在chained_rows建立之后﹐就可以执行下面的语句﹕

    analyze table bk_test list chained rows into chained_rows;

  • 相关阅读:
    Uploader 文件上传
    filters过滤器的使用
    Calendar中遇到的问题
    中科院之旅
    Python基础教程:列表推导式详解
    不会也要知道的,Python四种实现排序的方法
    2021字节跳动校招秋招算法面试真题解题报告--leetcode19 删除链表的倒数第 n 个结点,内含7种语言答案
    2021字节跳动校招秋招算法面试真题解题报告--leetcode206 反转链表,内含7种语言答案
    求协方差
    国外卡组织的 交换费-interchangefee(发卡行服务费) 和 银联对比
  • 原文地址:https://www.cnblogs.com/hllnj2008/p/4787057.html
Copyright © 2011-2022 走看看