Original article address
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:735625536552
You Asked
We are doing performance-testing of SQL-statements in our project. When running explain-plans on the statements, the results are often not realistic due to the fact that we have not loaded the tables with data, and therefore the indexes, in most cases, will not be used as they will in production. I found the DBMS_STATS-package and am wondering if I can use this package to virtually load my test-environment when using the explain-plan. Our Oracle-support team doubt that it will be possible to extract the data using GET/EXPORT_TABLE/INDEX/COLUMN_STATS, modifying the data and then using SET/IMPORT_TABLE/INDEX/COLUMN_STATS to put new statistics back into the data-dictionary-tables. Are there any descriptions on the tables the CREATE_STAT_TABLE makes? Regards, Harald
and we said...
That table structure is not documented and is not designed to be modified by anything other than the DBMS_STATS package. Its sole purpose is to carry stats from one database to another via export/import in the dbms_stats package. Attempting to plug your own values in there would be difficult at best and potentially harmful at worst.

Instead of trying to insert/update rows in that table -- you should just use the other entry points in the dbms_stats package. For example, to set the statistics for a table -- you would use the dbms_stats.set_table_stats procedure. It can either put the stats you want into the data dictionary directly OR you can have this procedure stuff those stats into the table created by CREATE_STAT_TABLE. You use the formal named arguments in the set_table_stats procedure and they'll take care of stuffing them into the table itself. You lose NO functionality by doing this -- there is NOTHING to be gained by accessing that table directly.

You would use SET_COLUMN|INDEX|TABLE_STATS to set whatever stats you wanted on the data. There is no reason to modify existing data. You can see all of the "existing" data by querying the data dictionary.
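To make the advice above concrete, here is a minimal sketch of setting statistics on an empty table and then looking at the plan. The table name T, the row/block counts, and the predicate are all hypothetical, not from the original thread; set autotrace works in SQL*Plus on the 8i releases discussed here (on 9i and later you could use dbms_xplan.display instead).

```sql
-- Hypothetical example: tell the optimizer an empty table T is "big".
begin
    dbms_stats.set_table_stats(
        ownname => user,
        tabname => 'T',
        numrows => 1000000,   -- pretend one million rows
        numblks => 10000,     -- spread over 10,000 blocks
        avgrlen => 100 );     -- average row length in bytes
end;
/

-- Compare the plan the optimizer now chooses:
set autotrace traceonly explain
select * from t where object_id = 42;
set autotrace off
```

Because only the dictionary statistics change (not the data), you can flip the same table between "small" and "large" repeatedly to see when the optimizer switches plans.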
Is it used to set statistics and test the performance, i.e. the execution path? Actually, we are interested in knowing how and where we can use dbms_stats.set_table_stats to the maximum benefit.
Followup September 29, 2001 - 2pm Central time zone:
You can use it to transfer the stats from your production system (with all of the data) to your test system. You might do that to review the query plans generated by new/altered queries in your application before putting them into production -- so you can see what might be different.

I use it to show the effects of having differing numbers of rows/blocks in tables. For example, in order to demonstrate a point -- I might analyze an empty table -- run a query and show the plan. Then, I will use dbms_stats to trick the optimizer into thinking there are LOTS and LOTS of rows and blocks in there and show how the plans differ. You can use this to see how the optimizer will change its mind about the proper plan over time.

Another use I've found for it is with temporary tables. You cannot really analyze them, but what you can do is create a "real" table -- fill it up with representative data -- and then export those stats using dbms_stats. Later, you create the temporary table and import the stats you saved. Now the optimizer will use those stats for the temporary table when developing query plans that involve it. This could be important if you put lots of rows into a temp table and then use it in subsequent SQL statements -- the optimizer would otherwise think the temporary table is "small".
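The temporary-table technique above can be sketched as follows. All object names are hypothetical, and this variant stays within the documented GET/SET calls (reading representative numbers from a permanent "model" table and setting them on the temporary table) rather than editing the export table:

```sql
-- 1) Build a permanent table with representative data and gather stats on it.
create table gtt_model as select * from all_objects;
exec dbms_stats.gather_table_stats( user, 'GTT_MODEL' );

-- 2) Create the global temporary table with the same shape (empty).
create global temporary table gtt
    on commit preserve rows
    as select * from all_objects where 1 = 0;

-- 3) Copy the model table's statistics onto the temporary table.
declare
    l_numrows  number;
    l_numblks  number;
    l_avgrlen  number;
begin
    dbms_stats.get_table_stats( user, 'GTT_MODEL',
        numrows => l_numrows, numblks => l_numblks, avgrlen => l_avgrlen );
    dbms_stats.set_table_stats( user, 'GTT',
        numrows => l_numrows, numblks => l_numblks, avgrlen => l_avgrlen );
end;
/
```

Queries against GTT are now costed as if it held the model table's volume, even though at parse time the temporary table may be empty.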
-----------------------------------------------------------------------------------------------------
Hi, I have tested a few times and always find that dbms_stats is slower than an ordinary analyze, even when specifying a parallel degree. Why is that? We are using 8.1.6.3. Also, is it possible to analyze a table using "analyze table xx compute statistics" in parallel?
Followup November 1, 2001 - 4pm Central time zone:
Geez, no test case, no parameters, not even "we find it X% slower" -- nothing. Perhaps the tables you used were trivial in size (parallel can take much longer on small tables). Perhaps you were comparing apples with oranges (eg: dbms_stats by default does "for all columns size 1" -- did your analyze do that too?)

No, you cannot analyze in parallel; only dbms_stats can. Also, I do not see it being significantly slower (given that dbms_stats will just issue an analyze in this particular case, you would expect that to be so!):

ops$tkyte@ORA717DEV.US.ORACLE.COM> variable n number
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t delete statistics;
Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec :n := dbms_utility.get_time
PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all indexed columns', cascade => true );
PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_output.put_line( (dbms_utility.get_time-:n) || ' hsecs ' );
22762 hsecs
PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t delete statistics;
Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec :n := dbms_utility.get_time
PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t
  2  compute statistics
  3  for table
  4  for all indexed columns
  5  for all indexes
  6  /
Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_output.put_line( (dbms_utility.get_time-:n) || ' hsecs ' );
22732 hsecs
PL/SQL procedure successfully completed.
ops$tkyte@ORA717DEV.US.ORACLE.COM> select num_rows from user_tables where table_name = 'T';

  NUM_ROWS
----------
   1093312

and when I do it in parallel -- I see a HUGE difference:

ops$tkyte@ORA717DEV.US.ORACLE.COM> variable n number
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table t delete statistics;
Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec :n := dbms_utility.get_time
PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all indexed columns size 1', cascade => true, degree => 8 );
PL/SQL procedure successfully completed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> exec dbms_output.put_line( (dbms_utility.get_time-:n) || ' hsecs ' );
2525 hsecs
PL/SQL procedure successfully completed.

I think you were using tables that were so small as to be insignificant to even time. By the way, table t is:

create table t as select * from all_objects;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
create index t_idx1 on t(object_id);
create index t_idx2 on t(object_name);