  • How to Write A Better SQL Statement in Oracle

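    In my spare time at work I summarized some points to watch for when writing a “good” SQL statement. The list is certainly neither complete nor very deep; I will keep adding to it as more ideas come to mind.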

    1.   Overview 

    Yes, writing a SQL statement is a no-brainer. But writing an efficient SQL statement is not as easy as you might think. A badly written SQL statement consumes a lot of resources and keeps you waiting a long time for a response from the database server, while a good SQL statement will make your life much happier.

    This document is intended to provide some tips for writing slightly better SQL statements in Oracle. Please note that the tips covered in this document are definitely not all-inclusive.

    2.   SQL Code Style

    As in any programming language, good code style is very important, and SQL is no exception. A short SQL statement might not look bad without it, but a long SQL statement looks terrible if you don’t follow a consistent format. More importantly, a consistent format can make your SQL statements run faster! As you may know, a SQL statement is parsed (hard parse or soft parse) before being executed. Oracle matches an incoming statement against the statements cached in the shared pool by its exact text, so statements you think of as the “same” may be treated as different ones. Keeping a consistent style reduces the number of hard parses.

    Take a look at the following example,

    SQL 1:

    select ename from emp;

    SQL 2:

    select ENAME from emp;

    The only difference between these two SQL statements is the character case of the column name ename. Though the two statements perform the same action, Oracle treats them as totally different statements, which in turn introduces an extra hard parse.

    OK, enough about the importance of good style. Below is a recommended SQL statement style,

    -         Write all SQL keywords in upper case

    -         Write all non SQL keywords in lower case

    Example:

    SELECT ename FROM emp WHERE deptno=1;

    This is just a recommendation. No matter what convention you follow, the most important thing is to stick to it.

    3.   Optimizer Mode

    The Oracle CBO (cost-based optimizer) has two main optimizer modes: ALL_ROWS and FIRST_ROWS. ALL_ROWS instructs the optimizer to generate an execution plan that is throughput oriented, while FIRST_ROWS directs the optimizer to base its decisions on the best response time. The default optimizer mode is ALL_ROWS, which suits most situations because we generally want the complete query result at the minimum resource cost. However, if you want to get the first N records (such as the top 1000 rows) quickly, FIRST_ROWS is a good choice.

    You can set the optimizer mode at the instance, session, or SQL statement level. Since the instance and session levels affect most SQL statements, I recommend setting the optimizer mode at the statement level, that is, with an Oracle hint in the SQL statement.

    Example:

    SELECT /*+ first_rows(10) */ ename FROM emp WHERE deptno=1;

    4.   Better SELECT Clauses

    This section focuses on the SELECT clause of a SQL statement.

    4.1. Use Hints

    Hints are covered in this section because an Oracle hint is usually written in the SELECT clause.
    Though the Oracle CBO is intelligent enough to generate an optimized execution plan under most circumstances, there are still situations where the execution plan has to be tuned manually.

    In section 3, we talked about the Oracle hints all_rows and first_rows(n), which direct the optimizer to generate different execution plans for different goals. Below are some commonly used Oracle hints that you might find helpful when writing SQL statements,

    • Table scan related: index, full
    • Table join method: use_hash, use_nl, use_merge
    • Table join order: ordered, leading

    Please refer to the Oracle online documentation for a detailed list.

    4.2. Do NOT Use SELECT *

    Though it is quite convenient to use “select *” to retrieve the columns from a table, it is really a bad idea. Some consequences of this practice are listed below,

    • More network traffic is involved. If you only need a few columns from the table, using “select *” introduces a lot of overhead in transferring data from the database server to the client.
    • The code is harder to understand, as the reader has to figure out which columns you are really interested in.

    4.3. Use table alias before the column name

    Using table aliases makes the SQL statement easier for others to understand when there is more than one table in the FROM clause. What’s more, explicitly telling the optimizer which table each column comes from somewhat eases the workload of the optimizer.

    Another important fact about table aliases is that an Oracle hint only takes the table alias, not the table name, into account if you specify an alias in the FROM clause. That is to say, using the table name in the hint may not take effect. So get used to taking advantage of table aliases as much as possible.

    5.   Better WHERE Clauses

    The WHERE clause is arguably the most important part of a SQL statement from the perspective of performance, because the selection criteria in the WHERE clause can dramatically decrease the amount of data Oracle has to consider during a query. Careful specification of WHERE conditions can have a significant bearing on whether the optimizer will choose existing indexes.

    5.1. Avoid using HAVING clauses wherever possible

    Use the WHERE clause instead of the HAVING clause wherever possible. The WHERE clause restricts the rows at the outset, before grouping, while the HAVING clause filters only after all the rows have been retrieved and grouped, so a lot more rows than necessary are processed.

    Example:

    The following SQL statement…

    SELECT e.deptno, count(e.empno)

    FROM scott.emp e

    GROUP BY e.deptno

    HAVING e.deptno = 20;

    Should be rewritten to…

    SELECT e.deptno, count(e.empno)

    FROM scott.emp e

    WHERE e.deptno = 20

    GROUP BY e.deptno;
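
    The two forms above can be compared with a small runnable sketch. It uses Python's built-in sqlite3 module purely as a stand-in for Oracle so that it runs anywhere, and the emp rows are made-up sample data, not the real scott.emp contents. It only shows that both statements return the same result; the performance difference depends on the database.

```python
import sqlite3

# SQLite is an assumption here, standing in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp (empno, deptno) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 20), (4, 30)],
)

# Filters after every department has been grouped.
having_sql = """
    SELECT e.deptno, COUNT(e.empno)
    FROM emp e
    GROUP BY e.deptno
    HAVING e.deptno = 20
"""
# Filters the rows first, then groups only what is left.
where_sql = """
    SELECT e.deptno, COUNT(e.empno)
    FROM emp e
    WHERE e.deptno = 20
    GROUP BY e.deptno
"""

having_rows = conn.execute(having_sql).fetchall()
where_rows = conn.execute(where_sql).fetchall()
print(having_rows, where_rows)
```

    HAVING is still the right tool for conditions on aggregates, such as HAVING COUNT(e.empno) > 1, because those cannot be evaluated before grouping.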

    5.2. Avoid using functions or calculations on indexed columns

    This is actually very obvious. If you apply a function to an indexed column, the optimizer will not use the index (a function-based index is the exception); a full table scan will be performed instead.

    Example 1,

    Suppose there is an index on the column report_date of the table test. The following SQL statement…

    SELECT t.some_column

    FROM test t

    WHERE trunc(t.report_date) = trunc(sysdate);

    Should be rewritten as the following one…

    SELECT t.some_column

    FROM test t

    WHERE t.report_date >= trunc(sysdate)

    AND   t.report_date < trunc(sysdate + 1);

    Besides, avoid calculating on the indexed columns.

    Example 2:

    SELECT e.empno

    FROM scott.emp e

    WHERE e.sal * 12 > 100000;

    Should be rewritten as…

    SELECT e.empno

    FROM scott.emp e

    WHERE e.sal > 100000 / 12;
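
    The effect of wrapping an indexed column in a calculation can be observed with a query plan. The sketch below is an illustration under assumptions: it uses SQLite's EXPLAIN QUERY PLAN instead of Oracle's EXPLAIN PLAN, the index name emp_sal_idx and the helper plan() are hypothetical, and SQLite's planner is not Oracle's. The principle it shows is the same: the bare column can be matched against the index, while the expression on the column cannot.

```python
import sqlite3

# SQLite stands in for Oracle; emp_sal_idx is a hypothetical index name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, sal REAL)")
conn.execute("CREATE INDEX emp_sal_idx ON emp (sal)")


def plan(sql):
    """Return the plan detail text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Calculation on the indexed column: the index on sal cannot be matched.
calc_plan = plan("SELECT e.ename FROM emp e WHERE e.sal * 12 > 100000")
# Calculation moved to the constant side: the index on sal is usable.
bare_plan = plan("SELECT e.ename FROM emp e WHERE e.sal > 100000 / 12.0")

print(calc_plan)
print(bare_plan)
```

    The exact wording of the plan text varies between SQLite versions, so the sketch only prints it; what matters is whether the index name appears.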

    5.3. Avoid implicit SQL rewrites on indexed columns

    Sometimes, even if you don’t use a SQL function or perform a calculation on the indexed columns, the optimizer may still not use the indexes.

    Take a look at the following example,

    SELECT ta.attrib_name

    FROM lo_prod_type_attribs ta

    WHERE ta.product_type_id like (NVL(to_char(:vv_product_type_id), '%'))

    The above SQL statement takes a bind variable vv_product_type_id to decide how to query the table lo_prod_type_attribs. If vv_product_type_id is NULL, the statement retrieves all the rows in the table; otherwise it retrieves only the rows matching vv_product_type_id. This SQL statement is concise and elegant, but it makes the index on the column product_type_id unusable. Why? Because the NVL function requires its possible return values to be of the same type! Since ‘%’ is a string, the variable vv_product_type_id has to be converted to a string with the function to_char. The right-hand side of LIKE is then a string, so the optimizer implicitly rewrites the WHERE clause of the above SQL statement to…

    WHERE to_char(ta.product_type_id) like (NVL(to_char(:vv_product_type_id), '%'))

    Please note that the function to_char is applied to the indexed column ta.product_type_id by the optimizer.

    5.4. Use the right join method and join order

    -         Avoid Cartesian joins. Check that the WHERE clause contains the join conditions.

    -         Try to use equi joins wherever possible, because an equi join usually leads to a more efficient query plan.

    -         Perform filtering operations early to reduce the number of rows to be joined in later steps. Sometimes using a sub-query to filter the result set before performing a join is a good idea.

    -         Join in the order that will produce the least number of rows as output to the parent step.

    6.   Other Tips

    6.1. Use bind variables

    The parsing stage of query processing consumes resources, so you should parse just once for “reusable” SQL statements. Using bind variables in SQL statements instead of literal values reduces the amount of parsing in the database. Please note that the bind variables should also be identical in terms of their name, data type and length. Failure to use bind variables leads to heavy use of the shared pool and contention for the latches.

    For example, instead of writing a query like…

    SELECT e.deptno, count(e.empno)

    FROM scott.emp e

    WHERE e.deptno = 20

    GROUP BY e.deptno;

    You’d better rewrite it as

    SELECT e.deptno, count(e.empno)

    FROM scott.emp e

    WHERE e.deptno = :deptno

    GROUP BY e.deptno;

    This way, if you go on to query department 30, 40, etc., the second SQL statement will be reused.
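
    From application code, the idea looks roughly like the sketch below: the statement text stays constant and only the bound value changes. This is a minimal illustration that assumes Python's sqlite3 module as a stand-in for an Oracle driver, with invented sample rows; it shows the calling pattern, not Oracle's shared pool behavior.

```python
import sqlite3

# SQLite stands in for Oracle; the emp rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp (empno, deptno) VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 30), (4, 40), (5, 40), (6, 40)],
)

# One statement text, reused unchanged for every department.
count_sql = """
    SELECT e.deptno, COUNT(e.empno)
    FROM emp e
    WHERE e.deptno = :deptno
    GROUP BY e.deptno
"""

counts = {}
for deptno in (20, 30, 40):
    # The value is passed separately from the SQL text as a named bind.
    row = conn.execute(count_sql, {"deptno": deptno}).fetchone()
    counts[row[0]] = row[1]

print(counts)
```

    Building the statement by concatenating the literal 20, 30 or 40 into the text would instead produce three different statement texts.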

    6.2. Choose between UNION ALL and UNION

    More often than not, you will need to combine several SQL result sets into a final result set. Under such circumstances, you might use UNION ALL or UNION. The choice between these two operators affects both the rows returned and the performance, and the performance difference can be huge.

    The difference between UNION ALL and UNION is that the former simply concatenates the result sets, while the latter also removes duplicate rows, which typically requires a sort operation. As you may know, sorting is expensive, especially when the data set is very large. So consider whether you really need the duplicates removed before making the decision.
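
    The difference in the rows returned can be seen with a tiny example. The sketch below assumes two hypothetical tables, emp_2010 and emp_2011, with invented names in them, and uses Python's sqlite3 module in place of Oracle; the duplicate handling of the two operators is standard SQL.

```python
import sqlite3

# SQLite stands in for Oracle; both tables and their rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_2010 (ename TEXT)")
conn.execute("CREATE TABLE emp_2011 (ename TEXT)")
conn.executemany("INSERT INTO emp_2010 VALUES (?)", [("SMITH",), ("ALLEN",)])
conn.executemany("INSERT INTO emp_2011 VALUES (?)", [("ALLEN",), ("WARD",)])

# UNION ALL keeps every row from both inputs, duplicates included.
union_all_rows = conn.execute(
    "SELECT ename FROM emp_2010 UNION ALL SELECT ename FROM emp_2011"
).fetchall()

# UNION returns each distinct row once, which costs extra work.
union_rows = conn.execute(
    "SELECT ename FROM emp_2010 UNION SELECT ename FROM emp_2011"
).fetchall()

print(len(union_all_rows), len(union_rows))
```

    If the inputs cannot overlap, or duplicates are acceptable, UNION ALL gives the needed rows without the extra work.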

    6.3. Avoid DISTINCT operation using EXISTS or IN

    Sometimes you may need to use DISTINCT to remove duplicate records from the result set. DISTINCT needs an extra operation (typically a sort) to find the duplicates, so you may want to avoid it where you can.

    There are some circumstances under which DISTINCT can be avoided. See the example below,

    SELECT DISTINCT d.deptno, d.dname

    FROM dept d, emp e

    WHERE d.deptno = e.deptno;

    If you look carefully at this SQL statement, you will notice that the SELECT clause only references the table dept. It is the join between the tables dept and emp that produces the duplicate records. Under such circumstances, we can take advantage of EXISTS or IN to avoid the DISTINCT operation.

    SELECT d.deptno, d.dname

    FROM dept d

    WHERE EXISTS

    (SELECT NULL FROM emp e WHERE d.deptno = e.deptno);

    SELECT d.deptno, d.dname

    FROM dept d

    WHERE d.deptno IN (SELECT e.deptno FROM emp e);
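
    A quick way to check that the three statements are interchangeable is to run them against the same data. The sketch below does this with Python's sqlite3 module as a stand-in for Oracle and with invented dept and emp rows; it checks the results only and says nothing about the execution plans Oracle would choose.

```python
import sqlite3

# SQLite stands in for Oracle; the dept and emp rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER)")
conn.executemany(
    "INSERT INTO dept VALUES (?, ?)",
    [(10, "ACCOUNTING"), (20, "RESEARCH"), (40, "OPERATIONS")],
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, 20), (5, 20)],
)

queries = {
    "distinct": """SELECT DISTINCT d.deptno, d.dname
                   FROM dept d, emp e
                   WHERE d.deptno = e.deptno""",
    "exists": """SELECT d.deptno, d.dname
                 FROM dept d
                 WHERE EXISTS (SELECT NULL FROM emp e WHERE d.deptno = e.deptno)""",
    "in": """SELECT d.deptno, d.dname
             FROM dept d
             WHERE d.deptno IN (SELECT e.deptno FROM emp e)""",
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}
print(results)
```

    The rewrite is safe here because deptno identifies each dept row uniquely, so the only duplicates DISTINCT had to remove were the ones created by the join.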

    6.4. Choose between (NOT) EXISTS and (NOT) IN

    EXISTS and IN are usually used with a sub-query that probes the outer query. Most of the time, IN and EXISTS are interchangeable. Oracle recommends using the IN clause if the sub-query has the selective WHERE clause. If the parent query contains the selective WHERE clause, use the EXISTS clause rather than the IN clause.

    The most important difference between NOT EXISTS and NOT IN is that NOT IN will sometimes generate “unexpected” results.

    Take a look at the following example,

    SELECT e.empno, e.ename

    FROM emp e

    WHERE e.deptno

    NOT IN (SELECT d.deptno FROM dept d);

    This SQL statement retrieves all the employees whose department is not in the table dept. It works properly under most circumstances. However, if the table dept contains a NULL deptno, the result will be “surprising”: the query returns no rows at all. The reason is that NULL is not equal to anything, so the comparison of e.deptno with the NULL value is neither true nor false, and the NOT IN condition can never evaluate to true. To get the right result, turn to NOT EXISTS,

    SELECT e.empno, e.ename

    FROM emp e

    WHERE NOT EXISTS (SELECT NULL FROM dept d WHERE e.deptno = d.deptno);

    Besides, from the perspective of performance, NOT EXISTS is usually at least as good as NOT IN. So the recommendation is: don’t use NOT IN.
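
    The NULL pitfall is easy to reproduce. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, with an invented dept table that contains one NULL deptno and an invented employee in the non-existent department 50; the three-valued logic involved is standard SQL.

```python
import sqlite3

# SQLite stands in for Oracle; all rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER)")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?)", [(10,), (None,)])
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "SMITH", 10), (2, "ALLEN", 50)],
)

# ALLEN's department 50 is not in dept, but the NULL in dept makes the
# NOT IN condition unknown instead of true, so the row is filtered out.
not_in_rows = conn.execute("""
    SELECT e.empno, e.ename
    FROM emp e
    WHERE e.deptno NOT IN (SELECT d.deptno FROM dept d)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULL is harmless.
not_exists_rows = conn.execute("""
    SELECT e.empno, e.ename
    FROM emp e
    WHERE NOT EXISTS (SELECT NULL FROM dept d WHERE e.deptno = d.deptno)
""").fetchall()

print(not_in_rows, not_exists_rows)
```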

    6.5. Make the most of the CASE expression

    When you need to calculate multiple aggregates from the same table, avoid writing a separate query for each aggregate. With separate queries, Oracle has to read the entire table for each query. It’s more efficient to use a CASE expression here, as it lets you compute multiple aggregates with just a single read of the table.

    For example, the following three SQL statements count the employees in different salary ranges,

    SELECT COUNT(*)

    FROM emp

    WHERE sal < 2000;

    SELECT COUNT(*)

    FROM emp

    WHERE sal BETWEEN 2000 AND 4000;

    SELECT COUNT(*)

    FROM emp

    WHERE sal > 4000;

    The problem with this approach is that three table scans are performed instead of one. It is more efficient to compute all the counts in a single SQL statement, like below,

    SELECT

    COUNT(CASE WHEN sal < 2000 THEN 1 ELSE NULL END) count_1,

    COUNT(CASE WHEN sal BETWEEN 2000 AND 4000 THEN 1 ELSE NULL END) count_2,

    COUNT(CASE WHEN sal > 4000 THEN 1 ELSE NULL END) count_3

    FROM emp;
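
    To confirm that the single statement gives the same numbers as the three separate counts, both can be run on the same data. The sketch below uses Python's sqlite3 module as a stand-in for Oracle with invented salaries, deliberately including the boundary values 2000 and 4000 so that a mismatch between the range definitions would show up.

```python
import sqlite3

# SQLite stands in for Oracle; the salaries are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, 800), (2, 1600), (3, 2000), (4, 2975), (5, 4000), (6, 5000)],
)

# Three separate statements, each reading the table once.
separate = tuple(
    conn.execute("SELECT COUNT(*) FROM emp WHERE " + cond).fetchone()[0]
    for cond in ("sal < 2000", "sal BETWEEN 2000 AND 4000", "sal > 4000")
)

# One statement: COUNT ignores the NULLs produced by the ELSE branch.
combined = conn.execute("""
    SELECT
      COUNT(CASE WHEN sal < 2000 THEN 1 ELSE NULL END) count_1,
      COUNT(CASE WHEN sal BETWEEN 2000 AND 4000 THEN 1 ELSE NULL END) count_2,
      COUNT(CASE WHEN sal > 4000 THEN 1 ELSE NULL END) count_3
    FROM emp
""").fetchone()

print(separate, combined)
```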

    A CASE expression can also be used in an UPDATE statement to perform several conditional updates in one SQL statement, as below,

    UPDATE test_table

    SET dept =

    CASE

      WHEN dept='b' THEN 'c'

      WHEN dept='c' THEN 'b'

      ELSE dept

    END

    WHERE dept IN ('b','c');

    6.6. Use WITH clause to improve query performance

    If a lengthy query has multiple references to a single sub-query block, you can use the WITH clause to define the sub-query block once and reference it by name, which reduces the number of sub-query blocks and can improve performance.

    Take a look at the following example,

    SELECT channel_desc,channel_total
    FROM

    (SELECT c.channel_desc, SUM(s.amount_sold) AS channel_total

     FROM sales s, channels c

     WHERE s.channel_id = c.channel_id

     GROUP BY c.channel_desc

    )
    WHERE channel_total >
         (SELECT SUM(channel_total) * 1/3
          FROM (SELECT c.channel_desc, SUM(s.amount_sold) AS channel_total

               FROM sales s, channels c

               WHERE s.channel_id = c.channel_id

               GROUP BY c.channel_desc

          ));

    You can see that the above SQL statement contains the same sub-query block twice to calculate the total amount sold for each channel. In this case, you can use the WITH clause to factor out this sub-query block, like below,

    WITH channel_summary AS

    (SELECT c.channel_desc, SUM(s.amount_sold) AS channel_total

     FROM sales s, channels c

     WHERE s.channel_id = c.channel_id

     GROUP BY c.channel_desc)

    SELECT channel_desc,channel_total
    FROM channel_summary
    WHERE channel_total > (SELECT SUM(channel_total) * 1/3 FROM channel_summary);
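
    The factored query can be tried end to end with a small data set. The sketch below uses Python's sqlite3 module as a stand-in for Oracle; the channel names and sales amounts are invented, and the table layouts are assumptions reduced to the columns the query needs.

```python
import sqlite3

# SQLite stands in for Oracle; tables are trimmed down and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, channel_desc TEXT)")
conn.execute("CREATE TABLE sales (channel_id INTEGER, amount_sold INTEGER)")
conn.executemany(
    "INSERT INTO channels VALUES (?, ?)",
    [(1, "Direct"), (2, "Internet"), (3, "Partners")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 500), (1, 300), (2, 150), (3, 100), (3, 150)],
)

# channel_summary is defined once and referenced twice by name.
with_sql = """
    WITH channel_summary AS
      (SELECT c.channel_desc, SUM(s.amount_sold) AS channel_total
       FROM sales s, channels c
       WHERE s.channel_id = c.channel_id
       GROUP BY c.channel_desc)
    SELECT channel_desc, channel_total
    FROM channel_summary
    WHERE channel_total > (SELECT SUM(channel_total) * 1/3 FROM channel_summary)
"""

# Channel totals are 800, 150 and 250, so only channels above 400 qualify.
top_channels = conn.execute(with_sql).fetchall()
print(top_channels)
```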




    --------------------------------------
    Regards,
    FangwenYu
  • Original source: https://www.cnblogs.com/fangwenyu/p/1897396.html