  • Indexing GROUP BY

    SQL databases use two entirely different group by algorithms. The first one, the hash algorithm, aggregates the input records in a temporary hash table. Once all input records are processed, the hash table is returned as the result. The second algorithm, the sort/group algorithm, first sorts the input data by the grouping key so that the rows of each group follow each other in immediate succession. Afterwards, the database just needs to aggregate them. In general, both algorithms need to materialize an intermediate state, so they are not executed in a pipelined manner. Nevertheless, the sort/group algorithm can use an index to avoid the sort operation, thus enabling a pipelined group by.
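
    The two algorithms described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular database implements it; the sample rows and the function names hash_group_by and sort_group_by are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (product_id, eur_value), in no particular order.
rows = [(2, 10.0), (1, 5.0), (2, 2.5), (3, 7.0), (1, 1.0)]


def hash_group_by(rows):
    """Hash algorithm: aggregate every input row into a temporary hash table."""
    totals = {}
    for product_id, eur_value in rows:
        totals[product_id] = totals.get(product_id, 0.0) + eur_value
    # The table is complete only after the last input row was processed.
    return totals


def sort_group_by(rows):
    """Sort/group algorithm: sort by the grouping key, then aggregate each run."""
    ordered = sorted(rows, key=itemgetter(0))  # the explicit sort materializes the input
    return {
        product_id: sum(value for _, value in group)
        for product_id, group in groupby(ordered, key=itemgetter(0))
    }


hash_result = hash_group_by(rows)
sort_result = sort_group_by(rows)
```

    Both functions return the same totals per product. The difference is in the intermediate state: the first keeps one entry per group, the second keeps a sorted copy of the whole input.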

     
     
    Consider the following query. It delivers yesterday's revenue grouped by PRODUCT_ID:
    SELECT product_id, sum(eur_value)
      FROM sales
     WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY
     GROUP BY product_id

    Given the index on SALE_DATE and PRODUCT_ID from the previous section, the sort/group algorithm is more appropriate because an INDEX RANGE SCAN automatically delivers the rows in the required order. That means the database avoids materialization because it does not need an explicit sort operation: the group by is executed in a pipelined manner.

    Oracle:
    ---------------------------------------------------------------
    |Id |Operation                    | Name        | Rows | Cost |
    ---------------------------------------------------------------
    | 0 |SELECT STATEMENT             |             |   17 |  192 |
    | 1 | SORT GROUP BY NOSORT        |             |   17 |  192 |
    | 2 |  TABLE ACCESS BY INDEX ROWID| SALES       |  321 |  192 |
    |*3 |   INDEX RANGE SCAN          | SALES_DT_PR |  321 |    3 |
    ---------------------------------------------------------------
    The Oracle database's execution plan marks a pipelined SORT GROUP BY operation with the NOSORT addendum. The execution plans of other databases do not mention any sort operation at all.
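
    What "pipelined" means for the group by above can be illustrated with Python generators. The sketch below assumes the rows arrive already ordered by PRODUCT_ID, as an index range scan with an equality condition on SALE_DATE would deliver them. The data, the fetch counter and the function names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical index entries for a single SALE_DATE, already ordered by
# PRODUCT_ID: (product_id, eur_value).
index_rows = [(1, 5.0), (1, 1.0), (2, 10.0), (2, 2.5), (3, 7.0)]

stats = {"consumed": 0}  # counts how many input rows were fetched so far


def range_scan(rows):
    """Yield rows one at a time and count each fetch."""
    for row in rows:
        stats["consumed"] += 1
        yield row


def pipelined_group_by(ordered_rows):
    """Aggregate each run of equal keys and emit it as soon as the run ends."""
    for product_id, group in groupby(ordered_rows, key=itemgetter(0)):
        yield product_id, sum(value for _, value in group)


result = pipelined_group_by(range_scan(index_rows))
first_group = next(result)                    # first aggregated row
rows_read_for_first_group = stats["consumed"]  # input fetched up to that point
remaining_groups = list(result)
```

    In this sketch the first group is returned after three of the five input rows were fetched: the two rows of the group plus the one row that signals the next group has started. No sort and no buffer for the complete input are needed.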
     
    The pipelined group by has the same prerequisites as the pipelined order by, except there are no ASC and DESC modifiers. That means that defining an index with ASC/DESC modifiers should not affect pipelined group by execution. The same is true for NULLS FIRST/LAST. Nevertheless there are databases that cannot properly use an ASC/DESC index for a pipelined group by.
     
    For PostgreSQL, you must add an order by clause to make an index with NULLS LAST sorting usable for a pipelined group by. The Oracle database cannot read an index backwards in order to execute a pipelined group by that is followed by an order by. More details are available in the respective appendices: PostgreSQL, Oracle.

    If we extend the query to consider all sales since yesterday, as we did in the example for the pipelined order by, it prevents the pipelined group by for the same reason as before: the INDEX RANGE SCAN does not deliver the rows ordered by the grouping key.

    SELECT product_id, sum(eur_value)
      FROM sales
     WHERE sale_date >= TRUNC(sysdate) - INTERVAL '1' DAY
     GROUP BY product_id
    Oracle:
    ---------------------------------------------------------------
    |Id |Operation                    | Name        | Rows | Cost |
    ---------------------------------------------------------------
    | 0 |SELECT STATEMENT             |             |   24 |  356 |
    | 1 | HASH GROUP BY               |             |   24 |  356 |
    | 2 |  TABLE ACCESS BY INDEX ROWID| SALES       |  596 |  355 |
    |*3 |   INDEX RANGE SCAN          | SALES_DT_PR |  596 |    4 |
    ---------------------------------------------------------------
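
    Why the range condition breaks the required order can be seen with a small Python sketch. It models an index on (SALE_DATE, PRODUCT_ID) as a sorted list of tuples; the dates, product IDs and helper names are made up for the illustration.

```python
from datetime import date

# Hypothetical entries of an index on (SALE_DATE, PRODUCT_ID). sorted() puts
# them in index order: by date first, then by product within each date.
index_entries = sorted([
    (date(2024, 1, 2), 1),
    (date(2024, 1, 2), 3),
    (date(2024, 1, 1), 2),
    (date(2024, 1, 1), 3),
    (date(2024, 1, 1), 1),
])


def product_ids(entries, predicate):
    """Return the PRODUCT_ID values, in index order, of the matching entries."""
    return [pid for sale_date, pid in entries if predicate(sale_date)]


def is_sorted(values):
    """True if the values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


yesterday = date(2024, 1, 1)  # stand-in for TRUNC(sysdate) - INTERVAL '1' DAY
equality_scan = product_ids(index_entries, lambda d: d == yesterday)
range_scan = product_ids(index_entries, lambda d: d >= yesterday)
```

    With the equality condition the product IDs come out as 1, 2, 3. With the range condition they come out as 1, 2, 3, 1, 3: ordered within each day, but not across days, so the rows of one group no longer follow each other.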

    Instead, the Oracle database uses the hash algorithm. The advantage of the hash algorithm is that it only needs to buffer the aggregated result, whereas the sort/group algorithm materializes the complete input set. In other words: the hash algorithm needs less memory.
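
    The memory argument can be made concrete by counting what each algorithm has to hold. The Python sketch below is a rough model under assumed data (10,000 rows spread over 20 products); it counts buffered entries rather than bytes, and the function names are hypothetical.

```python
# Hypothetical input: 10,000 rows over 20 products, not ordered by product.
rows = [(i % 20, 1.0) for i in range(10_000)]


def hash_buffer_entries(rows):
    """Hash algorithm: the buffer holds one running total per group."""
    totals = {}
    for product_id, eur_value in rows:
        totals[product_id] = totals.get(product_id, 0.0) + eur_value
    return len(totals)


def sort_buffer_entries(rows):
    """Sort/group algorithm without an index: the sort holds every input row."""
    materialized = sorted(rows, key=lambda row: row[0])
    return len(materialized)


hash_entries = hash_buffer_entries(rows)
sort_entries = sort_buffer_entries(rows)
```

    Here the hash table ends up with 20 entries, while the sort has to hold all 10,000 rows before the first group can be aggregated.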

    As with pipelined order by, a fast execution is not the most important aspect of the pipelined group by execution. It is more important that the database executes it in a pipelined manner and delivers the first result before reading the entire input. 

    Reference:

    http://use-the-index-luke.com/sql/sorting-grouping/indexed-group-by

  • Original article: https://www.cnblogs.com/xiaotengyi/p/7229262.html