zoukankan      html  css  js  c++  java
  • 初试PL/SQL并行编程

    -----------------------------Cryking原创------------------------------
    -----------------------转载请注明出处,谢谢!------------------------ 

    并行处理能大大提高数据的处理速度,它依赖于硬件资源、网络资源等环境.

    并行处理的硬件资源环境分为5大类:

    1.传统的单台计算机、单处理器、单内核的机器.(无法进行并行处理,但是并行程序还是可以运行的,此时就和普通程序一样了)

    2.单台计算机、单处理器、多内核的机器.(目前本人的测试环境,没钱买多处理器的快哭了)

    3.单台计算机、多处理器、多内核的机器.

    4.集群系统

    5.分布式系统

    PS:如何查看几个处理器、几核,常用的软件就是CPU-Z.

    本人的测试机器CPU:

    单CPU,双核心处理器

    下面开始,编写一个普通的管道函数(这里演示的就是管道函数的并行处理):

    CREATE OR REPLACE FUNCTION pipe_test
    (
      c_empno SYS_REFCURSOR
     ,p       VARCHAR2 DEFAULT ','
    ) RETURN EMP_element
      PIPELINED IS
      v_element VARCHAR2(1000);
    BEGIN
      FETCH c_empno
        INTO v_element;
      LOOP
        EXIT WHEN c_empno%NOTFOUND;
        PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1));
        LOOP
          v_element := substr(v_element
                             ,instr(v_element, p) + 1
                             ,length(v_element) - instr(v_element, p) + 1);
          EXIT WHEN instr(v_element, p) = 0;
          PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1));
        END LOOP;
        PIPE ROW(v_element);
        FETCH c_empno
          INTO v_element;
      END LOOP;
      RETURN;
    END pipe_test;



    注意需要先建立类型EMP_element,如下:

    CREATE OR REPLACE TYPE EMP_element as table of varchar2(100);

    然后我们建立一个并行处理的管道函数:

    CREATE OR REPLACE FUNCTION parallel_test
    (
      p_empno SYS_REFCURSOR
     ,p       VARCHAR2 DEFAULT ','
    ) RETURN EMP_element
      PIPELINED
      PARALLEL_ENABLE(PARTITION p_empno BY ANY) IS
      v_element VARCHAR2(1000);
    BEGIN
      FETCH p_empno
        INTO v_element;
      LOOP
        EXIT WHEN p_empno%NOTFOUND;
        PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1));
        LOOP
          v_element := substr(v_element
                             ,instr(v_element, p) + 1
                             ,length(v_element) - instr(v_element, p) + 1);
          EXIT WHEN instr(v_element, p) = 0;
          PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1));
        END LOOP;
        PIPE ROW(v_element);
        FETCH p_empno
          INTO v_element;
      END LOOP;
      RETURN;
    END parallel_test;


    留意下启用并行的关键字PARALLEL_ENABLE.

    接下来我们构造大表进行测试,根据v$pq_sesstat视图的结果,我们来判断数据库系统是否真的进行了并行处理.

    12:57:52 SYS@orcl> create table big_data as select 'Cc' a,'Dd' b from dual connect by level<1000000;
    
    
    表已创建。
    
    已用时间:  00: 00: 01.92
    12:59:00 SYS@orcl> select count(*) from big_data;
    
      COUNT(*)
    ----------
        999999
    
    已选择 1 行。
    
    已用时间:  00: 00: 00.31
    13:47:08 SYS@orcl> set autot trace exp stat
    --普通管道函数耗时40.48秒
    13:48:42 SYS@orcl> select SCOTT.pipe_test(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;
    
    已选择 1 行。
    
    已用时间:  00: 00: 40.48
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 2452824241
    
    ------------------------------------------------------------------------------------------------
    | Id  | Operation                          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                   |           |     1 |       |     2   (0)| 00:00:01 |
    |   1 |  VIEW                              |           |  8168 |   414K|    29   (0)| 00:00:01 |
    |   2 |   COLLECTION ITERATOR PICKLER FETCH| PIPE_TEST |  8168 |       |    29   (0)| 00:00:01 |
    |   3 |  FAST DUAL                         |           |     1 |       |     2   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    
    统计信息
    ----------------------------------------------------------
        1000515  recursive calls
              0  db block gets
        1000325  consistent gets
           1548  physical reads
              0  redo size
        6027769  bytes sent via SQL*Net to client
           1214  bytes received via SQL*Net from client
             10  SQL*Net roundtrips to/from client
              7  sorts (memory)
              0  sorts (disk)
              1  rows processed
    
    13:53:24 SYS@orcl> alter system flush shared_pool;
    
    系统已更改。
    
    已用时间:  00: 00: 01.43
    --使用启用并行的管道函数,耗时40.32,与普通的管道函数耗时差不多
    --奇怪的是执行计划了多了一个BIG_DATA的全表扫描操作了
    13:53:26 SYS@orcl> select SCOTT.parallel_test(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;
    
    已选择 1 行。
    
    已用时间:  00: 00: 40.32
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 1682567826
    
    ----------------------------------------------------------------------------------------------------
    
    | Id  | Operation                          | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    
    ----------------------------------------------------------------------------------------------------
    
    |   0 | SELECT STATEMENT                   |               |     1 |       |     2   (0)| 00:00:01 |
    
    |   1 |  VIEW                              |               |  8168 |   414K|    29   (0)| 00:00:01 |
    
    |   2 |   COLLECTION ITERATOR PICKLER FETCH| PARALLEL_TEST |  8168 |       |    29   (0)| 00:00:01 |
    
    |   3 |    TABLE ACCESS FULL               | BIG_DATA      |   968K|  7565K|   422   (3)| 00:00:06 |
    
    |   4 |  FAST DUAL                         |               |     1 |       |     2   (0)| 00:00:01 |
    
    ----------------------------------------------------------------------------------------------------
    
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    
    统计信息
    ----------------------------------------------------------
        1000867  recursive calls
              0  db block gets
        1000353  consistent gets
           1560  physical reads
              0  redo size
        6026847  bytes sent via SQL*Net to client
           1006  bytes received via SQL*Net from client
              8  SQL*Net roundtrips to/from client
              9  sorts (memory)
              0  sorts (disk)
              1  rows processed
    --通过v$pq_sesstat视图我们发现此时系统并没有进行并行处理,(其实通过执行计划也可看出没有使用并行)
    13:54:21 SYS@orcl> select * from v$pq_sesstat;
    
    STATISTIC                      LAST_QUERY SESSION_TOTAL
    ------------------------------ ---------- -------------
    Queries Parallelized                    0             0
    DML Parallelized                        0             0
    DDL Parallelized                        0             0
    DFO Trees                               0             0
    Server Threads                          0             0
    Allocation Height                       0             0
    Allocation Width                        0             0
    Local Msgs Sent                         0             0
    Distr Msgs Sent                         0             0
    Local Msgs Recv'd                       0             0
    Distr Msgs Recv'd                       0             0
    
    已选择11行。
    
    已用时间:  00: 00: 00.00
    		  
    13:54:40 SYS@orcl> alter system flush buffer_cache;
    
    系统已更改。
    
    已用时间:  00: 00: 00.00
    13:55:27 SYS@orcl> alter system flush shared_pool;
    
    系统已更改。
    
    已用时间:  00: 00: 01.40
    --接下来我们加HINT,强制并行处理,此时耗时25.03秒,降低了一半
    13:55:30 SYS@orcl> select /*+ parallel */scott.PARALLEL_TEST(CURSOR(SELECT a||','||b from big_data))
     FROM DUAL;
    
    已选择 1 行。
    
    已用时间:  00: 00: 25.03
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 3127237831
    
    -----------------------------------------------------------------------------------------------------------------------------------
    
    | Id  | Operation                            | Name          | Rows  | Bytes | Cost (%CPU)| Time |    TQ  |IN-OUT| PQ Distrib |
    
    -----------------------------------------------------------------------------------------------------------------------------------
    
    |   0 | SELECT STATEMENT                     |               |     1 |       |     2   (0)| 00:00:01 |        |      |            |
    
    |   1 |  PX COORDINATOR                      |               |       |       |            | |        |      |            |
    
    |   2 |   PX SEND QC (RANDOM)                | :TQ10000      |  8168 |   414K|    16   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
    
    |   3 |    VIEW                              |               |  8168 |   414K|    16   (0)| 00:00:01 |  Q1,00 | PCWP |            |
    
    |   4 |     COLLECTION ITERATOR PICKLER FETCH| PARALLEL_TEST |  8168 |       |    16   (0)| 00:00:01 |  Q1,00 | PCWP |            |
    
    |   5 |      PX BLOCK ITERATOR               |               |   968K|  7565K|   234   (3)| 00:00:03 |  Q1,00 | PCWC |            |
    
    |   6 |       TABLE ACCESS FULL              | BIG_DATA      |   968K|  7565K|   234   (3)| 00:00:03 |  Q1,00 | PCWP |            |
    
    |   7 |  FAST DUAL                           |               |     1 |       |     2   (0)| 00:00:01 |        |      |            |
    
    -----------------------------------------------------------------------------------------------------------------------------------
    
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
       - automatic DOP: Computed Degree of Parallelism is 2
    
    
    统计信息
    ----------------------------------------------------------
        1000941  recursive calls
              4  db block gets
        1000516  consistent gets
           1842  physical reads
              0  redo size
        6026847  bytes sent via SQL*Net to client
           1006  bytes received via SQL*Net from client
              8  SQL*Net roundtrips to/from client
              9  sorts (memory)
              0  sorts (disk)
              1  rows processed
    --上面的执行计划及v$pq_sesstat都显示系统启用了并行处理,并行度为2
    13:57:44 SYS@orcl> select * from v$pq_sesstat;
    
    STATISTIC                      LAST_QUERY SESSION_TOTAL
    ------------------------------ ---------- -------------
    Queries Parallelized                    0             1
    DML Parallelized                        0             0
    DDL Parallelized                        0             0
    DFO Trees                               0             1
    Server Threads                          0             0
    Allocation Height                       0             0
    Allocation Width                        0             0
    Local Msgs Sent                         0           553
    Distr Msgs Sent                         0             0
    Local Msgs Recv'd                       0           553
    Distr Msgs Recv'd                       0             0
    
    已选择11行。
    
    已用时间:  00: 00: 00.01

    为了使示例更加有说服力,我们再来试一下普通管道函数+HINT强制并行看看:

    16:02:36 SYS@orcl> select /*+ parallel */SCOTT.pipe_test(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;
    
    已选择 1 行。
    
    已用时间:  00: 00: 43.34
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 2452824241
    
    ------------------------------------------------------------------------------------------------
    | Id  | Operation                          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                   |           |     1 |       |     2   (0)| 00:00:01 |
    |   1 |  VIEW                              |           |  8168 |   414K|    16   (0)| 00:00:01 |
    |   2 |   COLLECTION ITERATOR PICKLER FETCH| PIPE_TEST |  8168 |       |    16   (0)| 00:00:01 |
    |   3 |  FAST DUAL                         |           |     1 |       |     2   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
       - automatic DOP: Computed Degree of Parallelism is 1
    
    
    统计信息
    ----------------------------------------------------------
        1003093  recursive calls
             18  db block gets
        1000973  consistent gets
           1628  physical reads
              0  redo size
        6026843  bytes sent via SQL*Net to client
           1006  bytes received via SQL*Net from client
              8  SQL*Net roundtrips to/from client
            100  sorts (memory)
              0  sorts (disk)
              1  rows processed
    

    由于普通函数不支持并行,所以即使+HINT强制并行,系统还是没有使用并行处理.耗时依然为40多秒.

    如果不使用函数,使用普通的SQL,又是可以使用HINT强制并行的,如下:

    15:59:07 SYS@orcl> select /*+ parallel */ a||','||b from big_data;
    
    已选择999999行。
    
    已用时间:  00: 00: 04.09
    
    执行计划
    ----------------------------------------------------------
    Plan hash value: 2638980575
    
    --------------------------------------------------------------------------------------------------------------
    | Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
    --------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |          |   968K|  7565K|   234   (3)| 00:00:03 |        |      |         |
    |   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |         |
    |   2 |   PX SEND QC (RANDOM)| :TQ10000 |   968K|  7565K|   234   (3)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
    |   3 |    PX BLOCK ITERATOR |          |   968K|  7565K|   234   (3)| 00:00:03 |  Q1,00 | PCWC |         |
    |   4 |     TABLE ACCESS FULL| BIG_DATA |   968K|  7565K|   234   (3)| 00:00:03 |  Q1,00 | PCWP |         |
    --------------------------------------------------------------------------------------------------------------
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
       - automatic DOP: Computed Degree of Parallelism is 2
    
    
    统计信息
    ----------------------------------------------------------
              6  recursive calls
              0  db block gets
           1577  consistent gets
           1520  physical reads
              0  redo size
       13467035  bytes sent via SQL*Net to client
         733742  bytes received via SQL*Net from client
          66668  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
         999999  rows processed
    





    通过上面的示例可以看到,并行处理可以大大加速数据的处理,几乎成倍的提升性能.

    但并不是并行度为几,就会提示几倍的性能,它受硬件环境,优化器内部算法,PL/SQL引擎解析等等的约束.

    总之,并行处理给我们带来的性能提升是很大的,所以在某些场景使用并行编程也是很必要的.

  • 相关阅读:
    超值干货:微服务架构下如何解耦,对于已经紧耦合下如何重构?
    程序员收藏不看系列:近三万字总结Spring注解开发!
    干货收藏:6 款能挣钱的 Spring Boot 开源后台管理系统
    美团二面:你向 Mysql 数据库插入 100w 条数据用了多久?
    5分钟快速掌握阿里内部MySQL性能优化的核心技术!
    优秀!一鼓作气学会“一致性哈希”,就靠这 18 张图了
    分库分表神器 Sharding-JDBC,几千万的数据你不搞一下?
    熬夜肝出5大点,18张图带你彻底弄懂MySQL事务日志
    jdk8新特性Stream
    java多线程
  • 原文地址:https://www.cnblogs.com/snake-hand/p/3159640.html
Copyright © 2011-2022 走看看