zoukankan      html  css  js  c++  java
  • postgresql 10 的并行(parallel)简介

    os:centos 6.8
    db:postgresql 10.3

    postgresql 10 只支持 range、list 分区。

    select version()
    
    PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit

    涉及到参数

    select *
    from pg_settings ps
    where 1=1
    and ps.name like '%dynamic_shared_memory_type%'
    ;
    
    select *
    from pg_settings ps
    where 1=1
    and ps.name in (
    'force_parallel_mode',
    'max_worker_processes',
    'max_parallel_workers',
    'max_parallel_workers_per_gather',
    'min_parallel_relation_size',
    'min_parallel_index_scan_size',
    'min_parallel_table_scan_size',
    'parallel_tuple_cost',
    'parallel_setup_cost'
    )
    ;
                  name               | setting | unit |                category                |                                             short_desc                                             |                                                                extra_desc                                                                 |  context   | vartype | source  | min_val |   max_val    |     enumvals     | boot_val | reset_val | sourcefile | sourceline | pending_restart 
    ---------------------------------+---------+------+----------------------------------------+----------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+------------+---------+---------+---------+--------------+------------------+----------+-----------+------------+------------+-----------------
     force_parallel_mode             | off     |      | Query Tuning / Other Planner Options   | Forces use of parallel query facilities.                                                           | If possible, run query using a parallel worker and with parallel restrictions.                                                            | user       | enum    | default |         |              | {off,on,regress} | off      | off       |            |            | f
     max_parallel_workers            | 8       |      | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel workers than can be active at one time.                        |                                                                                                                                           | user       | integer | default | 0       | 1024         |                  | 8        | 8         |            |            | f
     max_parallel_workers_per_gather | 2       |      | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per executor node.                                   |                                                                                                                                           | user       | integer | default | 0       | 1024         |                  | 2        | 2         |            |            | f
     max_worker_processes            | 8       |      | Resource Usage / Asynchronous Behavior | Maximum number of concurrent worker processes.                                                     |                                                                                                                                           | postmaster | integer | default | 0       | 262143       |                  | 8        | 8         |            |            | f
     min_parallel_index_scan_size    | 64      | 8kB  | Query Tuning / Planner Cost Constants  | Sets the minimum amount of index data for a parallel scan.                                         | If the planner estimates that it will read a number of index pages too small to reach this limit, a parallel scan will not be considered. | user       | integer | default | 0       | 715827882    |                  | 64       | 64        |            |            | f
     min_parallel_table_scan_size    | 1024    | 8kB  | Query Tuning / Planner Cost Constants  | Sets the minimum amount of table data for a parallel scan.                                         | If the planner estimates that it will read a number of table pages too small to reach this limit, a parallel scan will not be considered. | user       | integer | default | 0       | 715827882    |                  | 1024     | 1024      |            |            | f
     parallel_setup_cost             | 1000    |      | Query Tuning / Planner Cost Constants  | Sets the planner's estimate of the cost of starting up worker processes for parallel query.        |                                                                                                                                           | user       | real    | default | 0       | 1.79769e+308 |                  | 1000     | 1000      |            |            | f
     parallel_tuple_cost             | 0.1     |      | Query Tuning / Planner Cost Constants  | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend. |                                                                                                                                           | user       | real    | default | 0       | 1.79769e+308 |                  | 0.1      | 0.1       |            |            | f
    (8 rows)

    相比postgresql 9.6
    去掉了如下参数
    min_parallel_relation_size

    增加了如下参数
    max_parallel_workers
    min_parallel_index_scan_size
    min_parallel_table_scan_size

    简单测试

    create table tmp_t0  (
     c1 int8,
     c2 int8,
     c3 text,
     c4 text
    )
    ;

    生成1000w条数据

    insert into tmp_t0
    SELECT (random()*(2*10^9))::int8,
           (random()*(2*10^9))::int8,
           md5(random()::text),
           md5(random()::text)
      from generate_series(1,8000000)
     where 1=1
     ;
    select pg_size_pretty(pg_relation_size('tmp_t0'))
    
     pg_size_pretty 
    ----------------
     1116 MB
    (1 row)

    系统默认启用并行

    explain 
    select count(1)
    from tmp_t0
    where 1=1
    ;
    Finalize Aggregate  (cost=195941.86..195941.87 rows=1 width=8)
     ->  Gather  (cost=195941.65..195941.86 rows=2 width=8)
           Workers Planned: 2
           ->  Partial Aggregate  (cost=194941.65..194941.66 rows=1 width=8)
                 ->  Parallel Seq Scan on tmp_t0 (cost=0.00..184524.92 rows=4166692 width=0)

    禁止并行

    set max_parallel_workers_per_gather =0;
    
    explain 
    select count(1)
    from tmp_t0
    where 1=1
    ;
    
    Aggregate  (cost=267858.75..267858.76 rows=1 width=8)
     ->  Seq Scan on tmp_t0  (cost=0.00..242858.60 rows=10000060 width=0)

    session 级别的并行控制

     set max_parallel_workers=4;  
     set max_parallel_workers_per_gather =4;  
     set parallel_setup_cost =0;  
     set parallel_tuple_cost =0;  
    
     alter table test set (parallel_workers =4);  

    参考:
    http://postgres.cn/docs/10/parallel-query.html

  • 相关阅读:
    java编辑器eclipse如何更改jdk版本
    java 获取实体类对象属性值的方法
    java 时间处理
    java file的一些方法
    java中的包装类与装箱拆箱定义
    java中的分支
    java中的循环
    冒泡排序法
    HDFS数据迁移解决方案之DistCp工具的巧妙使用
    HDFS数据迁移解决方案之DistCp工具的巧妙使用
  • 原文地址:https://www.cnblogs.com/ctypyb2002/p/9793034.html
Copyright © 2011-2022 走看看