  • What does PostgreSQL cache on Linux?

    When running a pgbench benchmark, looking only at tps is of limited value if you don't know whether the data was read from physical disk or hit in the shared buffer cache, the filesystem cache, or the storage cache. The basic metrics Linux provides are easy to misread, and many people look at them without knowing what they actually measure.

     

    Download and install PostgreSQL 11.12:

    wget https://ftp.postgresql.org/pub/source/v11.12/postgresql-11.12.tar.gz
    tar xzf postgresql-11.12.tar.gz
    cd postgresql-11.12
    ./configure --enable-debug
    make
    make install
    adduser postgres
    mkdir /usr/local/pgsql/data
    chown postgres /usr/local/pgsql/data
    su - postgres
    /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
    /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
    

     

    Download and install linux-ftools:

    The fincore tool provided by linux-ftools shows how files are mapped into the filesystem cache.

    wget https://download.opensuse.org/repositories/home:/bmanojlovic/RHEL_7/x86_64/linux-ftools-1.3-5.1.x86_64.rpm
    rpm -ivh linux-ftools-1.3-5.1.x86_64.rpm
    

      

    Install the extension

    The pg_prewarm extension can quickly load a table into cache.

    cd contrib 
    make && make install
    /usr/local/pgsql/bin/psql postgres <<<" create extension pg_prewarm; "
    

      

    Prepare the data, with scale factor 64:

    $ /usr/local/pgsql/bin/pgbench -i -s 64 postgres
    dropping old tables...
    NOTICE:  table "pgbench_accounts" does not exist, skipping
    NOTICE:  table "pgbench_branches" does not exist, skipping
    NOTICE:  table "pgbench_history" does not exist, skipping
    NOTICE:  table "pgbench_tellers" does not exist, skipping
    creating tables...
    generating data...
    100000 of 6400000 tuples (1%) done (elapsed 0.08 s, remaining 5.25 s)
    200000 of 6400000 tuples (3%) done (elapsed 0.22 s, remaining 6.70 s)
    300000 of 6400000 tuples (4%) done (elapsed 0.36 s, remaining 7.22 s)
    400000 of 6400000 tuples (6%) done (elapsed 0.46 s, remaining 6.86 s)
    500000 of 6400000 tuples (7%) done (elapsed 0.60 s, remaining 7.03 s)
    600000 of 6400000 tuples (9%) done (elapsed 0.72 s, remaining 6.99 s)
    700000 of 6400000 tuples (10%) done (elapsed 0.81 s, remaining 6.63 s)
    800000 of 6400000 tuples (12%) done (elapsed 0.94 s, remaining 6.59 s)
    900000 of 6400000 tuples (14%) done (elapsed 1.04 s, remaining 6.35 s)
    1000000 of 6400000 tuples (15%) done (elapsed 1.16 s, remaining 6.25 s)
    1100000 of 6400000 tuples (17%) done (elapsed 1.30 s, remaining 6.27 s)
    1200000 of 6400000 tuples (18%) done (elapsed 1.40 s, remaining 6.06 s)
    1300000 of 6400000 tuples (20%) done (elapsed 1.52 s, remaining 5.98 s)
    1400000 of 6400000 tuples (21%) done (elapsed 1.67 s, remaining 5.96 s)
    1500000 of 6400000 tuples (23%) done (elapsed 1.76 s, remaining 5.76 s)
    1600000 of 6400000 tuples (25%) done (elapsed 1.89 s, remaining 5.67 s)
    1700000 of 6400000 tuples (26%) done (elapsed 2.05 s, remaining 5.66 s)
    1800000 of 6400000 tuples (28%) done (elapsed 2.13 s, remaining 5.45 s)
    1900000 of 6400000 tuples (29%) done (elapsed 2.27 s, remaining 5.38 s)
    2000000 of 6400000 tuples (31%) done (elapsed 2.41 s, remaining 5.30 s)
    2100000 of 6400000 tuples (32%) done (elapsed 2.50 s, remaining 5.12 s)
    2200000 of 6400000 tuples (34%) done (elapsed 2.63 s, remaining 5.02 s)
    2300000 of 6400000 tuples (35%) done (elapsed 2.73 s, remaining 4.87 s)
    2400000 of 6400000 tuples (37%) done (elapsed 2.85 s, remaining 4.76 s)
    2500000 of 6400000 tuples (39%) done (elapsed 2.99 s, remaining 4.66 s)
    2600000 of 6400000 tuples (40%) done (elapsed 3.09 s, remaining 4.51 s)
    2700000 of 6400000 tuples (42%) done (elapsed 3.20 s, remaining 4.38 s)
    2800000 of 6400000 tuples (43%) done (elapsed 3.34 s, remaining 4.30 s)
    2900000 of 6400000 tuples (45%) done (elapsed 3.43 s, remaining 4.13 s)
    3000000 of 6400000 tuples (46%) done (elapsed 3.56 s, remaining 4.04 s)
    3100000 of 6400000 tuples (48%) done (elapsed 3.64 s, remaining 3.88 s)
    3200000 of 6400000 tuples (50%) done (elapsed 3.78 s, remaining 3.78 s)
    3300000 of 6400000 tuples (51%) done (elapsed 3.91 s, remaining 3.67 s)
    3400000 of 6400000 tuples (53%) done (elapsed 4.01 s, remaining 3.54 s)
    3500000 of 6400000 tuples (54%) done (elapsed 4.14 s, remaining 3.43 s)
    3600000 of 6400000 tuples (56%) done (elapsed 4.28 s, remaining 3.33 s)
    3700000 of 6400000 tuples (57%) done (elapsed 4.38 s, remaining 3.20 s)
    3800000 of 6400000 tuples (59%) done (elapsed 4.50 s, remaining 3.08 s)
    3900000 of 6400000 tuples (60%) done (elapsed 4.65 s, remaining 2.98 s)
    4000000 of 6400000 tuples (62%) done (elapsed 4.75 s, remaining 2.85 s)
    4100000 of 6400000 tuples (64%) done (elapsed 4.89 s, remaining 2.74 s)
    4200000 of 6400000 tuples (65%) done (elapsed 4.99 s, remaining 2.61 s)
    4300000 of 6400000 tuples (67%) done (elapsed 5.14 s, remaining 2.51 s)
    4400000 of 6400000 tuples (68%) done (elapsed 5.29 s, remaining 2.40 s)
    4500000 of 6400000 tuples (70%) done (elapsed 5.43 s, remaining 2.29 s)
    4600000 of 6400000 tuples (71%) done (elapsed 5.53 s, remaining 2.16 s)
    4700000 of 6400000 tuples (73%) done (elapsed 5.67 s, remaining 2.05 s)
    4800000 of 6400000 tuples (75%) done (elapsed 5.80 s, remaining 1.93 s)
    4900000 of 6400000 tuples (76%) done (elapsed 5.89 s, remaining 1.80 s)
    5000000 of 6400000 tuples (78%) done (elapsed 6.02 s, remaining 1.69 s)
    5100000 of 6400000 tuples (79%) done (elapsed 6.12 s, remaining 1.56 s)
    5200000 of 6400000 tuples (81%) done (elapsed 6.24 s, remaining 1.44 s)
    5300000 of 6400000 tuples (82%) done (elapsed 6.37 s, remaining 1.32 s)
    5400000 of 6400000 tuples (84%) done (elapsed 6.47 s, remaining 1.20 s)
    5500000 of 6400000 tuples (85%) done (elapsed 6.59 s, remaining 1.08 s)
    5600000 of 6400000 tuples (87%) done (elapsed 6.68 s, remaining 0.95 s)
    5700000 of 6400000 tuples (89%) done (elapsed 6.82 s, remaining 0.84 s)
    5800000 of 6400000 tuples (90%) done (elapsed 6.96 s, remaining 0.72 s)
    5900000 of 6400000 tuples (92%) done (elapsed 7.10 s, remaining 0.60 s)
    6000000 of 6400000 tuples (93%) done (elapsed 7.20 s, remaining 0.48 s)
    6100000 of 6400000 tuples (95%) done (elapsed 7.33 s, remaining 0.36 s)
    6200000 of 6400000 tuples (96%) done (elapsed 7.42 s, remaining 0.24 s)
    6300000 of 6400000 tuples (98%) done (elapsed 7.54 s, remaining 0.12 s)
    6400000 of 6400000 tuples (100%) done (elapsed 7.69 s, remaining 0.00 s)
    vacuuming...
    creating primary keys...
    done.
    
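    A quick sanity check on those numbers: pgbench generates 100,000 rows in pgbench_accounts per scale unit, so scale 64 should give exactly the 6,400,000 tuples reported in the log above. A minimal sketch:

```shell
# pgbench_accounts holds 100,000 rows per scale unit
scale=64
rows=$((scale * 100000))
echo "pgbench_accounts rows: ${rows}"   # 6400000, matching the generation log
```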

    About 1.8 GB:

    $ du -sh data/
    1.8G    data/
    $ grep ^shared data/postgresql.conf
    shared_buffers = 2048MB 
    

    Huge pages are not enabled here, so all the statistics free shows are about regular (small) pages.

     

    To start the experiment from a clean state, I rebooted the operating system.

    $ free -mw
                  total        used        free      shared     buffers       cache   available
    Mem:          15885         478       14927           9           2         477       15230
    Swap:         10239           0       10239
    

    The free command displays the amounts of free and used physical and swap memory, including the buffers and caches used by the kernel. The information comes from /proc/meminfo:

    ·total: total installed memory (MemTotal and SwapTotal in /proc/meminfo)

    ·used: used memory (computed as total - free - buffers - cache)

    ·free: unused memory (MemFree and SwapFree in /proc/meminfo)

    ·shared: memory used by tmpfs (Shmem in /proc/meminfo)

    ·buffers: memory used by kernel buffers (Buffers in /proc/meminfo)

    ·cache: memory used by the page cache and slabs (Cached and SReclaimable in /proc/meminfo)

    ·buff/cache: the sum of buffers and cache

    ·available: an estimate of how much memory is available for starting new applications, not counting swap

     

    After the reboot, the system uses almost no shared memory: 477 MB is cached and 14927 MB is free.
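    The "used" column can be recomputed from the others. Plugging in the sample numbers above (free -m truncates to whole MB, so the result can be off by one from what free itself reports):

```shell
# used = total - free - buffers - cache, from the free -mw output above
total=15885; free_mb=14927; buffers=2; cache=477
used=$((total - free_mb - buffers - cache))
echo "recomputed used: ${used} MB"   # the sample shows 478 because of MB truncation
```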

     

    $ /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
    waiting for server to start.... done
    server started
    $ free -mw
                  total        used        free      shared     buffers       cache   available
    Mem:          15885         480       14853          72           2         549       15165
    Swap:         10239           0       10239
    

    After starting the PostgreSQL instance, almost nothing changes. Linux allocates SHM lazily: shared buffers are only allocated when they are actually used, so only a small amount has been touched so far.

    $ /usr/local/pgsql/bin/psql postgres <<<" select pg_size_pretty(current_setting('block_size')::int*pg_prewarm('pgbench_accounts','buffer')); "
     pg_size_pretty 
    ----------------
     820 MB
    (1 row)
    
    $ free -mw
                  total        used        free      shared     buffers       cache   available
    Mem:          15885         479       13207         892           2        2196       14344
    Swap:         10239           0       10239
    $ 
    

    I loaded the 820 MB pgbench_accounts table. It is first read into the filesystem cache, which increases "cache", and then copied into the shared buffer cache. Here "shared" grew from 72 to 892 MB, and because "cache" includes the shared part, it shows 2196 MB. Interestingly, "available" decreased by 821 MB: shared buffers cannot be reclaimed, whereas the file cache can be reclaimed, once synced, whenever physical memory is needed.
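    The drop in "available" can be checked directly against the prewarmed table size, using the two free -mw outputs above:

```shell
# "available" before and after pg_prewarm, in MB, from the outputs above
before=15165
after=14344
drop=$((before - after))
echo "available dropped by ${drop} MB"   # 821 MB, matching the ~820 MB table
```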

    PostgreSQL allocates a shared segment in userspace to manage concurrent writes and frequent reads of pages. Unlike most RDBMS, however, it does not go further with optimizations such as read prefetching or write reordering, which means we must leave enough room for the filesystem cache: frequently read data that does not fit in shared_buffers still needs to be cached by the filesystem. This is PostgreSQL's double buffering, and both layers are counted in the "cache" column of free.

     

    To better predict performance, you must understand the filesystem cache, and the linux-ftools utilities help here. Here is the cached-file information obtained with linux-fincore:

    $ linux-fincore -s $(find /usr/local/pgsql/data/base -type f) | grep ^total
    total cached size: 860,930,048
    

    That's 821 MB. Unfortunately, you have to pass the list of files to inspect in detail, so here I only look at the total.
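    fincore reports sizes in bytes and counts 4 KiB OS pages, while PostgreSQL works in 8 KiB blocks. Converting the total above between the three units:

```shell
cached_bytes=860930048                 # "total cached size" from linux-fincore above
mb=$((cached_bytes / 1024 / 1024))     # MB, as used in the free output
os_pages=$((cached_bytes / 4096))      # 4 KiB OS pages, as counted by fincore
pg_blocks=$((cached_bytes / 8192))     # 8 KiB PostgreSQL blocks (default block_size)
echo "${mb} MB = ${os_pages} OS pages = ${pg_blocks} PostgreSQL blocks"
```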

     

    List the files with more than 1 MB in cache:

    $ linux-fincore -s -S 1048576  $(find /usr/local/pgsql/data/base -type f)
    filename                                                                                       size        total_pages    min_cached page       cached_pages        cached_size        cached_perc
    --------                                                                                       ----        -----------    ---------------       ------------        -----------        -----------
    /usr/local/pgsql/data/base/13287/16400                                                  859,496,448            209,838                  0            209,838        859,496,448             100.00
    /usr/local/pgsql/data/base/13287/16408                                                  143,777,792             35,102                 -1                  0                  0               0.00
    ---
    total cached size: 859,496,448
    

      

    You can also use the pg_buffercache extension to quickly relate file size, shared buffer usage, and filesystem cache usage per relation:

    $ /usr/local/pgsql/bin/psql postgres <<<" select relname,relkind,current_setting('block_size')::int*relpages/1024/1024 as size_MB,current_setting('block_size')::int*buffers/1024/1024 as shared_mb,relfilenode,current_setting('data_directory')||'/'||pg_relation_filepath(c.oid) as file_path from pg_class c left outer join (select relfilenode, count(*) as buffers from pg_buffercache group by relfilenode) b using(relfilenode) where relpages>100;" | awk '/[/]base[/]/{"linux-fincore -s "$NF"* | grep ^total | cut -d: -f2" | getline f;printf "%6dM %s\n",gensub(/,/,"","g",f)/1024/1024,$0;next}{printf "%6s  %s\n","",$0}'
                    relname        | relkind | size_mb | shared_mb | relfilenode |               file_path                
            -----------------------+---------+---------+-----------+-------------+----------------------------------------
       819M  pgbench_accounts      | r       |     819 |       819 |       16400 | /usr/local/pgsql/data/base/13287/16400
         0M  pgbench_accounts_pkey | i       |     137 |           |       16408 | /usr/local/pgsql/data/base/13287/16408
            (2 rows)
            
    $ 
    

      

    Drop the caches (writing 3 to drop_caches frees both the page cache and reclaimable slab objects such as dentries and inodes):

    # echo 3 > /proc/sys/vm/drop_caches;
    # free -mw
                  total        used        free      shared     buffers       cache   available
    Mem:          15885         465       14322         894           0        1096       14362
    Swap:         10239           0       10239
    

    The shared buffers are still there.

     

    After stopping the PostgreSQL instance, the shared buffer memory is released.

    $ /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile stop
    waiting for server to shut down.... done
    server stopped
    $ free -mw
                  total        used        free      shared     buffers       cache   available
    Mem:          15885         463       15210           9           0         211       15251
    Swap:         10239           0       10239
    $ 
    

      

    If you still wonder what the 211 MB left in cache consists of:

    $ find / -type f -exec linux-fincore {} \; 2>/dev/null |  awk '/^[/]/{gsub(/,/,"");m=$(NF-1)/1024/1024;gsub(/  */," ");if(m>1)printf "%10d MB %s\n",m,$0}' | sort -h | tee all.log | sort -n | tail -10
             2 MB /usr/lib64/dri/vmwgfx_dri.so 11038656 2695 0 708 2899968 26.27
             2 MB /usr/lib64/gnome-shell/libgnome-shell.so 2844392 695 0 625 2560000 89.93
             2 MB /usr/lib64/libc-2.17.so 2156352 527 0 527 2158592 100.00
             2 MB /usr/lib64/libgtk-3.so.0.2200.26 7503704 1832 0 592 2424832 32.31
             6 MB /usr/lib64/gjs/libmozjs-52.so 24401928 5958 0 1646 6742016 27.63
             6 MB /usr/local/pgsql/bin/postgres 33821416 8258 0 1559 6385664 18.88
             8 MB /usr/lib/locale/locale-archive 106176928 25923 0 2048 8388608 7.90
            12 MB /usr/lib64/libLLVM-5.0-rhel.so 37261600 9098 0 3181 13029376 34.96
            13 MB /usr/bin/containerd 49102360 11988 0 3516 14401536 29.33
            25 MB /usr/bin/dockerd 105055544 25649 0 6503 26636288 25.35
    

    This takes a while to run, but it shows exactly which files are cached.

     

    At this point, the meaning of the cache statistics should be clear.

     

    All the following runs were done with shared_buffers=4GB. The orange and green areas of the chart are the growth, in MB, of the values taken from "free -wm" after each pgbench run.

    Below scale 70, where the pgbench database is 1054 MB (yellow), I see no additional SHM allocation (orange), and the filesystem "cache" (green) grows with the amount read from the files.
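    From the 1054 MB at scale 70 quoted above, the pgbench database grows at roughly 15 MB per scale unit. A rough sizing sketch (the 15 MB/scale constant is an approximation derived from that one data point, not an exact figure):

```shell
# Rough pgbench database size: ~15 MB per scale unit
# (approximated from the 1054 MB at scale 70 mentioned above)
mb_per_scale=15
for scale in 70 270 570; do
  echo "scale ${scale}: approx $((scale * mb_per_scale)) MB"
done
```

    At scale 270 this gives about 4 GB, which is where the chart shows shared_buffers filling up.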

    Then, as the scale increases up to 270, SHM is allocated to fit those pages in shared_buffers (orange). "cache" (green) includes both the SHM and the filesystem cache.

    Beyond 280, SHM grows no further because we have reached the 4 GB of shared_buffers. There is still free memory at the OS level, so "cache" keeps growing with the amount of data read.

    I also added the performance metric "tps", transactions per second. Running this query-only workload within the range covered by shared_buffers shows stable performance (about 13500 tps on a single thread).

     

    Performance degrades slowly once pages have to be copied from the filesystem cache. When the working set exceeds physical RAM, physical reads come into play and tps drops much more. Here there is 14 GB of RAM; at scale 570 we have, in addition to the 4 GB of shared_buffers, about 8 GB for frequently read data.

     

    This illustrates PostgreSQL's double-buffering behavior. To get maximum performance, you need to understand which data is read frequently, and how much additional physical memory must be available to avoid I/O latency. And this was a query-only workload. Now suppose 10% of the data gets updated: new tuples and new pages in shared_buffers, written to the file cache and synced after checkpoints, plus the WAL, usually with full page writes enabled, is there too. You will probably need even more memory.

      

     

  • Original article: https://www.cnblogs.com/abclife/p/14869975.html