  • PostgreSQL: GiST indexes

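    GiST stands for Generalized Search Tree. It is a balanced, tree-structured access method that serves as a base template in which arbitrary indexing schemes can be implemented. B-trees, R-trees, and many other indexing schemes can be implemented in GiST.

    The official description above is rather abstract, and there is no need here to build other indexing schemes on top of GiST, so this post simply shows how to use a GiST index.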

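    Advantages and disadvantages compared with a B-tree index:

    Advantages:

    GiST indexes suit multidimensional and set-like data types and, like B-tree indexes, can also be used with other data types. A multicolumn GiST index can be used with query conditions on any subset of the indexed columns. A multicolumn B-tree index is only efficient when the conditions constrain the leading column; without it the planner usually falls back to a sequential scan.

    Disadvantages:

    A GiST index takes much longer to build and occupies more space.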

    Test table

    test=# create table tbl_index(a bigint,b timestamp without time zone,c varchar(12));
    CREATE TABLE
    test=# insert into tbl_index (a,b,c)  select generate_series(1,3000000),clock_timestamp()::timestamp(0) without time zone,'got u';
    INSERT 0 3000000
    test=# \timing
    Timing is on.

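    GiST itself is built into PostgreSQL, but indexing scalar types such as bigint and timestamp with GiST requires the operator classes provided by the btree_gist extension. I built and installed all extensions when compiling from source, so here the extension only needs to be created in the database.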

    test=# create extension btree_gist;
    CREATE EXTENSION
    Time: 774.131 ms
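
    If it is unclear whether btree_gist was built and installed on a given server, a quick check before running create extension might look like the sketch below. It assumes only the standard pg_available_extensions view; an empty result would suggest the extension files are not installed.

    ```sql
    -- List btree_gist if its files are installed on the server.
    -- installed_version is NULL until CREATE EXTENSION has been run in this database.
    select name, default_version, installed_version
    from pg_available_extensions
    where name = 'btree_gist';
    ```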

    Create the index

    test=# create index idx_gist_tbl_index_a_b on tbl_index using gist(a,b);
    CREATE INDEX
    Time: 168595.321 ms

    Example 1. Query on column a

    test=# explain analyze select * from tbl_index where a=3000000;
                                                            QUERY PLAN                                                         
    ---------------------------------------------------------------------------------------------------------------------------
     Gather  (cost=1000.00..21395.10 rows=1 width=22) (actual time=310.514..310.517 rows=1 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       ->  Parallel Seq Scan on tbl_index  (cost=0.00..20395.00 rows=0 width=22) (actual time=289.432..289.433 rows=0 loops=3)
             Filter: (a = 3000000)
             Rows Removed by Filter: 1000000
     Planning time: 0.119 ms
     Execution time: 310.631 ms
    (8 rows)
    
    Time: 311.505 ms
    test=# explain analyze select * from tbl_index where a='3000000';
                                                                QUERY PLAN                                                             
    -----------------------------------------------------------------------------------------------------------------------------------
     Index Scan using idx_gist_tbl_index_a_b on tbl_index  (cost=0.29..8.30 rows=1 width=22) (actual time=0.104..0.105 rows=1 loops=1)
       Index Cond: (a = '3000000'::bigint)
     Planning time: 0.109 ms
     Execution time: 0.297 ms
    (4 rows)
    
    Time: 1.124 ms

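    The two statements differ only in the type of the literal. In the first, the bare literal 3000000 is typed integer (int4), so the comparison is bigint = integer; the plan shows Filter: (a = 3000000) with no cast. The btree_gist operator class for bigint only covers operators whose operands are both bigint, so the planner cannot use the GiST index and scans the whole table. In the second, the quoted literal starts out with unknown type and is resolved to bigint; the plan shows '3000000'::bigint, so the index applies. Nothing is converted to char inside the index. With a GiST index, make sure the literal's type matches the column type, either by quoting it or by casting it explicitly to bigint.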
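
    As a minimal sketch of how to check this, the statements below run the query on a with the literal typed as bigint explicitly, then list the operators that GiST operator classes accept for a bigint left operand. It assumes the tbl_index table and btree_gist setup from this post; the catalog query uses only the standard pg_amop and pg_am catalogs.

    ```sql
    -- Two equivalent ways to give the literal the column's type (bigint),
    -- so that the operator matches the btree_gist operator class.
    explain select * from tbl_index where a = 3000000::bigint;
    explain select * from tbl_index where a = bigint '3000000';

    -- Operators registered for GiST with bigint on the left-hand side.
    -- Comparing right_type with left_type shows whether cross-type operators are covered.
    select amoplefttype::regtype  as left_type,
           amoprighttype::regtype as right_type,
           amopopr::regoperator   as operator
    from pg_amop
    where amopmethod = (select oid from pg_am where amname = 'gist')
      and amoplefttype = 'bigint'::regtype;
    ```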

    Example 2. Query on column b

    test=# explain analyze select * from tbl_index where b='2016-06-29 14:54:00';
                                                                      QUERY PLAN                                                         
             
    -------------------------------------------------------------------------------------------------------------------------------------
    ---------
     Bitmap Heap Scan on tbl_index  (cost=3373.54..10281.04 rows=171000 width=22) (actual time=37.200..53.564 rows=172824 loops=1)
       Recheck Cond: (b = '2016-06-29 14:54:00'::timestamp without time zone)
       Heap Blocks: exact=276
       ->  Bitmap Index Scan on idx_gist_tbl_index_a_b  (cost=0.00..3330.79 rows=171000 width=0) (actual time=37.139..37.139 rows=172824 loops=1)
             Index Cond: (b = '2016-06-29 14:54:00'::timestamp without time zone)
     Planning time: 0.343 ms
     Execution time: 60.843 ms
    (7 rows)
    
    Time: 62.359 ms

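    This query does not include the first indexed column, yet it still uses the index. A B-tree index on (a,b) cannot seek on b alone, so under the same condition the planner would typically resort to a full table scan.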

    Example 3. Query on a AND b

    test=# explain analyze select * from tbl_index where a='3000000' and b='2016-06-29 14:54:00';
                                                                QUERY PLAN                                                             
    -----------------------------------------------------------------------------------------------------------------------------------
     Index Scan using idx_gist_tbl_index_a_b on tbl_index  (cost=0.29..8.31 rows=1 width=22) (actual time=0.114..0.115 rows=1 loops=1)
       Index Cond: ((a = '3000000'::bigint) AND (b = '2016-06-29 14:54:00'::timestamp without time zone))
     Planning time: 0.376 ms
     Execution time: 0.258 ms
    (4 rows)
    
    Time: 1.747 ms

    Example 4. Query on a OR b

    test=# explain analyze select * from tbl_index where a='3000000' or b='2016-06-29 14:54:00';
                                                                         QUERY PLAN                                                      
                   
    -------------------------------------------------------------------------------------------------------------------------------------
    ---------------
     Bitmap Heap Scan on tbl_index  (cost=3420.58..10755.60 rows=171001 width=22) (actual time=31.142..49.728 rows=172824 loops=1)
       Recheck Cond: ((a = '3000000'::bigint) OR (b = '2016-06-29 14:54:00'::timestamp without time zone))
       Heap Blocks: exact=276
       ->  BitmapOr  (cost=3420.58..3420.58 rows=171001 width=0) (actual time=31.083..31.083 rows=0 loops=1)
             ->  Bitmap Index Scan on idx_gist_tbl_index_a_b  (cost=0.00..4.29 rows=1 width=0) (actual time=0.100..0.100 rows=1 loops=1)
                   Index Cond: (a = '3000000'::bigint)
             ->  Bitmap Index Scan on idx_gist_tbl_index_a_b  (cost=0.00..3330.79 rows=171000 width=0) (actual time=30.981..30.981 rows=172824 loops=1)
                   Index Cond: (b = '2016-06-29 14:54:00'::timestamp without time zone)
     Planning time: 0.143 ms
     Execution time: 57.193 ms
    (10 rows)
    
    Time: 58.067 ms

    Queries that combine the columns with AND or OR also use index scans, but they gain no performance over a B-tree index.

    Comparing the build time and size of the GiST and B-tree indexes

    B-tree index build time:

    test=# create index idx_btree_tbl_index_a_b on tbl_index using btree(a,b);
    CREATE INDEX
    Time: 5217.976 ms
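
    With both indexes in place, one way to see what the B-tree index alone yields for the earlier queries is to hide the GiST index inside a transaction and roll back afterwards. This is only a sketch for a test database: it assumes the index and table names used in this post, and the timestamp value is the one from the author's data. The dropped index keeps an exclusive lock on the table until the rollback.

    ```sql
    begin;
    -- DROP INDEX is transactional in PostgreSQL, so rollback brings the index back.
    drop index idx_gist_tbl_index_a_b;

    -- Condition on b only: the leading column of the B-tree index is not constrained.
    explain analyze select * from tbl_index where b = '2016-06-29 14:54:00';

    -- Conditions on both columns.
    explain analyze select * from tbl_index
    where a = 3000000 and b = '2016-06-29 14:54:00';

    rollback;  -- the GiST index is available again
    ```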

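    As shown above, the GiST index took 168595.321 ms to build, about 32 times as long as the B-tree index.

    Size comparison: the GiST index is more than 3 times the size of the B-tree index.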

    test=# select relname,pg_size_pretty(pg_relation_size(oid)) from pg_class where relname like 'idx_%_tbl_index_a_b';
             relname         | pg_size_pretty 
    -------------------------+----------------
     idx_gist_tbl_index_a_b  | 281 MB
     idx_btree_tbl_index_a_b | 89 MB
    (2 rows)
    
    Time: 4.068 ms
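
    The two ratios quoted above can be recomputed in psql. The sketch below takes the build times from the timings reported in this post and reads the sizes from the two indexes directly, so it assumes both indexes still exist under these names.

    ```sql
    select
        -- build times in ms, copied from the \timing output above
        round(168595.321 / 5217.976, 1) as build_time_ratio,
        -- on-disk sizes in bytes; the cast to numeric avoids integer division
        round(pg_relation_size('idx_gist_tbl_index_a_b')::numeric
              / pg_relation_size('idx_btree_tbl_index_a_b'), 1) as size_ratio;
    ```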
  • Original article: https://www.cnblogs.com/alianbog/p/5628543.html