zoukankan      html  css  js  c++  java
  • PostgreSQL模糊查询

    1、创建测试表、插入数据

    postgres=# create table test_like(id int, name varchar);
    CREATE TABLE
    postgres=# insert into test_like select generate_series(1,1000000),md5(random()::varchar);
    INSERT 0 1000000
    postgres=# analyze test_like ;
    ANALYZE
    

    2、未创建索引的执行计划

    postgres=# explain select * from test_like where name like '111%' ;
                                 QUERY PLAN
    --------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Seq Scan on test_like  (cost=0.00..7950.56 rows=33 width=37)
             Filter: ((name)::text ~~ '111%'::text)
    (4 rows)
    

    3、普通btree索引不走索引

    postgres=# create index idx_test_like_name on test_like (name);
    CREATE INDEX
    postgres=# analyze test_like ;
    ANALYZE
    postgres=# explain select * from test_like where name like '111%' ;
                                 QUERY PLAN
    --------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Seq Scan on test_like  (cost=0.00..7950.56 rows=33 width=37)
             Filter: ((name)::text ~~ '111%'::text)
    (4 rows)
    

    4、前模糊匹配查询

    1)collate "C"

    postgres=# create index idx_test_like_name_c on test_like (name collate "C");
    CREATE INDEX
    postgres=# analyze test_like ;
    ANALYZE
    postgres=# explain select * from test_like where name like '111%' ;
                                             QUERY PLAN
    --------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=2.99..108.88 rows=33 width=37)
             Filter: ((name)::text ~~ '111%'::text)
             ->  Bitmap Index Scan on idx_test_like_name_c  (cost=0.00..2.98 rows=56 width=0)
                   Index Cond: (((name)::text >= '111'::text) AND ((name)::text < '112'::text))
    (6 rows)
    

    2)操作符类varchar_pattern_ops方式

    postgres=# create index idx_test_like_name_ops on test_like (name varchar_pattern_ops);
    CREATE INDEX
    postgres=# analyze test_like ;
    ANALYZE
    postgres=# explain select * from test_like where name like '111%' ;
                                               QUERY PLAN
    ------------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=2.97..105.20 rows=33 width=37)
             Filter: ((name)::text ~~ '111%'::text)
             ->  Bitmap Index Scan on idx_test_like_name_ops  (cost=0.00..2.96 rows=54 width=0)
                   Index Cond: (((name)::text ~>=~ '111'::text) AND ((name)::text ~<~ '112'::text))
    (6 rows)
    

    但后匹配、中间匹配不支持:

    postgres=# explain select * from test_like where name like '%111%1' ;
                                 QUERY PLAN
    --------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Seq Scan on test_like  (cost=0.00..7950.56 rows=33 width=37)
             Filter: ((name)::text ~~ '%111%1'::text)
    (4 rows)
    
    postgres=# explain select * from test_like where name like '%111' ;
                                 QUERY PLAN
    --------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Seq Scan on test_like  (cost=0.00..7950.56 rows=33 width=37)
             Filter: ((name)::text ~~ '%111'::text)
    (4 rows)
    
    postgres=# explain select * from test_like where name like '111%' ;
                                               QUERY PLAN
    ------------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=2.98..107.04 rows=33 width=37)
             Filter: ((name)::text ~~ '111%'::text)
             ->  Bitmap Index Scan on idx_test_like_name_ops  (cost=0.00..2.97 rows=55 width=0)
                   Index Cond: (((name)::text ~>=~ '111'::text) AND ((name)::text ~<~ '112'::text))
    (6 rows)
    
    postgres=# explain select * from test_like where name like '%111%' ;
                                 QUERY PLAN
    --------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Seq Scan on test_like  (cost=0.00..7950.56 rows=33 width=37)
             Filter: ((name)::text ~~ '%111%'::text)
    (4 rows)
    

    5、方式二:使用pg_trim插件

    The pg_trgm module provides functions and operators for determining the similarity of alphanumeric text based on trigram matching, as well as index operator classes that support fast searching for similar strings.

    A trigram is a group of three consecutive characters taken from a string. We can measure the similarity of two strings by counting the number of trigrams they share. This simple idea turns out to be very effective for measuring the similarity of words in many natural languages.

    Note: pg_trgm ignores non-word characters (non-alphanumerics) when extracting trigrams from a string. Each word is considered to have two spaces prefixed and one space suffixed when determining the set of trigrams contained in the string. For example, the set of trigrams in the string "cat" is " c", " ca", "cat", and "at ". The set of trigrams in the string "foo|bar" is " f", " fo", "foo", "oo ", " b", " ba", "bar", and "ar ".

    postgres=# create extension pg_trgm;
    CREATE EXTENSION
    postgres=# create index idx_test_like_name_trgm on test_like using gin (name gin_trgm_ops);
    CREATE INDEX
    postgres=# analyze test_like ;
    ANALYZE
    postgres=# explain select * from test_like where name like '111%' ;
                                              QUERY PLAN
    ----------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=18.26..81.59 rows=33 width=37)
             Recheck Cond: ((name)::text ~~ '111%'::text)
             ->  Bitmap Index Scan on idx_test_like_name_trgm  (cost=0.00..18.25 rows=33 width=0)
                   Index Cond: ((name)::text ~~ '111%'::text)
    (6 rows)
    
    postgres=# explain select * from test_like where name like '%111%' ;
                                               QUERY PLAN
    ------------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=62.16..3816.37 rows=6731 width=37)
             Recheck Cond: ((name)::text ~~ '%111%'::text)
             ->  Bitmap Index Scan on idx_test_like_name_trgm  (cost=0.00..60.48 rows=6731 width=0)
                   Index Cond: ((name)::text ~~ '%111%'::text)
    (6 rows)
    
    postgres=# explain select * from test_like where name like '%111' ;
                                              QUERY PLAN
    ----------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=12.26..75.59 rows=33 width=37)
             Recheck Cond: ((name)::text ~~ '%111'::text)
             ->  Bitmap Index Scan on idx_test_like_name_trgm  (cost=0.00..12.25 rows=33 width=0)
                   Index Cond: ((name)::text ~~ '%111'::text)
    (6 rows)
    
    postgres=# explain select * from test_like where name like '%111%1' ;
                                             QUERY PLAN
    ---------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=6.26..69.59 rows=33 width=37)
             Recheck Cond: ((name)::text ~~ '%111%1'::text)
             ->  Bitmap Index Scan on idx_test_like_name_trgm  (cost=0.00..6.25 rows=33 width=0)
                   Index Cond: ((name)::text ~~ '%111%1'::text)
    (6 rows)
    
    

    缺陷 :就像上面描述的那样,pg_trgm把字符串切成多个三元组,然后做匹配,当匹配关键词少于三个的时候,只有前缀匹配会走索引:

    postgres=# explain select * from test_like where name like '%11%' ;
                                  QUERY PLAN
    -----------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Seq Scan on test_like  (cost=0.00..7950.56 rows=40384 width=37)
             Filter: ((name)::text ~~ '%11%'::text)
    (4 rows)
    
    postgres=# explain select * from test_like where name like '%111%' ;
                                               QUERY PLAN
    ------------------------------------------------------------------------------------------------
     Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0)
       Node/s: dn001, dn002, dn003
       ->  Bitmap Heap Scan on test_like  (cost=34.08..2908.28 rows=3365 width=37)
             Recheck Cond: ((name)::text ~~ '%111%'::text)
             ->  Bitmap Index Scan on idx_test_like_name_trgm  (cost=0.00..33.24 rows=3365 width=0)
                   Index Cond: ((name)::text ~~ '%111%'::text)
    (6 rows)
    
    严以律己、宽以待人
  • 相关阅读:
    鼠标移入和鼠标移出的提示,和样式的转换
    HTML5——新特性,拖放
    关于订阅发布模式
    titanium环境配置
    Matlab与C混编的介绍
    一个相对健壮的node 静态http服务器
    阻赛非阻塞同步异步
    最近在做的事以及一些安排
    说一说js中__proto__和prototype以及原型继承的那些事
    PHP写的爬虫,爬指定网站页面上的各种图片
  • 原文地址:https://www.cnblogs.com/kuang17/p/15647129.html
Copyright © 2011-2022 走看看