  • Exploring term and match in Elasticsearch

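    I. Creating the test data

    1. Create an index

    The mapping below has no type name, so it assumes Elasticsearch 7 or later, and the ik_max_word analyzer requires the IK analysis plugin to be installed. The name field sets no analyzer, so it uses the default standard analyzer.

    curl -X PUT "http://127.0.0.1:9200/student?pretty" -H "Content-Type: application/json" -d '{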
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        },
        "mappings": {
            "_source": {
                "enabled": true
            },
            "properties": {
                "id": {
                    "type": "integer"
                },
                "name": {
                    "type": "text"
                },
                "age": {
                    "type": "integer"
                },
                "class": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                },
                "introduce": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                }
            }
        }
    }'
    

    2. Verify that the index was created

    curl -XGET "http://127.0.0.1:9200/student?pretty"
    

    3. Insert the test documents

    curl -X PUT "http://127.0.0.1:9200/student/_doc/1?pretty" -H "Content-Type: application/json" -d '{
        "id":1,
        "name":"关云长",
        "age":30,
        "class":"蜀国一班"
    }'

    curl -X PUT "http://127.0.0.1:9200/student/_doc/2?pretty" -H "Content-Type: application/json" -d '{
        "id":2,
        "name":"吕蒙",
        "age":25,
        "class":"吴国一班"
    }'

    curl -X PUT "http://127.0.0.1:9200/student/_doc/3?pretty" -H "Content-Type: application/json" -d '{
        "id":3,
        "name":"吕布",
        "age":40,
        "class":"三姓一班"
    }'

    curl -X PUT "http://127.0.0.1:9200/student/_doc/4?pretty" -H "Content-Type: application/json" -d '{
        "id":4,
        "name":"张翼德",
        "age":30,
        "class":"蜀国二班"
    }'
    

    4. Query all documents to verify the data

    curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '
    {
        "query": {
            "match_all": {}
        }
    }'
    

    II. Verification

    # Of the two queries below, the term query returns nothing and the match query returns hits. Why?
    curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
        "query": {
               "term": {"name":"吕蒙"}
        }
    }'
    
    
    curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
        "query": {
               "match": {"name":"吕蒙"}
        }
    }'
    

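    Think of a search as matching a query string A against a stored field value B. Either side may be tokenized. On the query side, term never tokenizes A, while match analyzes A with the analyzer of the target field. On the storage side, a keyword field stores B as a single token, while a text field tokenizes B.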

      As shown above, the term query finds nothing, while the match query finds two documents: “吕蒙” and “吕布”.

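      term does not analyze the search text: it looks for the exact token “吕蒙” in the target field. The name field is of type text, and text fields are analyzed at index time. No analyzer is set on name, so the default standard analyzer applies, which splits Chinese text into single characters; “吕蒙” is therefore indexed as the two tokens “吕” and “蒙”. Looking up the whole string “吕蒙” among the tokens “吕” and “蒙” finds nothing.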
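    The lookup described above can be mimicked with a toy inverted index in Python. This is only a sketch of the idea, not how Elasticsearch is implemented: `analyze` is a hypothetical stand-in that assumes the standard analyzer emits one token per Chinese character, and `build_index` and `term_query` are illustrative names. On a real cluster, the `_analyze` API shows the tokens actually produced.

```python
from collections import defaultdict


def analyze(text):
    # Assumption: one token per character, as a stand-in for how the
    # standard analyzer handles Chinese text.
    return [ch for ch in text if not ch.isspace()]


def build_index(docs):
    # Map every token to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for token in analyze(value):
            index[token].add(doc_id)
    return index


def term_query(index, value):
    # A term query looks the value up as-is, without analyzing it.
    return sorted(index.get(value, set()))


names = {1: "关云长", 2: "吕蒙", 3: "吕布", 4: "张翼德"}
text_index = build_index(names)

print(term_query(text_index, "吕蒙"))  # [] because no single token equals "吕蒙"
print(term_query(text_index, "吕"))    # [2, 3]
```

    In this model a term query only succeeds when the search text happens to equal one of the indexed tokens, which is why “吕” finds both students while “吕蒙” finds none.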

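      match analyzes the search text first: “吕蒙” is split into “吕” and “蒙”, and both tokens are looked up in name. With the default OR operator, a document matches if it contains at least one of the tokens, and documents that match more tokens score higher. Since the stored values “吕蒙” and “吕布” were indexed as “吕”, “蒙” and “吕”, “布”, the single token “吕” is enough to match both documents.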
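    The match behaviour can be sketched the same way. The code below is a simplified model under stated assumptions: `analyze` again assumes one token per Chinese character, `match_query` is a hypothetical helper, and the count of matched tokens is only a stand-in for the real relevance score. The `operator` argument mirrors the `operator` option of the match query, whose default is OR.

```python
def analyze(text):
    # Assumption: one token per character, as a stand-in for how the
    # standard analyzer handles Chinese text.
    return [ch for ch in text if not ch.isspace()]


def match_query(docs, query, operator="or"):
    # Analyze the search text, then compare its tokens with each document's tokens.
    query_tokens = analyze(query)
    hits = []
    for doc_id, value in docs.items():
        doc_tokens = set(analyze(value))
        matched = sum(1 for token in query_tokens if token in doc_tokens)
        if operator == "and":
            is_hit = matched == len(query_tokens)
        else:
            is_hit = matched > 0
        if is_hit:
            hits.append((doc_id, matched))
    # Documents matching more tokens come first (toy replacement for scoring).
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


names = {1: "关云长", 2: "吕蒙", 3: "吕布", 4: "张翼德"}

print(match_query(names, "吕蒙"))                  # [(2, 2), (3, 1)]
print(match_query(names, "吕蒙", operator="and"))  # [(2, 2)]
```

    With OR, “吕布” is returned because it shares the token “吕”; requiring all tokens with AND leaves only “吕蒙”.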

      Keep the two kinds of analysis apart: analysis of the search text and analysis of the stored field. term never analyzes the search text, while match does.

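      Since name is tokenized when it is stored, can we store it without tokenizing? Yes: change the field type from text to keyword. The type of an existing field cannot be changed in place, so the index is deleted and recreated below.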

    # Delete all documents (optional, since deleting the index below also removes them)
    curl -XPOST "http://127.0.0.1:9200/student/_delete_by_query?pretty" -v -H "Content-Type: application/json" -d '
    {
        "query": {
            "match_all": {}
        }
    }'
    
    # Delete the index
    curl -XDELETE "http://127.0.0.1:9200/student?pretty"
    
    # Recreate the index with the type of the name field changed to keyword
    curl -X PUT "http://127.0.0.1:9200/student?pretty" -H "Content-Type: application/json" -d '{
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        },
        "mappings": {
            "_source": {
                "enabled": true
            },
            "properties": {
                "id": {
                    "type": "integer"
                },
                "name": {
                    "type": "keyword"
                },
                "age": {
                    "type": "integer"
                },
                "class": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                },
                "introduce": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                }
            }
        }
    }'
    
    # Re-insert the four documents by running the insert commands from step 3 again

    # The following query returns the student “吕蒙”
    curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
        "query": {
               "term": {"name":"吕蒙"}
        }
    }'
    
    
    # The following query returns no hits: name is a keyword field and is not tokenized, so the stored
    # tokens are “吕蒙” and “吕布”, and neither of them equals “吕”
    curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
        "query": {
               "term": {"name":"吕"}
        }
    }'
    
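    # The following query returns only the student “吕蒙”. A match query analyzes the search text with the
    # analyzer of the target field; for a keyword field the text is not tokenized, so “吕蒙” is kept as a
    # single token. It equals the stored value “吕蒙” exactly and does not match “吕布”.
    curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{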
        "query": {
               "match": {"name":"吕蒙"}
        }
    }'
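    The three keyword queries above can be reproduced with a small Python model. It is a sketch under the assumption that a keyword field keeps the whole value as one token and that match reuses the field's analyzer on the search text; `keyword_analyze`, `term_query` and `match_query` are hypothetical names, not Elasticsearch APIs.

```python
def keyword_analyze(text):
    # Assumption: a keyword field keeps the whole value as a single token.
    return [text]


def term_query(index, value):
    # A term query looks the value up as-is, without analyzing it.
    return sorted(index.get(value, set()))


def match_query(index, analyzer, query):
    # A match query analyzes the search text with the field's analyzer,
    # then returns every document that contains at least one resulting token.
    hits = set()
    for token in analyzer(query):
        hits |= index.get(token, set())
    return sorted(hits)


names = {1: "关云长", 2: "吕蒙", 3: "吕布", 4: "张翼德"}

# Build the token-to-documents map for the keyword field.
keyword_index = {}
for doc_id, value in names.items():
    for token in keyword_analyze(value):
        keyword_index.setdefault(token, set()).add(doc_id)

print(term_query(keyword_index, "吕蒙"))                    # [2]
print(term_query(keyword_index, "吕"))                      # []
print(match_query(keyword_index, keyword_analyze, "吕蒙"))  # [2]
```

    Because the search text stays in one piece, match and term give the same answer on a keyword field in this model.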
    
    
  • Original article: https://www.cnblogs.com/werben/p/11888792.html