  • The difference between term and match in Elasticsearch

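    Using term

    Start with the definition: a term query is an exact match. The search value is not analyzed (tokenized) before the lookup; it is compared as-is against the tokens stored in the index.

    An example makes this concrete. First, index some data: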

    {
        "title": "love China",
        "content": "people very love China",
        "tags": ["China", "love"]
    }
    {
        "title": "love HuBei",
        "content": "people very love HuBei",
        "tags": ["HuBei", "love"]
    }

    Now run a term query:

    {
      "query": {
        "term": {
          "title": "love"
        }
      }
    }

    Both documents above are returned:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.6931472,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "8",
            "_score": 0.6931472,
            "_source": {
              "title": "love HuBei",
              "content": "people very love HuBei",
              "tags": ["HuBei","love"]
            }
          },
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 0.6931472,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": ["China","love"]
            }
          }
        ]
      }
    }

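    Every document whose title contains the token love is returned. But suppose I only want an exact match on love China. Let's see whether the following query finds it: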

    {
      "query": {
        "term": {
          "title": "love China"
        }
      }
    }

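    This returns no data. A term query matches exactly one token, and the title field holds no single token love China. How can term-style matching cover several words? Use terms: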

    {
      "query": {
        "terms": {
          "title": ["love", "China"]
        }
      }
    }

    The result is:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.6931472,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "8",
            "_score": 0.6931472,
            "_source": {
              "title": "love HuBei",
              "content": "people very love HuBei",
              "tags": ["HuBei","love"]
            }
          },
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 0.6931472,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": ["China","love"]
            }
          }
        ]
      }
    }

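    Everything is returned. Why? The values inside the [ ] of a terms query are combined with OR, so a document matches if it contains any one of them (here, both documents match through love). To require both words at the same time, combine term clauses under bool with must, like this: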

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "title": "love"
              }
            },
            {
              "term": {
                "title": "china"
              }
            }
          ]
        }
      }
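    }

    Note that the query above uses lowercase china. With uppercase China, the search finds nothing. Why? The title field was analyzed when it was stored, here with the default analyzer, which splits the text into tokens and lowercases them. We can check how the text is analyzed.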
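The OR behavior of terms and the AND behavior of bool with must can be sketched as set operations over posting lists. This is a minimal simulation, not Elasticsearch code: TITLE_POSTINGS and the helper functions are hypothetical names, and the lowercase tokens are assumed from the analysis described above.

```python
# Hypothetical posting lists for the title field: token -> set of document ids.
# Tokens are lowercase because the title text was analyzed at index time.
TITLE_POSTINGS = {
    "love": {"7", "8"},
    "china": {"7"},
    "hubei": {"8"},
}


def term(value):
    """Look up the value exactly as given; no analysis is applied."""
    return set(TITLE_POSTINGS.get(value, set()))


def terms(values):
    """OR semantics: a document matches if it contains any of the values."""
    matched = set()
    for value in values:
        matched |= term(value)
    return matched


def bool_must(*clauses):
    """AND semantics: a document must appear in every clause's result."""
    result = None
    for clause in clauses:
        result = clause if result is None else result & clause
    return result or set()


print(sorted(term("love")))                            # ['7', '8']
print(sorted(term("love China")))                      # []
print(sorted(terms(["love", "China"])))                # ['7', '8']
print(sorted(bool_must(term("love"), term("china"))))  # ['7']
print(sorted(bool_must(term("love"), term("China"))))  # []
```

In this sketch, uppercase China never matches anything, so the terms result comes entirely from love, and the must query only succeeds with lowercase china.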

    The analyzer

    GET test/_analyze
    {
      "text" : "love China"
    }

    The result is:

    {
      "tokens": [
        {
          "token": "love",
          "start_offset": 0,
          "end_offset": 4,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "china",
          "start_offset": 5,
          "end_offset": 10,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    }

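    The text is analyzed into two tokens, love and china. A term query only matches a stored token exactly as it is, with no transformation of the search value. That is why querying with China fails. A later section covers analyzers in detail.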
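To make the tokenization step concrete, here is a rough Python approximation of what the analyzer did to this input. It is only a sketch: the real standard analyzer follows Unicode text segmentation rules, while the hypothetical analyze function below simply treats runs of ASCII letters and digits as tokens and lowercases them.

```python
import re

# Rough approximation: runs of ASCII letters/digits become tokens.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+")


def analyze(text):
    """Return tokens with offsets and positions, lowercased."""
    tokens = []
    for position, match in enumerate(TOKEN_PATTERN.finditer(text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


for tok in analyze("love China"):
    print(tok)

# The set of tokens a term query can match for this title.
indexed_tokens = {tok["token"] for tok in analyze("love China")}
print("China" in indexed_tokens)  # False
print("china" in indexed_tokens)  # True
```

For this input the sketch produces the same tokens, offsets, and positions as the _analyze response above, which is why a term lookup for China finds nothing while china succeeds.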

    Using match

    First, match against love China.

    GET test/doc/_search
    {
      "query": {
        "match": {
          "title": "love China"
        }
      }
    }

    The result is:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1.3862944,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 1.3862944,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": [
                "China",
                "love"
              ]
            }
          },
          {
            "_index": "test",
            "_type": "doc",
            "_id": "8",
            "_score": 0.6931472,
            "_source": {
              "title": "love HuBei",
              "content": "people very love HuBei",
              "tags": [
                "HuBei",
                "love"
              ]
            }
          }
        ]
      }
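    }

    Both documents are returned. Why? A match query analyzes the search text first and then matches the resulting tokens. The title tokens across the two documents are love, china, and hubei. Our search text love China is analyzed into love and china, and the tokens are combined with OR, so a document matches if it contains any one of them. The document that contains both tokens scores higher (1.3862944 versus 0.6931472). What if love and China must both match? Use match_phrase.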
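The match behavior described above can be sketched in a few lines of Python. This is a simulation under stated assumptions, not the Elasticsearch implementation: TITLES, analyze, and match are hypothetical names, the analyzer is reduced to lowercasing and whitespace splitting, and counting matched tokens is only a crude stand-in for real relevance scoring. The operator argument mirrors the operator parameter of the match query, which can be set to and so that every analyzed token is required.

```python
# The two sample titles, keyed by the document ids seen in the results.
TITLES = {"7": "love China", "8": "love HuBei"}


def analyze(text):
    """Simplified stand-in for the analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match(query, operator="or"):
    """Analyze the query text, then match documents on the resulting tokens."""
    query_tokens = set(analyze(query))
    if not query_tokens:
        return []
    hits = []
    for doc_id, title in TITLES.items():
        matched = query_tokens & set(analyze(title))
        if operator == "and":
            ok = matched == query_tokens  # every query token must be present
        else:
            ok = bool(matched)            # any one query token is enough
        if ok:
            hits.append((doc_id, len(matched)))
    # More matched tokens first, as a crude stand-in for relevance scoring.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return [doc_id for doc_id, _ in hits]


print(match("love China"))                  # ['7', '8']
print(match("love China", operator="and"))  # ['7']
```

Because the query text is analyzed, the uppercase C in China no longer matters here, unlike with term.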

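    Using match_phrase

    match_phrase is a phrase search: all tokens of the analyzed search text must appear in the document, adjacent to each other and in the same order.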

    GET test/doc/_search
    {
      "query": {
        "match_phrase": {
          "title": "love china"
        }
      }
    }

    The result is:

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 1.3862944,
        "hits": [
          {
            "_index": "test",
            "_type": "doc",
            "_id": "7",
            "_score": 1.3862944,
            "_source": {
              "title": "love China",
              "content": "people very love China",
              "tags": [
                "China",
                "love"
              ]
            }
          }
        ]
      }
    }

    This time the result fits the requirement: only one record comes back.

    Original link: https://www.jianshu.com/p/d5583dff4157
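The adjacency and ordering requirement can be illustrated with a small Python sketch. It is a simplified model, not how Elasticsearch evaluates phrases internally: TITLES, analyze, and match_phrase are hypothetical names, the analyzer is reduced to lowercasing and whitespace splitting, and document 9 is an invented extra title added only to show that word order matters.

```python
# Documents 7 and 8 are the sample data; document 9 is a hypothetical
# extra title used only to demonstrate that order matters.
TITLES = {"7": "love China", "8": "love HuBei", "9": "China love"}


def analyze(text):
    """Simplified stand-in for the analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match_phrase(query):
    """Match documents whose tokens contain the query tokens consecutively."""
    phrase = analyze(query)
    if not phrase:
        return []
    hits = []
    for doc_id, title in TITLES.items():
        tokens = analyze(title)
        # Slide a window over the document tokens and compare it to the phrase.
        for start in range(len(tokens) - len(phrase) + 1):
            if tokens[start:start + len(phrase)] == phrase:
                hits.append(doc_id)
                break
    return hits


print(match_phrase("love china"))  # ['7']
print(match_phrase("China love"))  # ['9']
```

Document 9 contains both tokens but in the opposite order, so it does not match love china, while a plain match query in OR mode would have returned it.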

  • Source: https://www.cnblogs.com/chong-zuo3322/p/14031602.html