  • ES Queries: match vs. match_phrase

    We start with a sample query. The student type stores basic information about a few students, and we query it with both match and match_phrase.

    First, search with match, using the keywords “He is”:

    GET /test/student/_search
    {
      "query": {
        "match": {
          "description": "He is"
        }
      }
    }

    Running this query returns the following result:

    {
       "took": 3,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 4,
          "max_score": 0.2169777,
          "hits": [
             {
                "_index": "test",
                "_type": "student",
                "_id": "2",
                "_score": 0.2169777,
                "_source": {
                   "name": "februus",
                   "sex": "male",
                   "age": 24,
                   "description": "He is passionate.",
                   "interests": "reading, programing"
                }
             },
             {
                "_index": "test",
                "_type": "student",
                "_id": "1",
                "_score": 0.16273327,
                "_source": {
                   "name": "leotse",
                   "sex": "male",
                   "age": 25,
                   "description": "He is a big data engineer.",
                   "interests": "reading, swiming, hiking"
                }
             },
             {
                "_index": "test",
                "_type": "student",
                "_id": "4",
                "_score": 0.01989093,
                "_source": {
                   "name": "pascal",
                   "sex": "male",
                   "age": 25,
                   "description": "He works very hard because he wanna go to Canada.",
                   "interests": "programing, reading"
                }
             },
             {
                "_index": "test",
                "_type": "student",
                "_id": "3",
                "_score": 0.016878016,
                "_source": {
                   "name": "yolovon",
                   "sex": "female",
                   "age": 24,
                   "description": "She is so charming and beautiful.",
                   "interests": "reading, shopping"
                }
             }
          ]
       }
    }

    Now run the same search with match_phrase:

    GET /test/student/_search
    {
      "query": {
        "match_phrase": {
          "description": "He is"
        }
      }
    }

    The result is as follows:

    {
       "took": 3,
       "timed_out": false,
       "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
       },
       "hits": {
          "total": 2,
          "max_score": 0.30685282,
          "hits": [
             {
                "_index": "test",
                "_type": "student",
                "_id": "2",
                "_score": 0.30685282,
                "_source": {
                   "name": "februus",
                   "sex": "male",
                   "age": 24,
                   "description": "He is passionate.",
                   "interests": "reading, programing"
                }
             },
             {
                "_index": "test",
                "_type": "student",
                "_id": "1",
                "_score": 0.23013961,
                "_source": {
                   "name": "leotse",
                   "sex": "male",
                   "age": 25,
                   "description": "He is a big data engineer.",
                   "interests": "reading, swiming, hiking"
                }
             }
          ]
       }
    }

    The output takes up a fair amount of space, but it is worth it if it makes the difference between the two queries clear.

    Let us look at how the two results differ:

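    1. Most obviously, the two queries return a different number of hits for the same data set: 4 for match and 2 for match_phrase.
    2. In the match results, the description field of a returned Document may contain “He is”, only “He”, or only “is”.
    3. In the match_phrase results, the description field must contain the phrase “He is”.
    4. Every hit has a _score field, which appears to be the score of that Document under the current search condition, and the hits are sorted by this score from highest to lowest.

    To understand the difference between match and match_phrase, go back to what each one is for. match is a full-text search: the search condition is applied to the full text of the field. The query string is analyzed into its individual terms, here “He” and “is”, and by default any Document that contains at least one of them appears in the final result set. ES then sorts the result set by relevance score, which is the _score field we saw above. Overall, Documents whose description contains “He is” score higher than Documents that contain only “He” or only “is”. (How each Document is scored will be covered in a later post.)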
    The concept of relevance is very important in Elasticsearch, and it has no real counterpart in a traditional relational database, where a record either matches a query or does not.
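
    The behaviour of match described above can be sketched in a few lines of Python. This is only an illustration, not how Elasticsearch is implemented: the analyze function is a hypothetical stand-in for the analyzer (lowercase, split on non-alphanumeric characters), and counting matched terms is a crude proxy for relevance, not the real scoring formula. The descriptions are copied from the four sample documents.

```python
import re

# Descriptions of the four sample documents, keyed by _id.
DOCS = {
    "1": "He is a big data engineer.",
    "2": "He is passionate.",
    "3": "She is so charming and beautiful.",
    "4": "He works very hard because he wanna go to Canada.",
}


def analyze(text):
    """Hypothetical analyzer: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(query, docs):
    """Map each matching _id to the number of distinct query terms it contains.

    A document matches as soon as it shares at least one term with the query,
    which mirrors the default OR behaviour of a match query.
    """
    terms = set(analyze(query))
    result = {}
    for doc_id, text in docs.items():
        shared = terms & set(analyze(text))
        if shared:
            result[doc_id] = len(shared)
    return result


matched = match("He is", DOCS)
print(sorted(matched))  # ['1', '2', '3', '4']
print(matched)          # {'1': 2, '2': 2, '3': 1, '4': 1}
```

    All four documents match, and the two that contain both terms (ids 1 and 2) are the same two that rank highest in the real response.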

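    So what should we do if we do not want the query condition to be split into separate terms? This is where match_phrase comes in.
    match_phrase is a phrase search: it treats the given phrase as one complete search condition. When you search with match_phrase, every Document in the result set must contain the phrase you specified, here “He is”, with its terms adjacent and in the same order. This looks somewhat like a LIKE query in a relational database, although match_phrase compares analyzed terms rather than raw substrings.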
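
    The following Python sketch illustrates the phrase semantics and where the LIKE analogy stops holding. It is a simplification under stated assumptions: analyze is a hypothetical stand-in for the analyzer, match_phrase here only checks that the query tokens occur consecutively and in order, and like_substring imitates a case-insensitive LIKE '%He is%'. None of these names come from Elasticsearch itself.

```python
import re

# Descriptions of the four sample documents, keyed by _id.
DOCS = {
    "1": "He is a big data engineer.",
    "2": "He is passionate.",
    "3": "She is so charming and beautiful.",
    "4": "He works very hard because he wanna go to Canada.",
}


def analyze(text):
    """Hypothetical analyzer: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_phrase(query, docs):
    """Return ids of documents whose tokens contain the query tokens consecutively, in order."""
    q = analyze(query)
    hits = []
    for doc_id, text in docs.items():
        tokens = analyze(text)
        # Slide a window of len(q) tokens over the document and compare.
        if any(tokens[i:i + len(q)] == q for i in range(len(tokens) - len(q) + 1)):
            hits.append(doc_id)
    return sorted(hits)


def like_substring(pattern, docs):
    """Case-insensitive substring test, imitating LIKE '%pattern%'."""
    return sorted(i for i, text in docs.items() if pattern.lower() in text.lower())


phrase_hits = match_phrase("He is", DOCS)
like_hits = like_substring("He is", DOCS)
print(phrase_hits)  # ['1', '2']
print(like_hits)    # ['1', '2', '3']
```

    The phrase check returns the same two documents as the real match_phrase response. The substring check also returns id 3, because “She is” contains the characters “he is”, which is why the comparison with LIKE is only approximate.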

     

  • Original article: https://www.cnblogs.com/loveyouyou616/p/10364082.html