
DSL

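1. Definition

Query DSL statements are defined in JSON and are built from two kinds of clauses:

Leaf query clauses: for example match, term, range

Compound query clauses: combine multiple queries, for example bool, dis_max

2. Classification

The behavior of a query clause depends on whether it is used in query context or in filter context.

2.1 Query context

Every returned document carries a relevance score (_score); the higher the score, the better the document matches the query. A clause in query context answers the question: how well does this document match this query clause?

2.2 Filter context

A filter only decides whether a document matches or not, and no score is calculated. A clause in filter context answers the question: does this document match this query clause? Frequently used filters are cached automatically by Elasticsearch to improve performance.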

GET /_search
{
  "query": { 
    "bool": { 
      "must": [
        { "match": { "title":   "Search"        }},     # the title field contains the word search
        { "match": { "content": "Elasticsearch" }}  
      ],
      "filter": [ 
        { "term":  { "status": "published" }},     # the status field contains the exact term published
        { "range": { "publish_date": { "gte": "2015-01-01" }}}   # publish_date is on or after 2015-01-01
      ]
    }
  }
}

3. Query clauses

3.1 Match all documents:

GET /_search
{
    "query": {
        "match_all": {}
    }
}

3.2 Full-text queries

3.2.1 match

A full-text query on a single field; the query text is analyzed into terms before matching.

GET /_search
{
    "query": {
        "match" : {
            "message" : "this is a test"
        }
    }
}

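The match query is of type boolean: the analyzed terms are combined into a boolean expression, and the operator parameter (or / and, default or) controls how. In the example above, a document matches as soon as message contains any one of the terms this, is, a, or test, because the terms are ORed together.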

GET /_search
{
    "query": {
        "match" : {
            "message" : {
                "query" : "this is a test",
                "operator" : "and"
            }
        }
    }
}

In this example operator is set to and, so message must contain all of the terms this, is, a, and test. They do not have to be adjacent or in order; any text that contains all four terms matches.
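
To make the or / and behavior concrete, here is a minimal Python sketch. It is not how Elasticsearch is implemented: the analyze function is a stand-in that only lowercases and splits on whitespace, and the function names are made up for illustration.

```python
def analyze(text):
    """Toy analyzer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def match(field_text, query_text, operator="or"):
    """Return True if field_text matches query_text under the given operator."""
    doc_terms = set(analyze(field_text))
    hits = [term in doc_terms for term in analyze(query_text)]
    if operator == "and":
        return all(hits)  # every query term must be present
    return any(hits)      # "or": one matching term is enough


doc = "a quick test of this feature"
print(match(doc, "this is a test"))                  # or: some terms are present
print(match(doc, "this is a test", operator="and"))  # and: the term "is" is missing
```

The second call fails only because of the single missing term, which is the practical difference between the two operators.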

3.2.2 match_phrase

Builds a phrase query from the analyzed query text.

GET /_search
{
    "query": {
        "match_phrase" : {
            "message" : "this is a test"
        }
    }
}

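This searches for the phrase this is a test.

GET /_search
{
    "query": {
        "match_phrase" : {
            "message" : {
                "query" : "this is a test",
                "slop": 1,     # an integer: the number of intervening unmatched positions allowed when matching the phrase, default 0
                "analyzer" : "my_analyzer"  # the analyzer used to analyze the query text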
            }
        }
    }
}
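
The effect of slop can be illustrated with a simplified Python sketch. The assumptions are loud ones: tokens come from a lowercase whitespace split, and slop is modeled as the spread of term positions after each position is shifted back by the term's offset in the phrase. Real sloppy phrase matching has more cases (repeated terms, for instance), so treat this as an approximation.

```python
from itertools import product


def match_phrase(field_text, phrase, slop=0):
    """Simplified phrase match: True if the terms occur within `slop` of phrase order."""
    tokens = field_text.lower().split()
    terms = phrase.lower().split()
    # For each phrase term, collect every position where it occurs in the field.
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if any(not found for found in positions):
        return False  # a phrase term is missing entirely
    for combo in product(*positions):
        # Shift each position back by its offset in the phrase;
        # an exact phrase yields identical shifted values (spread 0).
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


print(match_phrase("this is a test", "this is a test"))
print(match_phrase("this is a simple test", "this is a test"))
print(match_phrase("this is a simple test", "this is a test", slop=1))
```

With slop 0 the extra word simple breaks the phrase; allowing one intervening position makes it match again.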

3.2.3 match_phrase_prefix

Similar to match_phrase, except that the last term of the query text is treated as a prefix.

GET /_search
{
    "query": {
        "match_phrase_prefix" : {
            "message" : "quick brown f"
        }
    }
}

It also accepts additional parameters:

GET /_search
{
    "query": {
        "match_phrase_prefix" : {
            "message" : {
                "query" : "quick brown f",  # the last term is matched as a prefix only
                "max_expansions" : 10, # maximum number of terms the last prefix is expanded to
                "slop": 1,
                "analyzer": "xxx"
            }
        }
    }
}

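Here quick brown is a phrase whose terms must appear together, and f is treated as the prefix of a term. For example, quick brown from and quick brown flame both match. max_expansions limits the prefix to 10 expansions: Elasticsearch looks through the sorted term dictionary for the first 10 terms that start with f and adds them to the phrase query, so a term that falls outside those first 10 is not matched.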
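
The prefix expansion can be sketched in a few lines of Python. This is a toy model under stated assumptions: term_dictionary stands in for the index's term dictionary, tokens come from a lowercase whitespace split, and the helper names are hypothetical.

```python
def expand_prefix(term_dictionary, prefix, max_expansions=50):
    """Return the first max_expansions terms, in sorted order, that start with prefix."""
    candidates = sorted(t for t in term_dictionary if t.startswith(prefix))
    return candidates[:max_expansions]


def match_phrase_prefix(field_text, query_text, term_dictionary, max_expansions=50):
    """True if the field contains the phrase with the last term expanded from a prefix."""
    tokens = field_text.lower().split()
    *head, prefix = query_text.lower().split()
    for last in expand_prefix(term_dictionary, prefix, max_expansions):
        phrase = head + [last]
        n = len(phrase)
        # Exact phrase check: the terms must be adjacent and in order.
        if any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)):
            return True
    return False


terms = ["brown", "flame", "fox", "frog", "from", "quick"]
print(expand_prefix(terms, "f", max_expansions=2))
print(match_phrase_prefix("quick brown from here", "quick brown f", terms, max_expansions=2))
print(match_phrase_prefix("quick brown from here", "quick brown f", terms, max_expansions=10))
```

With max_expansions set to 2 only flame and fox are tried, so quick brown from is missed; raising the limit lets it match. This is why a small max_expansions can produce surprising misses on short prefixes.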

3.2.4 multi_match

GET /_search
{
  "query": {
    "multi_match" : {
      "query":    "this is a test",  # analyzed into the terms this, is, a, test
      "fields": [ "subject", "message" ]   # the query terms are searched in both the subject and the message field
    }
  }
}

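The fields parameter supports wildcards, for example "fields" : ["title","*_name"]

Individual fields can be boosted with the caret notation, for example "fields" : [ "subject^3", "message" ] makes the subject field three times as important as the message field

If fields is not provided, the query runs against the default fields, which by default means all eligible fields

An operator parameter may be added; its default is or. If it is set to and, a field such as subject must contain all of the query terms to match.

With the default operator, the minimum_should_match parameter sets how many terms each per-field query has to match; the default is 1,

"operator": "or",
"minimum_should_match":1   # a field must contain at least one of the query terms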

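A type parameter may be added; the default is best_fields

3.2.4.1 best_fields type

This is the default type. The query is matched against each of the specified fields, a _score is computed for each field, and the highest score is used

GET /_search
{
  "query": {
    "multi_match" : {
      "query":      "brown fox",
      "type":       "best_fields",
      "fields":     [ "subject", "message" ], # the query is run against each field; the score of the best-matching field is used
      "tie_breaker": 0.3   # with tie_breaker set, the scores of the other matching fields are multiplied by 0.3 and added to the best score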
    }
  }
}

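The best_fields type is equivalent to a dis_max query. A dis_max query takes an array of sub-queries, scores each sub-query independently, and returns the highest sub-query score.

GET /_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "subject": "brown fox" }},   # each sub-query is scored independently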
        { "match": { "message": "brown fox" }}
      ],
      "tie_breaker": 0.3
    }
  }
}
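
How tie_breaker enters the final score is easy to get wrong, so here is the arithmetic as a small Python sketch. The function name and the sample scores are made up; the combination rule is the best score plus tie_breaker times the scores of the other matching sub-queries.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Combine sub-query scores: best score plus tie_breaker times the other matches."""
    matching = [s for s in sub_scores if s > 0]  # a score of 0 means "did not match"
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


# Hypothetical scores for the subject and message sub-queries.
print(dis_max_score([2.0, 1.0]))                   # only the best field counts
print(dis_max_score([2.0, 1.0], tie_breaker=0.3))  # the other field adds 0.3 * 1.0
```

With tie_breaker at 0 only the best field matters; at 1.0 the result is the plain sum of all matching sub-queries.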

operator example

GET /_search
{
  "query": {
    "multi_match" : {
      "query":      "Will Smith",
      "type":       "best_fields",
      "fields":     [ "first_name", "last_name" ],
      "operator":   "and" # executed as (+first_name:will +first_name:smith) | (+last_name:will +last_name:smith)
    }
  }
}

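All of the query terms must be present in a single field for a document to match

3.2.4.2 most_fields type

The query is matched against each of the specified fields and a _score is computed for each field; the per-field scores are then added together and averaged.

GET /_search
{
  "query": {
    "multi_match" : {
      "query":      "quick brown fox",
      "type":       "most_fields",
      "fields":     [ "title", "title.original", "title.shingles" ]  # original and shingles are sub-fields of title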
    }
  }
}

This is executed as:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title":          "quick brown fox" }},
        { "match": { "title.original": "quick brown fox" }},
        { "match": { "title.shingles": "quick brown fox" }}
      ]
    }
  }
}

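The scores from the three match clauses are added together and averaged

3.2.4.3 phrase and phrase_prefix

These two types behave like best_fields: the query is run on each field and the highest score is returned, but they use match_phrase or match_phrase_prefix instead of match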

GET /_search
{
  "query": {
    "multi_match" : {
      "query":      "quick brown f",
      "type":       "phrase_prefix",
      "fields":     [ "subject", "message" ]
    }
  }
}

This is executed roughly as:

GET /_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match_phrase_prefix": { "subject": "quick brown f" }},
        { "match_phrase_prefix": { "message": "quick brown f" }}
      ]
    }
  }
}

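3.2.4.4 cross_fields type

This type analyzes the query text into individual terms and then looks for each term in any of the fields. By default, a document is returned as soon as one field matches.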

{
  "multi_match" : {
    "query":      "Will Smith",
    "type":       "cross_fields",
    "fields":     [ "first_name", "last_name" ],
    "operator":   "and"
  }
}

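The query text is analyzed into the two terms will and smith. With operator set to and, will must be present in first_name or last_name, and smith must be present in first_name or last_name.

With operator set to or, will must be present in first_name or last_name, or smith must be present in first_name or last_name. Put simply, a document is returned as soon as first_name or last_name contains will or smith.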
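
The difference between cross_fields and best_fields under operator and is easiest to see side by side. The following Python sketch is a toy model only: documents are plain dicts, analysis is a lowercase whitespace split, and both function names are invented for the illustration.

```python
def _tokens(doc, field):
    """Toy analysis (an assumption): lowercase and split the field on whitespace."""
    return doc.get(field, "").lower().split()


def cross_fields(doc, query_text, fields, operator="or"):
    """Term-centric: each term may be found in any of the fields."""
    terms = query_text.lower().split()
    hits = [any(term in _tokens(doc, f) for f in fields) for term in terms]
    return all(hits) if operator == "and" else any(hits)


def best_fields_and(doc, query_text, fields):
    """Field-centric with operator and: one single field must contain every term."""
    terms = query_text.lower().split()
    return any(all(term in _tokens(doc, f) for term in terms) for f in fields)


person = {"first_name": "Will", "last_name": "Smith"}
fields = ["first_name", "last_name"]
print(cross_fields(person, "Will Smith", fields, operator="and"))  # terms spread over two fields
print(best_fields_and(person, "Will Smith", fields))               # no single field has both terms
```

For a name split across first_name and last_name, the term-centric check succeeds while the field-centric one fails, which is the usual reason to reach for cross_fields.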

3.2.5 common terms query

3.2.6 query string query

3.2.7 simple query string query

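3.3 term query

3.4 Compound queries

3.4.1 bool query

The bool query is the most commonly used compound query. It combines multiple query clauses and also combines their results and scores. It is most useful when the search condition is a combination of several expressions: the bool query assembles its sub-queries into one boolean expression. The clause types are combined with AND logic: Elasticsearch considers a document a match only if it satisfies every must and filter clause, matches no must_not clause, and matches as many should clauses as minimum_should_match requires. The bool query supports four clause types: must, should, must_not, and filter:

must clause: the document must match the query;

should clause: the document should match one or more of the should queries;

must_not clause: the document must not match the query

filter clause: a filter that the document must match; the only difference from must is that filter does not affect the score

A should clause is usually an array holding several sub-queries. How many of them a document has to match is controlled by the minimum_should_match parameter. If the bool query has should clauses but no must or filter clause, the default is 1, so a document must match at least one should sub-query. If a must or filter clause is present, the default is 0: I ran into a bool query whose should clause contained two queries and whose minimum_should_match, left unset, defaulted to 0. To change the default matching behavior, set minimum_should_match explicitly in the query DSL; setting it explicitly in bool queries is recommended.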

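Note: all four clauses of a bool query accept arrays, and a bool query can itself be used as a sub-query, so nested logical operations are supported.

"should" : [
        {  "term" : { "tag" : "azure" } },
        {  "term" : { "tag" : "elasticsearch" } },
        {  "term" : { "tag" : "cloud" } }
    ],
"minimum_should_match" : 2  # a document must match at least two of the term queries in the should clause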

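The clause types of a bool query are related by AND: a document is returned only if it satisfies all of them at once, that is, every must and filter clause, none of the must_not clauses, and the should requirement set by minimum_should_match.

To filter results in a bool query, prefer the filter and must_not clauses. Both run in filter context. Elasticsearch automatically caches frequently used filters, so repeated searches over cached data are faster. Because filter context does not affect scoring, and computing scores makes a search more complex and consumes more CPU, filter and must_not clauses lighten the search workload.

Query and filter context

In a bool query, clauses run in either query context or filter context. Query context is specified by the query parameter, and filter context by the filter and must_not parameters. The only difference between the two contexts is that filter context does not affect the score. Within a bool query, filter and must_not use filter context, while must and should use query context. Frequently used filters are cached automatically by the engine, which improves query performance.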

GET _search
{
  "query": { 
    "bool": { 
      "must": [  # the must clause runs in query context
        { "match": { "title":   "Search"        }}, # in query context, must returns the documents that satisfy both match queries
        { "match": { "content": "Elasticsearch" }}  
      ],
      "filter": [ # the filter clause runs in filter context
        { "term":  { "status": "published" }},   # in filter context, filter removes documents that fail the term and range queries, without affecting the _score of matching documents
        { "range": { "publish_date": { "gte": "2015-01-01" }}} 
      ]
    }
  }
}

Logical relationship between bool clauses

{
    "bool" : {  # the clause types are combined with AND; the bool query matches only if all of them are satisfied
        "must" : {
            "term" : { "user" : "kimchy" }
        },
        "filter": {
            "term" : { "tag" : "tech" }
        },
        "must_not" : {
            "range" : {
                "age" : { "gte" : 10, "lte" : 20 }
            }
        },
        "should" : [
            {  "term" : { "tag" : "wow" } },
            {  "term" : { "tag" : "elasticsearch" } }
        ],
        "minimum_should_match" : 1 # the should requirement is met as soon as one of its term queries matches; the other clauses must match as well
    }
}
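
The way the four clause types combine can be written down as a small Python evaluator. This is only a model of the matching logic, not of scoring or of Elasticsearch internals: clauses are plain predicate functions over a dict, and the helpers term and in_range are hypothetical stand-ins for the real term and range queries.

```python
def term(field, value):
    """Predicate: the field (scalar or list) contains exactly this value."""
    def clause(doc):
        stored = doc.get(field)
        values = stored if isinstance(stored, list) else [stored]
        return value in values
    return clause


def in_range(field, gte, lte):
    """Predicate: the field exists and lies in the inclusive range [gte, lte]."""
    return lambda doc: field in doc and gte <= doc[field] <= lte


def bool_match(doc, must=(), filters=(), must_not=(), should=(), minimum_should_match=None):
    """Evaluate bool-query matching; scoring is deliberately left out."""
    if minimum_should_match is None:
        # Default: 1 if there are should clauses but no must/filter, otherwise 0.
        minimum_should_match = 1 if should and not (must or filters) else 0
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


query = dict(
    must=[term("user", "kimchy")],
    filters=[term("tag", "tech")],
    must_not=[in_range("age", 10, 20)],
    should=[term("tag", "wow"), term("tag", "elasticsearch")],
    minimum_should_match=1,
)
print(bool_match({"user": "kimchy", "tag": ["tech", "wow"], "age": 30}, **query))
print(bool_match({"user": "kimchy", "tag": ["tech", "wow"], "age": 15}, **query))
```

The second document is rejected by must_not alone. Dropping minimum_should_match from the query would make the should clauses optional here, because a must clause is present.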

Complex logic

{
  "_source": "topics",
  "from": 0,
  "size": 100,
  "query": {
    "bool": {
      "should": [
       {
          "bool": {
            "must": [
              { "term": { "topics": 1}  },
              { "term": { "topics": 2}  }
            ]
          }
        },
        {
          "bool": {
            "must": [
              {"term": { "topics": 3 } },
              {"term": { "topics": 4}}
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
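
The request above expresses (topics 1 AND topics 2) OR (topics 3 AND topics 4). When such bodies are generated in application code, small builder functions keep the nesting readable. A minimal Python sketch, with hypothetical helper names, that produces the same structure:

```python
import json


def all_of(field, values):
    """bool/must group: the field must contain every value."""
    return {"bool": {"must": [{"term": {field: v}} for v in values]}}


def any_group(field, groups):
    """bool/should over must-groups: at least one whole group must match."""
    return {
        "bool": {
            "should": [all_of(field, group) for group in groups],
            "minimum_should_match": 1,
        }
    }


body = {
    "_source": "topics",
    "from": 0,
    "size": 100,
    "query": any_group("topics", [[1, 2], [3, 4]]),
}
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the request body of a _search call.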

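Exact-match queries

1. First check how your mapping is defined. If the name field is analyzed, query its keyword sub-field (this assumes the mapping defines one, as the default dynamic mapping for strings does):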
{ "query": { "term": { "name.keyword": "知乎" } } }

2. If name is not_analyzed, query the field directly:
{ "query": { "term": { "name": "知乎" } } }

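Querying a set of values

Using terms instead of term searches against a list of values, for example

{ "terms": { "hostname": ["xxx", "xxxx", "xxx", "x"] } }  # matches documents whose hostname field equals any of the values in the list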


Original article: https://www.cnblogs.com/zz27zz/p/8724030.html