zoukankan      html  css  js  c++  java
  • elasticsearch常用查询

    query DSL
    match 查询
    { 
        "match": { 
            "tweet": "About Search" 
        } 
    }
    注:match查询只能就指定某个确切字段某个确切的值进行搜索,做精确匹配搜索时,
    你最好用过滤语句,因为过滤语句可以缓存数据。
    
    match_phrase 查询
    {
        "query": {
            "match_phrase": {
                "title": "quick brown fox"
            }
        }
    }
    注:与match相比,不会拆分查询条件
    参考:https://blog.csdn.net/liuxiao723846/article/details/78365078?locationNum=2&fps=1
    
    
    multi_match查询:对多个field查询
    { 
        "multi_match": { 
            "query":    "full text search", 
            "fields":   [ "title", "body" ] 
        } 
    }
    
    bool 查询
    must: 查询指定文档一定要被包含。
    filter: 和must类似,但不计分。
    must_not: 查询指定文档一定不要被包含。
    should: 查询指定文档,满足一个条件就返回。
    POST _search
    {
      "query": {
        "bool" : {
          "must" : {
            "term" : { "user" : "kimchy" }
          },
          "filter": {
            "term" : { "tag" : "tech" }
          },
          "must_not" : {
            "range" : {
              "age" : { "gte" : 10, "lte" : 20 }
            }
          },
          "should" : [
            { "term" : { "tag" : "wow" } },
            { "term" : { "tag" : "elasticsearch" } }
          ],
          "minimum_should_match" : 1,
          "boost" : 1.0
        }
      }
    }
    
    prefix 查询:以什么字符开头
    { 
      "query": { 
        "prefix": { 
          "hostname": "wx" 
        } 
      } 
    }
    
    wildcards 查询:通配符查询
    {
        "query": {
            "wildcard": {
                "postcode": "W?F*HW" 
            }
        }
    }
    注:?用来匹配任意字符,*用来匹配零个或者多个字符
    
    regexp 查询:正则表达式查询
    {
        "query": {
            "regexp": {
                "postcode": "W[0-9].+" 
            }
        }
    }
    
    注:
    1. 字段、词条
    字段"Quick brown fox" 会产生
    词条"quick","brown"和"fox"
    2. prefix,wildcard以及regexp查询基于词条
    ---------------------------------------------------------------------
    
    filter DSL
    term 过滤:精确匹配
    { 
      "query": { 
        "term": { 
          "age": 26
        } 
      } 
    }
    
    terms 过滤:指定多个匹配条件
    { 
      "query": { 
        "terms": { 
          "status": [ 304, 302 ] 
        } 
      } 
    }
    
    range 过滤
    { 
        "range": { 
            "age": { 
                "gte":  20, 
                "lt":   30 
            } 
        } 
    }
    
    范围操作符包含:
    gt :: 大于
    gte:: 大于等于
    lt :: 小于
    lte:: 小于等于
    
    exists/missing 过滤:过滤字段是否存在
    { 
        "exists":   { 
            "field":    "title" 
        } 
    }
    
    bool过滤:合并多个过滤条件查询结果的布尔逻辑
    { 
        "bool": { 
            "must":     { "term": { "folder": "inbox" }}, 
            "must_not": { "term": { "tag":    "spam"  }}, 
            "should": [ 
                        { "term": { "starred": true   }}, 
                        { "term": { "unread":  true   }} 
            ] 
        } 
    }
    注:
    must :: 多个查询条件的完全匹配,相当于 and。
    must_not :: 多个查询条件的相反匹配,相当于 not。
    should :: 至少有一个查询条件匹配, 相当于 or。
    must例子,多个条件使用[]
    {
        "query": {
            "bool": {
                "must": [{
                        "term": {
                            "category": "38"
                        }
                    },
                    {
                        "range": {
                            "time": {
                                "gte": "1539827880",
                                "lt": "1539827881"
                            }
                        }
                    }
                ],
                "must_not": [],
                "should": []
            }
        },
        "from": 0,
        "size": 10,
        "sort": [],
        "aggs": {}
    }

    聚合:

    1. 筛选出指定时间的记录并求出sum(collection),其中collection字段为数值
    {
        "query" : {
            "match" : {"time":1539827880}
        },
        "aggs" : {
            "connections" : { "sum" : { "field" : "connection" } }
        }
    }
    2. 筛选出时间范围内的数据然后根据category进行分类,每个类别中计算connection的总数,并根据result排序
    {
        "query": {
            "range": {
                "time": {
                    "gte": 1539827880,
                    "lt": 1539827881
                }
            }
        },
        "aggs": {
            "_result": {
                "terms": {
                    "field": "category",
    "order": {"result":"desc"}
    }, "aggs": { "result": { "sum": { "field": "connection" } } } } } }
    3. 多重聚合
    {
        "size": 0,
        "query": {
          "range": {
            "time": {
              "gte": 1539741480,
              "lte": 1539827880
            }
          }
        },
        "aggs": {
          "_result": {
            "terms": {
              "field": "category",
              "order": {
                "result": "desc"
              },
              "size": 5
            },
            "aggs": {
              "result": {
                "sum": {
                  "field": "connection"
                }
              },
              "IPs": {
                "terms": {
                  "field": "ip",
                  "order": {
                    "ip_result" : "desc"
                  },
                  "size": 5
                },
                "aggs": {
                  "ip_result": {
                  "sum": {
                    "field": "connection"
                  }
                }
              }
            }
          }
        }
      }
    } 
    #DE log,先筛选type=2的日志,然后根据source_address统计repeat_times的和(即该source_address的log出现的次数),倒排
    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "type": 2 
              }
            }
          ]
        }
      },
      "aggs": {
        "all_source_address": {
            "terms": {
                "field": "source_address",
                "order": {"repeat_times_total": "desc"}  #注意:根据聚合后的字段来排序
            },
            "aggs": {
                "repeat_times_total": {
                    "sum": {"field": "repeat_times"}
                }
            }
        }
      }
    }
  • 相关阅读:
    1. Deep Q-Learning
    Ⅶ. Policy Gradient Methods
    Ⅴ. Temporal-Difference Learning
    idea在service窗口中显示多个服务
    pycharm下运行flask框架的脚本时报错
    windows下部署 flask (win10+flask+nginx)
    pip install selenium报错解决方法
    pip 及 selenium安装命令
    动作捕捉系统用于模仿学习
    柔性微创手术机器人性能实验验证
  • 原文地址:https://www.cnblogs.com/stellar/p/9825598.html
Copyright © 2011-2022 走看看