  • Elasticsearch: Highlight Queries and Aggregation Queries

    Elasticsearch Highlight Queries

    1 Introduction

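    When a result set contains many matching results, how can we spot at a glance the one we want? Search engines solve this with highlighting: search for elasticsearch, and every occurrence of elasticsearch in the results is highlighted, much like on a Baidu results page.

    [Image: search results page with the term elasticsearch highlighted]

    How do we do the same?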

    2 Preparing the Data

    PUT lqz/doc/4
    {
      "name":"石头",
      "age":29,
      "from":"gu",
      "desc":"粗中有细,狐假虎威",
      "tags":["", "",""]
    }

    3 Default Highlighting

    Let's run a query:

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "name": "石头"
        }
      },
      "highlight": {
        "fields": {
          "name": {}
        }
      }
    }
    
    # We use the highlight property to highlight results: list the fields to highlight under fields, and Elasticsearch does the rest.
    The result is as follows:
    
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 1.5098256,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : 1.5098256,
            "_source" : {
              "name" : "石头",
              "age" : 29,
              "from" : "gu",
              "desc" : "粗中有细,狐假虎威",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "highlight" : {
              "name" : [
                "<em>石</em><em>头</em>"
              ]
            }
          }
        ]
      }
    }
    Query result

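    In the example above, Elasticsearch automatically wraps the matched terms in tags (<em> by default, as the highlight section of the result shows) so that they can be rendered on a page. Each character gets its own tag here because the default standard analyzer splits Chinese text into single-character terms.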
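
    A client usually prefers the highlighted fragment when one exists and falls back to the raw _source value otherwise. The following is a minimal Python sketch of that choice, assuming the response has already been parsed into a dict shaped like the result above. The helper name display_value is hypothetical and not part of any Elasticsearch client.

```python
def display_value(hit, field):
    """Return the highlighted text for `field`, or the plain _source value.

    `hit` is one element of response["hits"]["hits"], already parsed from JSON.
    """
    fragments = hit.get("highlight", {}).get(field)
    if fragments:
        # Elasticsearch returns a list of fragments per highlighted field.
        return " ... ".join(fragments)
    return str(hit["_source"].get(field, ""))


# A hit trimmed down from the result shown above.
hit = {
    "_source": {"name": "石头", "age": 29},
    "highlight": {"name": ["<em>石</em><em>头</em>"]},
}

print(display_value(hit, "name"))  # highlighted fragment
print(display_value(hit, "age"))   # no highlight for age, falls back to _source
```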

    4 Custom Highlighting

    GET lqz/chengyuan/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "highlight": {
        "pre_tags": "<b class='key' style='color:red'>",
        "post_tags": "</b>",
        "fields": {
          "from": {}
        }
      }
    }
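    In the example above, inside highlight, pre_tags supplies the opening half of our custom tag; this is also where we can add attributes and styles to the tag. post_tags supplies the closing half, and together they form a complete tag. Which content ends up inside the tag is still decided by fields.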
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 0.5753642,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "chengyuan",
            "_id" : "1",
            "_score" : 0.5753642,
            "_source" : {
              "name" : "老二",
              "age" : 30,
              "sex" : "male",
              "birth" : "1070-10-11",
              "from" : "gu",
              "desc" : "皮肤黑,武器长,性格直",
              "tags" : [
                "",
                "",
                ""
              ]
            },
            "highlight" : {
              "from" : [
                "<b class='key' style='color:red'>gu</b>"
              ]
            }
          }
        ]
      }
    }
    Query result

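    Note: quotation marks inside the attributes or styles of a custom tag should be ASCII single quotes, so that they do not clash with the double quotes of the surrounding Elasticsearch JSON syntax.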
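
    When the request body is built in code rather than typed by hand, the JSON serializer takes care of the quoting. A small Python sketch using only the standard library; it only prints the payload, and sending it with an HTTP client is assumed to happen elsewhere.

```python
import json

# Same query as above, but the tag deliberately uses double quotes.
body = {
    "query": {"match": {"from": "gu"}},
    "highlight": {
        "pre_tags": '<b class="key" style="color:red">',
        "post_tags": "</b>",
        "fields": {"from": {}},
    },
}

payload = json.dumps(body, ensure_ascii=False)
# json.dumps escapes the inner double quotes as \" so the payload stays valid JSON.
print(payload)
```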

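    With a separate front end and back end, how do you handle this? Return the string containing <b class='key' style='color:red'> as-is in the JSON response and let the front end render it.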
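
    If the indexed text can contain user-supplied markup, rendering a fragment as raw HTML is risky. One possible approach, sketched below in Python, is to escape everything except the highlight tags; the same logic would apply in front-end code. It assumes the tags are exactly the pre_tags and post_tags used in the query above, and the function name render_fragment is hypothetical.

```python
import html

PRE = "<b class='key' style='color:red'>"
POST = "</b>"


def render_fragment(fragment, pre=PRE, post=POST):
    """Escape all text in a highlight fragment but keep the highlight tags."""
    out = []
    for i, part in enumerate(fragment.split(pre)):
        if i > 0:
            out.append(pre)  # put back the opening tag removed by split()
        # Escape the text around each closing tag, then rejoin with the tag.
        out.append(post.join(html.escape(piece) for piece in part.split(post)))
    return "".join(out)


print(render_fragment(PRE + "gu" + POST + " & <script>"))
```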

    Elasticsearch Aggregation Queries

    • avg

    • max

    • min

    • sum

    avg

    # Average age of the people whose `from` is `gu`.
    # Roughly equivalent SQL: select avg(age) as my_avg from user where `from` = 'gu';
    
    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "aggs": {
        "my_avg": {
          "avg": {
            "field": "age"
          }
        }
      },
      "_source": ["name", "age"]
    }

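    In the example above, we first match the documents whose from is gu. On top of that we compute an average, which calls for an aggregation. The aggregation syntax lives under aggs, and my_avg is an alias we choose for the result; it holds the computed average. Which field is the aggregation computed on? age. What do we compute about it? avg, the average age.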

    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : 0.6931472,
        "hits" : [
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "4",
            "_score" : 0.6931472,
            "_source" : {
              "name" : "石头",
              "age" : 29
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "1",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "顾老二",
              "age" : 30
            }
          },
          {
            "_index" : "lqz",
            "_type" : "doc",
            "_id" : "3",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "龙套偏房",
              "age" : 22
            }
          }
        ]
      },
      "aggregations" : {
        "my_avg" : {
          "value" : 27.0
        }
      }
    }
    Query result

    In the example above, the average appears at the end of the query result: 27.
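
    The value can be checked by hand from the three hits in the result. A quick Python check, using only the names and ages shown above:

```python
# name -> age for the three hits returned above (from == "gu").
ages = {"石头": 29, "顾老二": 30, "龙套偏房": 22}

# Same computation as the avg aggregation: sum of the values over their count.
my_avg = sum(ages.values()) / len(ages)
print(my_avg)
```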

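    Although we have already filtered the fields with _source, that is not enough. What if I don't want to see the documents at all, only the average? Don't forget size!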

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "aggs": {
        "my_avg": {
          "avg": {
            "field": "age"
          }
        }
      },
      "size": 0, 
      "_source": ["name", "age"]
    }

    In the example above, we only need to add size to the original query. It sets how many hits to return, so 0 means no hits are returned.

    {
      "took" : 8,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : 0.0,
        "hits" : [ ]
      },
      "aggregations" : {
        "my_avg" : {
          "value" : 27.0
        }
      }
    }
    Query result
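
    On the client side, the aggregation value is read from the aggregations section under the alias chosen in the query. A minimal Python sketch that parses a trimmed copy of the response above (the _shards section is left out). Note that hits.total is a plain number in the responses shown in this post; newer Elasticsearch versions return an object there, so this shape is specific to the version used here.

```python
import json

# Trimmed copy of the response body shown above.
raw = """
{
  "took": 8,
  "timed_out": false,
  "hits": {"total": 3, "max_score": 0.0, "hits": []},
  "aggregations": {"my_avg": {"value": 27.0}}
}
"""

response = json.loads(raw)
matched = response["hits"]["total"]                     # documents matched by the query
average = response["aggregations"]["my_avg"]["value"]   # keyed by the alias my_avg
print(f"{matched} documents matched, average age {average}")
```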

    max

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "aggs": {
        "my_max": {
          "max": {
            "field": "age"
          }
        }
      },
      "size": 0
    }

    In the example above, we only need to replace avg with max in the query.

    min

    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "aggs": {
        "my_min": {
          "min": {
            "field": "age"
          }
        }
      },
      "size": 0
    }

    sum

    # Sum of the ages
    GET lqz/doc/_search
    {
      "query": {
        "match": {
          "from": "gu"
        }
      },
      "aggs": {
        "my_sum": {
          "sum": {
            "field": "age"
          }
        }
      },
      "size": 0
    }
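
    The four queries differ only in the name of the metric. A short Python sketch makes that explicit by building the request body from the metric name; the helper metric_query is hypothetical, and the local values are computed from the three ages (29, 30, 22) matched by from == gu in the earlier result.

```python
def metric_query(metric, field="age", origin="gu"):
    """Build the search body for one single-value metric aggregation."""
    if metric not in {"avg", "max", "min", "sum"}:
        raise ValueError(f"unsupported metric: {metric}")
    return {
        "query": {"match": {"from": origin}},
        "aggs": {f"my_{metric}": {metric: {"field": field}}},
        "size": 0,  # return only the aggregation, no hits
    }


# Local equivalents over the three ages matched by from == "gu" above.
ages = [29, 30, 22]
local = {
    "avg": sum(ages) / len(ages),
    "max": max(ages),
    "min": min(ages),
    "sum": sum(ages),
}

for metric, value in local.items():
    print(metric, metric_query(metric)["aggs"], value)
```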

    Grouped Queries

    Now I want to look at everyone's age bracket, grouping people into 15~20, 20~25 and 25~30, and compute the average age of each group.

    GET lqz/doc/_search
    {
      "size": 0, 
      "query": {
        "match_all": {}
      },
      "aggs": {
        "age_group": {
          "range": {
            "field": "age",
            "ranges": [
              {
                "from": 15,
                "to": 20
              },
              {
                "from": 20,
                "to": 25
              },
              {
                "from": 25,
                "to": 30
              }
            ]
          },
          "aggs": {
            "my_avg": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      }
    }
    {
     "took" : 1,
     "timed_out" : false,
     "_shards" : {
       "total" : 5,
       "successful" : 5,
       "skipped" : 0,
       "failed" : 0
     },
     "hits" : {
       "total" : 5,
       "max_score" : 0.0,
       "hits" : [ ]
     },
     "aggregations" : {
       "age_group" : {
         "buckets" : [
           {
             "key" : "15.0-20.0",
             "from" : 15.0,
             "to" : 20.0,
             "doc_count" : 1,
             "my_avg" : {
               "value" : 18.0
             }
           },
           {
             "key" : "20.0-25.0",
             "from" : 20.0,
             "to" : 25.0,
             "doc_count" : 1,
             "my_avg" : {
               "value" : 22.0
             }
           },
           {
             "key" : "25.0-30.0",
             "from" : 25.0,
             "to" : 30.0,
             "doc_count" : 2,
             "my_avg" : {
               "value" : 27.0
             }
           }
         ]
       }
     }
    }
    Query result

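    In the example above, under the custom alias age_group inside aggs, we use range to do the grouping: field says to group by age, ranges lists the groups, and from and to bound each one (from is inclusive, to is exclusive). We define three groups as required. Inside age_group, a nested aggs computes the average of age for each group. The result shows the three buckets, where doc_count is the number of documents in each. The buckets hold 4 documents in total although the query matched 5: the remaining document has age 30, which falls outside every range because to is exclusive.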
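
    The bucketing can be reproduced locally to see why the document with age 30 is left out. The Python sketch below treats from as inclusive and to as exclusive, as the range aggregation does. The list of ages is an assumption: 29, 30 and 22 appear in the earlier results, while 18 and 25 are inferred from the bucket averages above and are not stated in the post.

```python
def range_buckets(values, ranges):
    """Group values into [from, to) buckets and average each bucket."""
    buckets = []
    for lo, hi in ranges:
        # `from` is inclusive and `to` is exclusive, as in the range aggregation.
        members = [v for v in values if lo <= v < hi]
        buckets.append({
            "key": f"{float(lo)}-{float(hi)}",
            "doc_count": len(members),
            "my_avg": sum(members) / len(members) if members else None,
        })
    return buckets


# Assumed ages for the five documents (18 and 25 are inferred, see above).
ages = [18, 22, 25, 29, 30]

result = range_buckets(ages, [(15, 20), (20, 25), (25, 30)])
for bucket in result:
    print(bucket)
```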

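    Note: an aggregation always works on the result of the query. The documents are matched first, and the aggregation is then applied to those documents.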

  • Original article: https://www.cnblogs.com/baohanblog/p/12852658.html