  • search(14) - elastic4s - aggregation scope: global, filter, post-filter bucket

      Aggregations normally operate within the scope of the query. An aggregation request without a query is in fact computed within the scope of a match_all {} query:

    GET /cartxns/_search
    {
      "aggs": {
        "all_colors": {
          "terms": {"field" : "color.keyword"}
        }
      }
    }
    
    GET /cartxns/_search
    {
      "query": {
        "match_all": {}
      }, 
      "aggs": {
        "all_colors": {
          "terms": {"field" : "color.keyword"}
        }
      }
    }

    The two requests above return the same result:

      "aggregations" : {
        "all_colors" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : "red",
              "doc_count" : 4
            },
            {
              "key" : "blue",
              "doc_count" : 2
            },
            {
              "key" : "green",
              "doc_count" : 2
            }
          ]
        }
      }
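
To make the "no query means match_all" point concrete, here is a minimal plain-Scala sketch that involves no Elasticsearch at all. The object name, `termsByColor` and `matchAll` are hypothetical helpers for illustration, and the data is limited to the five documents whose `_source` is shown in the hits later in this article, so the counts are smaller than the 8-document result above.

```scala
object DefaultScopeSketch {
  final case class CarTxn(price: Int, color: String, make: String)

  // Only the five documents whose _source appears in this article (not the full index).
  val docs: List[CarTxn] = List(
    CarTxn(30000, "green", "ford"),
    CarTxn(25000, "blue", "ford"),
    CarTxn(10000, "red", "honda"),
    CarTxn(20000, "red", "honda"),
    CarTxn(20000, "red", "honda")
  )

  // Hypothetical stand-in for a terms aggregation: count documents per color.
  def termsByColor(scope: List[CarTxn]): Map[String, Int] =
    scope.groupBy(_.color).map { case (color, txns) => color -> txns.size }

  // match_all keeps every document, so it selects the same scope as no query at all.
  val matchAll: CarTxn => Boolean = _ => true

  val noQuery: Map[String, Int] = termsByColor(docs)
  val withMatchAll: Map[String, Int] = termsByColor(docs.filter(matchAll))

  def main(args: Array[String]): Unit =
    println(s"same buckets: ${noQuery == withMatchAll}")
}
```

Both values are computed over the same set of documents, which is why the two requests above return identical buckets.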

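    Although we usually want statistics computed within the query scope, sometimes we also need totals that ignore every query condition. For example, while computing the average sale price of one make, we may also want the average sale price across all makes. The all-makes average is a global bucket aggregation: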

    GET /cartxns/_search
    {
      "query" : {
        "match" : {"make.keyword": "ford"}
      },
      "aggs": {
        "avg_ford": {
          "avg": {
            "field": "price"
          }
        },
        "avg_all" : {
          "global": {},
          "aggs": {
            "avg_price": {
              "avg": {"field": "price"}
            }
          }
        }
        
      }
    
    }

    The search hits and aggregation results are as follows:

     "hits" : {
        "total" : {
          "value" : 2,
          "relation" : "eq"
        },
        "max_score" : 1.2809337,
        "hits" : [
          {
            "_index" : "cartxns",
            "_type" : "_doc",
            "_id" : "NGVXAnIBSDa1Wo5UqLc3",
            "_score" : 1.2809337,
            "_source" : {
              "price" : 30000,
              "color" : "green",
              "make" : "ford",
              "sold" : "2014-05-18"
            }
          },
          {
            "_index" : "cartxns",
            "_type" : "_doc",
            "_id" : "OWVYAnIBSDa1Wo5UTrf8",
            "_score" : 1.2809337,
            "_source" : {
              "price" : 25000,
              "color" : "blue",
              "make" : "ford",
              "sold" : "2014-02-12"
            }
          }
        ]
      },
      "aggregations" : {
        "avg_all" : {
          "doc_count" : 8,
          "avg_price" : {
            "value" : 26500.0
          }
        },
        "avg_ford" : {
          "value" : 27500.0
        }
      }

    Expressed in elastic4s:

      import com.sksamuel.elastic4s.ElasticDsl._

      // Query-scoped average (ford only) plus a global bucket that ignores the query.
      val aggGlob = search("cartxns").query(
        matchQuery("make.keyword","ford")
      ).aggregations(
        avgAggregation("single_avg").field("price"),
        globalAggregation("all_avg").subaggs(
          avgAggregation("avg_price").field("price")
        )
      )
      println(aggGlob.show)   // print the generated request

      // `client` is assumed to be an existing ElasticClient.
      val globResult = client.execute(aggGlob).await

      if (globResult.isSuccess) {
        // The global bucket wraps its sub-aggregation, so read it one level down.
        val gavg = globResult.result.aggregations.global("all_avg").avg("avg_price")
        val savg = globResult.result.aggregations.avg("single_avg")
        println(s"${savg.value},${gavg.value}")
        globResult.result.hits.hits.foreach(h => println(s"${h.sourceAsMap}"))
      } else println(s"error: ${globResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/cartxns/_search?
    StringEntity({"query":{"match":{"make.keyword":{"query":"ford"}}},"aggs":{"single_avg":{"avg":{"field":"price"}},"all_avg":{"global":{},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}},Some(application/json))
    27500.0,26500.0
    Map(price -> 30000, color -> green, make -> ford, sold -> 2014-05-18)
    Map(price -> 25000, color -> blue, make -> ford, sold -> 2014-02-12)
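
The difference between the two scopes can also be sketched in plain Scala without Elasticsearch. This is only an illustration under stated assumptions: the object name and `avgPrice` are hypothetical, and the data is limited to the five documents whose `_source` is shown in this article. The "global" average here is therefore taken over five documents and will not equal the 26500.0 computed over the full 8-document index.

```scala
object GlobalScopeSketch {
  final case class CarTxn(price: Int, color: String, make: String)

  // Only the five documents whose _source appears in this article (not the full index).
  val docs: List[CarTxn] = List(
    CarTxn(30000, "green", "ford"),
    CarTxn(25000, "blue", "ford"),
    CarTxn(10000, "red", "honda"),
    CarTxn(20000, "red", "honda"),
    CarTxn(20000, "red", "honda")
  )

  // Hypothetical stand-in for an avg aggregation (yields NaN for an empty scope).
  def avgPrice(scope: List[CarTxn]): Double =
    scope.map(_.price.toDouble).sum / scope.size

  // Query scope: only the documents matched by the query.
  val queryHits: List[CarTxn] = docs.filter(_.make == "ford")
  val singleAvg: Double = avgPrice(queryHits)

  // Global scope: every document, regardless of the query.
  val allAvg: Double = avgPrice(docs)

  def main(args: Array[String]): Unit =
    println(s"$singleAvg,$allAvg")
}
```

`singleAvg` only sees the ford documents, while `allAvg` deliberately bypasses the query, which is what the global bucket does inside a single request.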

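    A filter bucket narrows the query results further and aggregates only over what remains. For example: query all honda transactions, but total only the honda sales for one particular month: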

    GET /cartxns/_search
    {
        "query": {
          "match": {
            "make.keyword": "honda"
          }
        },
        "aggs": {
          "sales_this_month": {
            "filter": {
              "range" : {"sold" : { "gte" : "2014-10-01", "lte" : "2014-11-01" }}
            },
            "aggs": {
              "month_total": {
                "sum": {"field": "price"}
              }
            }
          }
        }
    }

    First, the query hits should be unaffected. In addition, we get the sales total for one month within the queried make:

     "hits" : {
        "total" : {
          "value" : 3,
          "relation" : "eq"
        },
        "max_score" : 0.9444616,
        "hits" : [
          {
            "_index" : "cartxns",
            "_type" : "_doc",
            "_id" : "MmVXAnIBSDa1Wo5UqLc3",
            "_score" : 0.9444616,
            "_source" : {
              "price" : 10000,
              "color" : "red",
              "make" : "honda",
              "sold" : "2014-10-28"
            }
          },
          {
            "_index" : "cartxns",
            "_type" : "_doc",
            "_id" : "M2VXAnIBSDa1Wo5UqLc3",
            "_score" : 0.9444616,
            "_source" : {
              "price" : 20000,
              "color" : "red",
              "make" : "honda",
              "sold" : "2014-11-05"
            }
          },
          {
            "_index" : "cartxns",
            "_type" : "_doc",
            "_id" : "N2VXAnIBSDa1Wo5UqLc3",
            "_score" : 0.9444616,
            "_source" : {
              "price" : 20000,
              "color" : "red",
              "make" : "honda",
              "sold" : "2014-11-05"
            }
          }
        ]
      },
      "aggregations" : {
        "sales_this_month" : {
          "doc_count" : 1,
          "month_total" : {
            "value" : 10000.0
          }
        }
      }

    The elastic4s version:

      // Query all honda transactions; the filter bucket narrows only the aggregation scope.
      val aggfilter = search("cartxns").query(
        matchQuery("make.keyword","honda")
      ).aggregations(
        filterAgg("sales_the_month",rangeQuery("sold").gte("2014-10-01").lte("2014-11-01"))
          .subaggs(sumAggregation("monthly_sales").field("price"))
      )
      println(aggfilter.show)   // print the generated request

      val filterResult = client.execute(aggfilter).await

      if (filterResult.isSuccess) {
        // The sum is computed inside the filter bucket only.
        val ms = filterResult.result.aggregations.filter("sales_the_month")
                  .sum("monthly_sales").value
        println(s"${ms}")
        // The hits still contain every honda document.
        filterResult.result.hits.hits.foreach(h => println(s"${h.sourceAsMap}"))
      } else println(s"error: ${filterResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/cartxns/_search?
    StringEntity({"query":{"match":{"make.keyword":{"query":"honda"}}},"aggs":{"sales_the_month":{"filter":{"range":{"sold":{"gte":"2014-10-01","lte":"2014-11-01"}}},"aggs":{"monthly_sales":{"sum":{"field":"price"}}}}}},Some(application/json))
    10000.0
    Map(price -> 10000, color -> red, make -> honda, sold -> 2014-10-28)
    Map(price -> 20000, color -> red, make -> honda, sold -> 2014-11-05)
    Map(price -> 20000, color -> red, make -> honda, sold -> 2014-11-05)
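
As a rough model of what the filter bucket does, the plain-Scala sketch below applies the query first and then narrows only the aggregation input. No Elasticsearch is involved. The object name and `inRange` are hypothetical, the data is limited to the five documents whose `_source` is shown in this article, and the date range is treated as inclusive on both ends to mirror `gte`/`lte`.

```scala
object FilterBucketSketch {
  import java.time.LocalDate

  final case class CarTxn(price: Int, make: String, sold: LocalDate)

  // Only the five documents whose _source appears in this article (not the full index).
  val docs: List[CarTxn] = List(
    CarTxn(30000, "ford", LocalDate.parse("2014-05-18")),
    CarTxn(25000, "ford", LocalDate.parse("2014-02-12")),
    CarTxn(10000, "honda", LocalDate.parse("2014-10-28")),
    CarTxn(20000, "honda", LocalDate.parse("2014-11-05")),
    CarTxn(20000, "honda", LocalDate.parse("2014-11-05"))
  )

  val from: LocalDate = LocalDate.parse("2014-10-01")
  val to: LocalDate = LocalDate.parse("2014-11-01")

  // Inclusive on both ends, like gte/lte in the range query.
  def inRange(d: LocalDate): Boolean = !d.isBefore(from) && !d.isAfter(to)

  // Query scope: these are the hits, and the filter bucket does not change them.
  val queryHits: List[CarTxn] = docs.filter(_.make == "honda")

  // Filter bucket: a narrower view of the query hits, used only for aggregation.
  val bucket: List[CarTxn] = queryHits.filter(t => inRange(t.sold))
  val monthlySales: Int = bucket.map(_.price).sum

  def main(args: Array[String]): Unit =
    println(s"hits=${queryHits.size}, bucket=${bucket.size}, sum=$monthlySales")
}
```

The hits stay at three documents while the sum covers only the one sale that falls inside the range, matching the `doc_count` of 1 in the response above.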

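    The last one is post-filter. A post-filter also filters the query results, but it is applied only after the whole query has completed. In other words, if the request also contains aggregations, those aggregations are not affected by the filter: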

    GET /cartxns/_search
    {
      "query": {
        "match": {
          "make.keyword": "ford"
        }
      },
      "post_filter": {
        "match" : {
          "color.keyword" : "blue"
        }
      },
      "aggs": {
        "colors": {
          "terms": {
            "field": "color.keyword",
            "size": 10
          }
        }
      }
    }

    The query and aggregation results are as follows:

      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.2809337,
        "hits" : [
          {
            "_index" : "cartxns",
            "_type" : "_doc",
            "_id" : "OWVYAnIBSDa1Wo5UTrf8",
            "_score" : 1.2809337,
            "_source" : {
              "price" : 25000,
              "color" : "blue",
              "make" : "ford",
              "sold" : "2014-02-12"
            }
          }
        ]
      },
      "aggregations" : {
        "colors" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : "blue",
              "doc_count" : 1
            },
            {
              "key" : "green",
              "doc_count" : 1
            }
          ]
        }
      }

    As shown, the hits contain only the documents that passed the post-filter, while the aggregation was not affected by it.

    elastic4s example code:

      // The post-filter trims the returned hits only; the terms aggregation still sees every ford document.
      val aggPost = search("cartxns").query(
        matchQuery("make.keyword","ford")
      ).postFilter(matchQuery("color.keyword","blue"))
        .aggregations(
          termsAgg("colors","color.keyword")
        )

      println(aggPost.show)   // print the generated request

      val postResult = client.execute(aggPost).await

      if (postResult.isSuccess) {
        // Hits: only the blue ford.
        postResult.result.hits.hits.foreach(h => println(s"${h.sourceAsMap}"))
        // Buckets: every color among the ford documents.
        postResult.result.aggregations.terms("colors").buckets
          .foreach(b => println(s"${b.key},${b.docCount}"))
      } else println(s"error: ${postResult.error.causedBy.getOrElse("unknown")}")
    
    ...
    
    POST:/cartxns/_search?
    StringEntity({"query":{"match":{"make.keyword":{"query":"ford"}}},"post_filter":{"match":{"color.keyword":{"query":"blue"}}},"aggs":{"colors":{"terms":{"field":"color.keyword"}}}},Some(application/json))
    Map(price -> 25000, color -> blue, make -> ford, sold -> 2014-02-12)
    blue,1
    green,1
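
The order of operations behind post-filter can be modeled in a few lines of plain Scala, again without Elasticsearch. This is a sketch under stated assumptions: the object and value names are hypothetical, and the data is limited to the five documents whose `_source` is shown in this article.

```scala
object PostFilterSketch {
  final case class CarTxn(price: Int, color: String, make: String)

  // Only the five documents whose _source appears in this article (not the full index).
  val docs: List[CarTxn] = List(
    CarTxn(30000, "green", "ford"),
    CarTxn(25000, "blue", "ford"),
    CarTxn(10000, "red", "honda"),
    CarTxn(20000, "red", "honda"),
    CarTxn(20000, "red", "honda")
  )

  // Step 1: the query selects the scope.
  val queryScope: List[CarTxn] = docs.filter(_.make == "ford")

  // Step 2: aggregations run over the query scope, before any post-filter.
  val colorBuckets: Map[String, Int] =
    queryScope.groupBy(_.color).map { case (color, txns) => color -> txns.size }

  // Step 3: the post-filter trims only the hits that are returned.
  val hits: List[CarTxn] = queryScope.filter(_.color == "blue")

  def main(args: Array[String]): Unit = {
    hits.foreach(println)
    colorBuckets.foreach { case (color, count) => println(s"$color,$count") }
  }
}
```

Because the buckets are built in step 2, the green ford is still counted even though step 3 removes it from the hits.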
  • Original article: https://www.cnblogs.com/tiger-xc/p/12902536.html