  • ES Series, Part 14: ES Aggregation Analysis (Introduction, Metric Aggregations, Bucket Aggregations)

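    I. Introduction to Aggregation Analysis

    1. What is aggregation analysis in ES?

    Aggregation analysis is an important database feature: it performs aggregate computations over the data set returned by a query, such as finding the maximum or minimum of a field (or of a computed expression), or computing a sum or an average. As both a search engine and a database, ES provides powerful aggregation capabilities as well.

    Aggregations that compute indicators such as max, min, sum, and average over a data set are called metric aggregations in ES.

    Besides aggregate functions, relational databases can also group the queried data with group by and then compute metrics per group. In ES, this grouping is called bucketing (bucket aggregations).

    ES also provides matrix aggregations and pipeline aggregations, which were still being refined at the time of writing.

    2. How to write an aggregation query in ES

    Aggregations are defined under the aggregations node of the search request body, using the following syntax: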

    "aggregations" : {
        "<aggregation_name>" : { <!-- name of the aggregation -->
            "<aggregation_type>" : { <!-- type of the aggregation -->
                <aggregation_body> <!-- body: which fields to aggregate on -->
            }
            [,"meta" : {  [<meta_data_body>] } ]? <!-- metadata -->
            [,"aggregations" : { [<sub_aggregation>]+ } ]? <!-- sub-aggregations nested inside this aggregation -->
        }
        [,"<aggregation_name_2>" : { ... } ]* <!-- name of another aggregation -->
    }

    Note:

    aggregations can also be abbreviated as aggs.
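    As a rough illustration of the syntax above, the request body can be assembled as a plain dictionary before it is sent. The helper name named_agg and its parameters are hypothetical, chosen only for this sketch; they are not part of any ES client.

```python
import json


def named_agg(name, agg_type, field, sub_aggs=None):
    """Build one named aggregation node (hypothetical helper, not an ES API)."""
    node = {name: {agg_type: {"field": field}}}
    if sub_aggs:
        # Sub-aggregations are nested under the same name, next to the type.
        node[name]["aggs"] = sub_aggs
    return node


# A single metric aggregation, using the short form "aggs".
body = {"size": 0, "aggs": named_agg("maxage", "max", "age")}

# A bucket aggregation with a nested metric aggregation.
nested = named_agg("age_terms", "terms", "age",
                   sub_aggs=named_agg("max_age", "max", "age"))

print(json.dumps(body, indent=2))
print(json.dumps(nested, indent=2))
```

    The printed JSON has the same shape as the request bodies used in the examples of this article.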

    3. Value sources for aggregations

    The values being aggregated can come from a field, or from the result of a script.

    II. Metric Aggregations

    1. max min sum avg

    Example 1: find the maximum age across all records

    POST /book1/_search?pretty
    
    {
      "size": 0, 
      "aggs": {
        "maxage": {
          "max": {
            "field": "age"
          }
        }
      }
    }

    Result 1:

    {
        "took": 4,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "maxage": {
                "value": 54
            }
        }
    }

    Example 2: add a query condition and find the maximum age among records whose name contains 'test':

    POST /book1/_search?pretty
    
    {
      "query":{
         "term":{
             "name":"test"
         }    
      },
      "size": 2, 
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ],
      "aggs": {
        "maxage": {
          "max": {
            "field": "age"
          }
        }
      }
    }

    Result 2:

    {
        "took": 3,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 5,
            "max_score": null,
            "hits": [
                {
                    "_index": "book1",
                    "_type": "english",
                    "_id": "6IUkUmUBRzBxBrDgFok2",
                    "_score": null,
                    "_source": {
                        "name": "test goog my money",
                        "age": [
                            14,
                            54,
                            45,
                            34
                        ],
                        "class": "dsfdsf",
                        "addr": "中国"
                    },
                    "sort": [
                        54
                    ]
                },
                {
                    "_index": "book1",
                    "_type": "english",
                    "_id": "54UiUmUBRzBxBrDgfIl9",
                    "_score": null,
                    "_source": {
                        "name": "test goog my money",
                        "age": [
                            11,
                            13,
                            14
                        ],
                        "class": "dsfdsf",
                        "addr": "中国"
                    },
                    "sort": [
                        14
                    ]
                }
            ]
        },
        "aggregations": {
            "maxage": {
                "value": 54
            }
        }
    }

    Example 3: take the value from a script; compute the average age of all records, and the average of age plus 10

    POST /book1/_search?pretty
    {
      "size":0,
      "aggs": {
        "avg_age": {
          "avg": {
            "script": {
              "source": "doc.age.value"
            }
          }
        },
        "avg_age10": {
          "avg": {
            "script": {
              "source": "doc.age.value + 10"
            }
          }
        }
      }
    }

    Result 3:

    {
        "took": 3,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "avg_age": {
                "value": 7.585365853658536
            },
            "avg_age10": {
                "value": 17.585365853658537
            }
        }
    }

    Example 4: specify a field and use _value in the script to read the field's value

    POST  /book1/_search?pretty
    {
      "size":0,
      "aggs": {
        "sun_age": {
          "sum": {
              "field":"age",
            "script": {
              "source": "_value * 2"
            }
          }
        }
      }
    }

    Result 4:

    {
        "took": 4,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "sun_age": {
                "value": 942
            }
        }
    }

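    Example 5: supply a default value for documents that lack the field. If missing is not specified, documents without a value for the field are ignored: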

    POST /book1/_search?pretty
    
    {
      "size":0,
      "aggs": {
        "sun_age": {
          "avg": {
              "field":"age",
            "missing":15
          }
        }
      }
    }

    Result 5:

    {
        "took": 12,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "sun_age": {
                "value": 12.847826086956522
            }
        }
    }
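    The figure in Result 5 can be cross-checked by hand. The sketch below reuses the sum (471) and count (38) reported by the stats aggregation in section 5, and assumes that 8 documents carry no age value. That number is an inference that makes the arithmetic agree with the response (it also matches the doc_count of the "0" bucket in Example 10 of the terms section); the source does not state it.

```python
import math

sum_of_values = 471      # "sum" from the stats result in section 5
count_of_values = 38     # "count" from the stats result in section 5
docs_without_age = 8     # assumption inferred from the responses, not stated
missing_default = 15     # the "missing" value used in Example 5

# Each document without an age contributes the default value once.
avg_with_missing = (sum_of_values + missing_default * docs_without_age) / (
    count_of_values + docs_without_age
)
print(avg_with_missing)
```

    Under that assumption the computed average agrees with the 12.8478... returned by ES.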

    2. Document count (count)

    Example 1: count the documents in index book1 (type english) whose age is 12

    POST book1/english/_count
    {
        "query":{
            "match":{
                "age":12
            }
        }
    }

    Result 1:

    {
        "count": 16,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        }
    }

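    3. Value count: counts the values of a field in the matched documents (a multi-valued field contributes each of its values)

    Example 1: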

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_count":{
                "value_count":{
                    "field":"age"
                }
                
            }
        }
    }

    Result 1:

    {
        "took": 1,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_count": {
                "value": 38
            }
        }
    }

    4. cardinality: count of distinct values

    Example 1:

    POST  /book1/_search?size=0
    {
        "aggs":{
            "age_count":{
                "value_count":{
                    "field":"age"
                }
                
            },
            "name_count":{
                "cardinality":{
                    "field":"age"
                }
            }
        }
    }

    Result 1:

    {
        "took": 16,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "name_count": {
                "value": 11
            },
            "age_count": {
                "value": 38
            }
        }
    }

    Note: there are 38 values in total, and 11 distinct values once duplicates are removed.
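    The difference between the two counts can be reproduced in a few lines. The age values below are reconstructed from the bucket counts shown in the terms aggregation results later in this article, assuming each document contributes each of its values once. Note that ES computes cardinality with an approximate algorithm (HyperLogLog++), so on large data sets its answer may deviate slightly from an exact set size.

```python
# Reconstructed from the terms aggregation buckets in this article
# (assumption: every bucket hit corresponds to exactly one stored value).
bucket_counts = {12: 16, 1: 11, 13: 2, 14: 2, 11: 1, 16: 1,
                 21: 1, 33: 1, 34: 1, 45: 1, 54: 1}
ages = [age for age, n in bucket_counts.items() for _ in range(n)]

value_count = len(ages)        # every value counts, duplicates included
cardinality = len(set(ages))   # exact distinct count

print(value_count, cardinality)
```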

    5. stats: returns five values (count, max, min, avg, sum)

    Example 1:

    POST  /book1/_search?size=0
    {
        "aggs":{
            "age_count":{
                "stats":{
                    "field":"age"
                }
                
            }
        }
    }

    Result 1:

    {
        "took": 12,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_count": {
                "count": 38,
                "min": 1,
                "max": 54,
                "avg": 12.394736842105264,
                "sum": 471
            }
        }
    }
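    For comparison, the same five numbers can be computed locally. This is only a sketch: the age values are reconstructed from the bucket counts in the terms aggregation results later in this article, which is an assumption about the underlying data.

```python
import math

# Reconstructed from the terms aggregation buckets (assumed data).
bucket_counts = {12: 16, 1: 11, 13: 2, 14: 2, 11: 1, 16: 1,
                 21: 1, 33: 1, 34: 1, 45: 1, 54: 1}
ages = [age for age, n in bucket_counts.items() for _ in range(n)]

stats = {
    "count": len(ages),
    "min": min(ages),
    "max": max(ages),
    "avg": sum(ages) / len(ages),
    "sum": sum(ages),
}
print(stats)
```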

    6. Extended stats

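    Extended statistics. Compared with stats it adds four results: sum of squares, variance, standard deviation, and the interval of the mean plus/minus two standard deviations.

    Example 1: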

     

    POST /book1/_search?size=0
    
    {
        "aggs":{
            "age_stats":{
                "extended_stats":{
                    "field":"age"
                }
                
            }
        }
    }

     

    Result 1:

    {
        "took": 8,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_stats": {
                "count": 38,
                "min": 1,
                "max": 54,
                "avg": 12.394736842105264,
                "sum": 471,
                "sum_of_squares": 11049,
                "variance": 137.13365650969527,
                "std_deviation": 11.710408041981085,
                "std_deviation_bounds": {
                    "upper": 35.81555292606743,
                    "lower": -11.026079241856905
                }
            }
        }
    }
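    The extra fields follow from count, sum, and sum_of_squares alone. The sketch below recomputes them from the three numbers in the response, assuming the population variance formula (mean of squares minus square of the mean) and bounds of two standard deviations, which is what the figures above are consistent with.

```python
import math

count, total, sum_of_squares = 38, 471, 11049   # taken from the response

avg = total / count
# Population variance: E[x^2] - (E[x])^2
variance = sum_of_squares / count - avg ** 2
std_deviation = math.sqrt(variance)

# Bounds at the mean plus/minus two standard deviations.
upper = avg + 2 * std_deviation
lower = avg - 2 * std_deviation

print(variance, std_deviation, upper, lower)
```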

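    7. Percentiles: the values found at given percentile ranks

    Example 1:

    The values of the specified field (or script) are sorted in ascending order and the cumulative share of documents is computed for each value (as a percentage of all matched documents); the aggregation returns the value at each requested percentage. By default it returns the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles. In the result below, the middle entry can be read as: 50% of the documents have age <= 12, or conversely, documents with age <= 12 account for 50% of all matched documents.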

     

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_percentiles":{
                "percentiles":{
                    "field":"age"
                }
                
            }
        }
    }

    Result 1:

    {
        "took": 16,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_percentiles": {
                "values": {
                    "1.0": 1,
                    "5.0": 1,
                    "25.0": 1,
                    "50.0": 12,
                    "75.0": 13,
                    "95.0": 40.600000000000016,
                    "99.0": 54
                }
            }
        }
    }
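    To make the idea concrete, here is a minimal nearest-rank percentile over age values reconstructed from the terms aggregation buckets later in this article (assumed data). This is not the algorithm ES uses: by default ES estimates percentiles with TDigest and interpolates, so results can differ. On this data the sketch agrees with ES at every default percentile except the 95th, where it yields 45 instead of 40.6.

```python
import math

# Reconstructed from the terms aggregation buckets (assumed data).
bucket_counts = {12: 16, 1: 11, 13: 2, 14: 2, 11: 1, 16: 1,
                 21: 1, 33: 1, 34: 1, 45: 1, 54: 1}
ages = sorted(age for age, n in bucket_counts.items() for _ in range(n))


def nearest_rank(values, percent):
    """Smallest value such that at least `percent`% of sorted values are <= it."""
    rank = max(1, math.ceil(percent / 100 * len(values)))
    return values[rank - 1]


for p in (1, 5, 25, 50, 75, 95, 99):
    print(p, nearest_rank(ages, p))
```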

    Example 2: specify the percentiles (what are the values at 50%, 96%, and 99%)

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_percentiles":{
                "percentiles":{
                    "field":"age",
                    "percents" : [50,96,99]
                }
                
            }
        }
    }

    Result 2:

    {
        "took": 6,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_percentiles": {
                "values": {
                    "50.0": 12,
                    "96.0": 44.779999999999966,
                    "99.0": 54
                }
            }
        }
    }

    Note: 50% of the values are <= 12, 96% of the values are <= 44.78, and 99% of the values are <= 54.

     

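    8. Percentile ranks: the share of documents whose value is at or below a given value

    Example 1: compute the share of documents whose age is at or below 12 and at or below 35, the inverse of item 7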

    POST /book1/_search?size=0
    {
        "aggs":{
            "aggs_perc_rank":{
                "percentile_ranks":{
                    "field":"age",
                    "values" : [12,35]
                }
                
            }
        }
    }

    Result 1:

    {
        "took": 8,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "aggs_perc_rank": {
                "values": {
                    "12.0": 71.05263157894737,
                    "35.0": 92.76315789473685
                }
            }
        }
    }

    Result notes: about 71% of the documents have an age at or below 12, and about 92.8% have an age at or below 35.
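    An exact version of this calculation is easy to write down. The sketch uses age values reconstructed from the terms aggregation buckets later in this article (assumed data). ES estimates ranks approximately, so its figures (71.05 and 92.76) are close to, but not identical with, the exact shares computed here.

```python
# Reconstructed from the terms aggregation buckets (assumed data).
bucket_counts = {12: 16, 1: 11, 13: 2, 14: 2, 11: 1, 16: 1,
                 21: 1, 33: 1, 34: 1, 45: 1, 54: 1}
ages = [age for age, n in bucket_counts.items() for _ in range(n)]


def percent_at_or_below(values, threshold):
    """Exact percentage of values that are <= threshold."""
    return 100 * sum(1 for v in values if v <= threshold) / len(values)


rank_12 = percent_at_or_below(ages, 12)
rank_35 = percent_at_or_below(ages, 35)
print(rank_12, rank_35)
```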

    9. Geo Bounds aggregation: computes the bounding box of the geo points in the document set

    See the official documentation:

    https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-geobounds-aggregation.html

    10. Geo Centroid aggregation: computes the centroid of the geo points

    See the official documentation:

    https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-geocentroid-aggregation.html

    III. Bucket Aggregations

     

     

    1. Terms Aggregation: group by the terms (values) of a field

    Example 1:

    POST /book1/_search?size=0
    
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age"
                }
            }
        }
    }

    Note: this is equivalent to group by age

    Result 1:

    {
        "took": 4,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 1,
                "buckets": [
                    {
                        "key": 12,
                        "doc_count": 16
                    },
                    {
                        "key": 1,
                        "doc_count": 11
                    },
                    {
                        "key": 13,
                        "doc_count": 2
                    },
                    {
                        "key": 14,
                        "doc_count": 2
                    },
                    {
                        "key": 11,
                        "doc_count": 1
                    },
                    {
                        "key": 16,
                        "doc_count": 1
                    },
                    {
                        "key": 21,
                        "doc_count": 1
                    },
                    {
                        "key": 33,
                        "doc_count": 1
                    },
                    {
                        "key": 34,
                        "doc_count": 1
                    },
                    {
                        "key": 45,
                        "doc_count": 1
                    }
                ]
            }
        }
    }

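    Result notes:

    "doc_count_error_upper_bound": 0: the upper bound of the error on the document counts

    "sum_other_doc_count": 1: the number of documents not returned, i.e. documents that fall outside the returned buckets

    By default, the top 10 groups are returned, ordered by document count from high to low.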
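    The relation between the returned buckets and sum_other_doc_count can be sketched locally. The age values are reconstructed from the bucket counts in this article (assumed data), the function name terms_agg is hypothetical, and ties are broken by ascending key, which is what the response above suggests.

```python
from collections import Counter

# Reconstructed from the bucket counts in this article (assumed data).
bucket_counts = {12: 16, 1: 11, 13: 2, 14: 2, 11: 1, 16: 1,
                 21: 1, 33: 1, 34: 1, 45: 1, 54: 1}
ages = [age for age, n in bucket_counts.items() for _ in range(n)]


def terms_agg(values, size=10):
    """Group by value, keep the `size` largest groups, sum up the rest."""
    counts = Counter(values)
    # Highest count first; equal counts ordered by ascending key.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    buckets = [{"key": k, "doc_count": c} for k, c in ordered[:size]]
    other = sum(c for _, c in ordered[size:])
    return {"sum_other_doc_count": other, "buckets": buckets}


top10 = terms_agg(ages)
print(top10)
```

    Calling terms_agg(ages, size=5) or terms_agg(ages, size=3) reproduces the sum_other_doc_count values 6 and 9 seen when size is set explicitly.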


    Example 2: size specifies how many groups to return

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age",
                    "size":5
                }
                
            }
        }
    }

    Result 2:

    {
        "took": 4,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 6,
                "buckets": [
                    {
                        "key": 12,
                        "doc_count": 16
                    },
                    {
                        "key": 1,
                        "doc_count": 11
                    },
                    {
                        "key": 13,
                        "doc_count": 2
                    },
                    {
                        "key": 14,
                        "doc_count": 2
                    },
                    {
                        "key": 11,
                        "doc_count": 1
                    }
                ]
            }
        }
    }

    Example 3: show the count error on each group

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age",
                    "size":5,
                     "show_term_doc_count_error": true
                }
                
            }
        }
    }

    Result 3:

    {
        "took": 5,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 6,
                "buckets": [
                    {
                        "key": 12,
                        "doc_count": 16,
                        "doc_count_error_upper_bound": 0
                    },
                    {
                        "key": 1,
                        "doc_count": 11,
                        "doc_count_error_upper_bound": 0
                    },
                    {
                        "key": 13,
                        "doc_count": 2,
                        "doc_count_error_upper_bound": 0
                    },
                    {
                        "key": 14,
                        "doc_count": 2,
                        "doc_count_error_upper_bound": 0
                    },
                    {
                        "key": 11,
                        "doc_count": 1,
                        "doc_count_error_upper_bound": 0
                    }
                ]
            }
        }
    }

    Example 4: shard_size specifies how many groups each shard returns

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age",
                    "size":3,
                     "shard_size": 20
                }
                
            }
        }
    }

    Result 4:

    {
        "took": 3,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 9,
                "buckets": [
                    {
                        "key": 12,
                        "doc_count": 16
                    },
                    {
                        "key": 1,
                        "doc_count": 11
                    },
                    {
                        "key": 13,
                        "doc_count": 2
                    }
                ]
            }
        }
    }

    order: specifies how the groups are sorted

    Example 5: sort by the group key "_key"

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age",
                    "size":3,
                     "order":{"_key":"desc"}
                }
                
            }
        }
    }

    Result 5:

    {
        "took": 6,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 35,
                "buckets": [
                    {
                        "key": 54,
                        "doc_count": 1
                    },
                    {
                        "key": 45,
                        "doc_count": 1
                    },
                    {
                        "key": 34,
                        "doc_count": 1
                    }
                ]
            }
        }
    }

    Example 6: sort by the document count "_count"

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age",
                    "size":3,
                     "order":{"_count":"desc"}
                }
                
            }
        }
    }

    Result 6:

    {
        "took": 91,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 9,
                "buckets": [
                    {
                        "key": 12,
                        "doc_count": 16
                    },
                    {
                        "key": 1,
                        "doc_count": 11
                    },
                    {
                        "key": 13,
                        "doc_count": 2
                    }
                ]
            }
        }
    }
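    The effect of the two orderings can be imitated locally. This sketch again uses age values reconstructed from the bucket counts in this article (assumed data); the function ordered_terms and its parameters are hypothetical names for illustration.

```python
from collections import Counter

# Reconstructed from the bucket counts in this article (assumed data).
bucket_counts = {12: 16, 1: 11, 13: 2, 14: 2, 11: 1, 16: 1,
                 21: 1, 33: 1, 34: 1, 45: 1, 54: 1}
ages = [age for age, n in bucket_counts.items() for _ in range(n)]


def ordered_terms(values, size, order_field, direction):
    """Group by value, sort by "_key" or "_count", keep `size` groups."""
    counts = Counter(values)
    descending = direction == "desc"
    if order_field == "_key":
        ordered = sorted(counts.items(), key=lambda kv: kv[0], reverse=descending)
    else:
        # "_count": equal counts are ordered by ascending key (assumption).
        sign = -1 if descending else 1
        ordered = sorted(counts.items(), key=lambda kv: (sign * kv[1], kv[0]))
    return {
        "sum_other_doc_count": sum(c for _, c in ordered[size:]),
        "buckets": [{"key": k, "doc_count": c} for k, c in ordered[:size]],
    }


by_key = ordered_terms(ages, 3, "_key", "desc")
by_count = ordered_terms(ages, 3, "_count", "desc")
print(by_key)
print(by_count)
```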

    Example 7: sort by a metric computed on each group

    POST /book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "terms":{
                    "field":"age",
                     "order":{"max_age":"desc"}
                },
                "aggs":{
                    "max_age":{
                        "max":{
                            "field":"age"
                        }
                    },
                    "min_age":{
                        "min":{
                            "field":"age"
                        }
                    }
                }
                
            }
        
            
        }
    }

    Note: group by age first, then compute the max and min of each group, and finally sort by the max in descending order
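    The three steps can be sketched as follows. The documents are hypothetical, each with a multi-valued age field as in the book1 index; the sketch assumes a document lands in one bucket per distinct age value, that the sub-aggregations see all age values of the documents in a bucket, and that ties on max_age are broken by ascending key.

```python
from collections import defaultdict

# Hypothetical documents; "age" may hold several values.
docs = [{"age": [14, 54]}, {"age": [11, 14]}, {"age": [12]}, {"age": [12]}]

# Step 1: group documents by each distinct age value they contain.
groups = defaultdict(list)
for doc in docs:
    for key in set(doc["age"]):
        groups[key].append(doc)

# Step 2: compute max and min over all age values of each group's documents.
buckets = []
for key, members in groups.items():
    values = [v for d in members for v in d["age"]]
    buckets.append({"key": key, "doc_count": len(members),
                    "max_age": max(values), "min_age": min(values)})

# Step 3: order the buckets by max_age, descending.
buckets.sort(key=lambda b: (-b["max_age"], b["key"]))
print(buckets)
```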

    Example 8: filter groups with a regular expression

    POST book1/_search?size=0
    {
        "aggs":{
            "tags":{
                "terms":{
                    "field":"name",
                    "include":"里*",
                    "exclude":"test*"
                }
                
            }
        
            
        }
    }

    Result 8:

    {
        "took": 22,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "tags": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "",
                        "doc_count": 13
                    }
                ]
            }
        }
    }

    Example 9: filter groups with an explicit list of values

    POST book1/_search?size=0
    {
        "aggs":{
            "Chinese":{
                "terms":{
                    "field":"name",
                    "include":["",""]
                }
                
            },
            "Test":{
                "terms":{
                    "field":"name",
                    "exclude":["test","the"]
                }
            }
        
            
        }
    }

    Result 9:

    {
        "took": 23,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "Test": {
                "doc_count_error_upper_bound": 6,
                "sum_other_doc_count": 559,
                "buckets": [
                    {
                        "key": "",
                        "doc_count": 12
                    },
                    {
                        "key": "",
                        "doc_count": 11
                    },
                    {
                        "key": "a",
                        "doc_count": 7
                    },
                    {
                        "key": "default",
                        "doc_count": 7
                    },
                    {
                        "key": "document",
                        "doc_count": 7
                    },
                    {
                        "key": "for",
                        "doc_count": 7
                    },
                    {
                        "key": "absolute",
                        "doc_count": 6
                    },
                    {
                        "key": "account",
                        "doc_count": 6
                    },
                    {
                        "key": "accurate",
                        "doc_count": 6
                    },
                    {
                        "key": "documents",
                        "doc_count": 6
                    }
                ]
            },
            "Chinese": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "",
                        "doc_count": 4
                    }
                ]
            }
        }
    }

    Example 10: group by a value computed by a script

    POST book1/_search?size=0
    {
        "aggs":{
            "name":{
                "terms":{
                    "script":{
                        "source":"doc['age'].value + doc.age.value",
                        "lang": "painless"
                    }
                }
             }   
         }
    }

    Note: in a script, a field value can be read with doc['age'].value or doc.age.value

    Result 10:

    {
        "took": 18,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "name": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "24",
                        "doc_count": 16
                    },
                    {
                        "key": "2",
                        "doc_count": 11
                    },
                    {
                        "key": "0",
                        "doc_count": 8
                    },
                    {
                        "key": "22",
                        "doc_count": 1
                    },
                    {
                        "key": "26",
                        "doc_count": 1
                    },
                    {
                        "key": "28",
                        "doc_count": 1
                    },
                    {
                        "key": "32",
                        "doc_count": 1
                    },
                    {
                        "key": "42",
                        "doc_count": 1
                    },
                    {
                        "key": "66",
                        "doc_count": 1
                    }
                ]
            }
        }
    }

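    2. Filter Aggregation: aggregate over the documents that match a filter query

    Example 1: from the documents hit by the query, select those that satisfy the filter and aggregate over them, that is, filter first and then aggregate (contrast with Example 9 above, group filtering, which aggregates first and then filters)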

    POST book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "filter":{
                    "match":{"name":"test"}
                },
                "aggs":{
                    "avg_age":{
                        "avg":{"field":"age"}
                    }
                }
            }
        }
    }

    Result 1:

    {
        "took": 152,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "doc_count": 5,
                "avg_age": {
                    "value": 19.9
                }
            }
        }
    }
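    A local imitation of "filter first, then aggregate" might look like this. The first two documents are taken from Result 2 of the metric section; the third is hypothetical. Matching on a whitespace-separated token is a simplification of what the match query does on an analyzed text field.

```python
docs = [
    {"name": "test goog my money", "age": [14, 54, 45, 34]},
    {"name": "test goog my money", "age": [11, 13, 14]},
    {"name": "another title", "age": [12]},   # hypothetical document
]

# Step 1: keep only the documents whose name contains the token "test".
matching = [d for d in docs if "test" in d["name"].split()]

# Step 2: average over every age value of the remaining documents.
values = [v for d in matching for v in d["age"]]
result = {
    "doc_count": len(matching),
    "avg_age": {"value": sum(values) / len(values)},
}
print(result)
```

    The average is taken over values rather than documents, which is why a bucket of 5 documents in Result 1 can produce an average that is not a multiple of 1/5.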

    3. Filters Aggregation: aggregate over several filter groups

    Example 1: count the documents containing 'test' and the documents containing '里'

    POST book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "filters":{
                    "filters":{
                        "test":{
                            "match":{"name":"test"}
                        },
                        "china":{
                            "match":{"name":""}
                        }    
                    }
                }
            }
        }
    }

    Result 1:

    {
        "took": 3,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "buckets": {
                    "china": {
                        "doc_count": 13
                    },
                    "test": {
                        "doc_count": 5
                    }
                }
            }
        }
    }

    For example, count the error and warning entries in logs to drive log alerting:

    GET logs/_search
    {
      "size": 0,
      "aggs": {
        "messages": {
          "filters": {
            "filters": {
              "errors": {
                "match": {
                  "body": "error"
                }
              },
              "warnings": {
                "match": {
                  "body": "warning"
                }
              }
            }
          }
        }
      }
    }

    Example 2: specify a key for the bucket that collects all other documents

    POST book1/_search?size=0
    {
        "aggs":{
            "age_terms":{
                "filters":{
                    "other_bucket_key": "other_messages",
                    "filters":{
                        "test":{
                            "match":{"name":"test"}
                        },
                        "china":{
                            "match":{"name":""}
                        }    
                    }
                }
            }
        }
    }

    Result 2:

    {
        "took": 9,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_terms": {
                "buckets": {
                    "china": {
                        "doc_count": 13
                    },
                    "test": {
                        "doc_count": 5
                    },
                    "other_messages": {
                        "doc_count": 23
                    }
                }
            }
        }
    }
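    In Result 2 the three buckets add up to the total hit count (13 + 5 + 23 = 41), which is what one expects when no document matches more than one filter. The sketch below imitates the named filters and the other bucket on a few hypothetical documents; the predicates are simplified stand-ins for match queries.

```python
# Hypothetical documents for illustration only.
docs = [
    {"name": "test one"},
    {"name": "里 book"},
    {"name": "plain"},
    {"name": "test two"},
]

# Named filters, mirroring the "filters" object of the request.
filters = {
    "test": lambda d: "test" in d["name"].split(),
    "china": lambda d: "里" in d["name"],
}

buckets = {key: sum(1 for d in docs if pred(d)) for key, pred in filters.items()}

# The other bucket collects documents that match none of the filters.
buckets["other_messages"] = sum(
    1 for d in docs if not any(pred(d) for pred in filters.values())
)
print(buckets)
```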

    4. Range Aggregation: group by ranges

    Example 1:

    POST book1/_search?size=0
    
    {
        "aggs":{
            "age_range":{
                "range":{
                    "field":"age",
                    "keyed":true,
                    "ranges":[
                        {
                            "to":20,
                            "key":"TW"
                        },
                        {
                            "from":25,
                            "to":40,
                            "key":"TH"
                        },
                        {
                            "from":60,
                            "key":"SIX"
                        }
                    ]
                }
            }
        }
    }

    Result 1:

    {
        "took": 3,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "age_range": {
                "buckets": {
                    "TW": {
                        "to": 20,
                        "doc_count": 31
                    },
                    "TH": {
                        "from": 25,
                        "to": 40,
                        "doc_count": 2
                    },
                    "SIX": {
                        "from": 60,
                        "doc_count": 0
                    }
                }
            }
        }
    }
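
    In a `range` aggregation each range includes its `from` value and excludes its `to` value, and values that fall between the declared ranges are counted nowhere. The following is a small Python sketch of that rule, assuming a hypothetical `range_agg` helper and made-up ages; it is not how Elasticsearch implements the aggregation.

```python
from typing import Dict, Iterable, List


def range_agg(values: Iterable[float], ranges: List[dict]) -> Dict[str, dict]:
    """Keyed range buckets: `from` is inclusive, `to` is exclusive."""
    buckets: Dict[str, dict] = {}
    for r in ranges:
        bucket = {k: r[k] for k in ("from", "to") if k in r}
        bucket["doc_count"] = 0
        buckets[r["key"]] = bucket
    for value in values:
        for r in ranges:
            lower_ok = "from" not in r or value >= r["from"]
            upper_ok = "to" not in r or value < r["to"]
            if lower_ok and upper_ok:
                buckets[r["key"]]["doc_count"] += 1
    return buckets


# Hypothetical ages; 20 and 40 sit exactly on exclusive upper bounds.
ages = [18, 19, 20, 25, 39, 40, 60]
ranges = [
    {"to": 20, "key": "TW"},
    {"from": 25, "to": 40, "key": "TH"},
    {"from": 60, "key": "SIX"},
]
range_result = range_agg(ages, ranges)
print(range_result)
```

    With these ranges, ages 20 through 24 and 40 through 59 belong to no bucket, which is why the bucket counts in the result above (31 + 2 + 0) are lower than the 41 hits.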

    5. Date Range Aggregation (bucketing by date range)

    Example 1:

    POST /bank/_search?size=0
    {
      "aggs": {
        "range": {
          "date_range": {
            "field": "date",
            "format": "MM-yyyy",
            "ranges": [
              {
                "to": "now-10M/M"
              },
              {
                "from": "now-10M/M"
              }
            ]
          }
        }
      }
    }

    Result 1:

    {
      "took": 115,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1000,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "range": {
          "buckets": [
            {
              "key": "*-2017-08-01T00:00:00.000Z",
              "to": 1501545600000,
              "to_as_string": "2017-08-01T00:00:00.000Z",
              "doc_count": 0
            },
            {
              "key": "2017-08-01T00:00:00.000Z-*",
              "from": 1501545600000,
              "from_as_string": "2017-08-01T00:00:00.000Z",
              "doc_count": 0
            }
          ]
        }
      }
    }
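
    The date math expression `now-10M/M` means: take the current time, subtract 10 months, then round down to the start of that month. The sketch below reproduces that arithmetic in Python. The helper name and the value of `now` are assumptions, chosen so that the outcome matches the `2017-08-01T00:00:00.000Z` boundary shown in the result above; the time at which the original request actually ran is not known.

```python
from datetime import datetime, timezone


def minus_months_floor_month(now: datetime, months: int) -> datetime:
    """Evaluate `now-<months>M/M`: subtract calendar months, round down to month start (UTC)."""
    total = now.year * 12 + (now.month - 1) - months
    year, month_index = divmod(total, 12)
    return datetime(year, month_index + 1, 1, tzinfo=timezone.utc)


# Hypothetical "now"; any instant in June 2018 gives the same boundary.
now = datetime(2018, 6, 15, 9, 30, tzinfo=timezone.utc)
boundary = minus_months_floor_month(now, 10)
epoch_millis = int(boundary.timestamp() * 1000)
print(boundary.isoformat(), epoch_millis)
```

    Both ranges in the request share this boundary: the first bucket holds everything before it, the second everything from it onward.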

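    6. Date Histogram Aggregation (bucketing by time interval)

    This aggregation groups documents into buckets by day, month, year, and so on. The interval can be year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), minute (1m), or second (1s), or a custom time interval.

    Example 1: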

    POST /bank/_search?size=0
    {
      "aggs": {
        "sales_over_time": {
          "date_histogram": {
            "field": "date",
            "interval": "month"
          }
        }
      }
    }

    Result 1:

    {
      "took": 9,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 1000,
        "max_score": 0,
        "hits": []
      },
      "aggregations": {
        "sales_over_time": {
          "buckets": []
        }
      }
    }
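
    The `buckets` array above is empty, which is what one would expect when no document in the index has a value in the `date` field. To show what monthly bucketing produces when data is present, here is a minimal Python sketch. The `monthly_histogram` helper and the timestamps are hypothetical, and the sketch only emits months that contain documents; it does not fill gaps with empty buckets.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, List


def monthly_histogram(timestamps: Iterable[datetime]) -> List[dict]:
    """Group timestamps by the first instant of their month (UTC)."""
    counts = Counter(
        datetime(ts.year, ts.month, 1, tzinfo=timezone.utc) for ts in timestamps
    )
    return [
        {
            "key_as_string": start.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
            "key": int(start.timestamp() * 1000),  # epoch milliseconds
            "doc_count": count,
        }
        for start, count in sorted(counts.items())
    ]


# Hypothetical document dates.
histogram = monthly_histogram(
    [
        datetime(2018, 1, 5, tzinfo=timezone.utc),
        datetime(2018, 1, 20, tzinfo=timezone.utc),
        datetime(2018, 3, 2, tzinfo=timezone.utc),
    ]
)
for bucket in histogram:
    print(bucket)
```

    Note that Elasticsearch 7.2 and later deprecate the `interval` parameter in favor of `calendar_interval` and `fixed_interval`; the request above matches the older version that produced these results.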

    7. Missing Aggregation (bucket of documents missing a value)

    Example: count the documents that have no value for a field

    POST /book/_search?size=0
    {
        "aggs" : {
            "account_without_age" : {
                "missing" : { "field" : "age" }
            }
        }
    }

    Result:

    {
        "took": 10,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "skipped": 0,
            "failed": 0
        },
        "hits": {
            "total": 41,
            "max_score": 0,
            "hits": []
        },
        "aggregations": {
            "account_without_age": {
                "doc_count": 8
            }
        }
    }
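
    The `missing` aggregation produces a single bucket holding the documents that have no value for the field, which includes documents where the field is explicitly null. A rough Python equivalent, using a hypothetical `missing_agg` helper and made-up documents:

```python
from typing import Iterable


def missing_agg(docs: Iterable[dict], field: str) -> dict:
    """Count documents where `field` is absent or null."""
    doc_count = sum(1 for doc in docs if doc.get(field) is None)
    return {"doc_count": doc_count}


# Hypothetical documents: an age of 0 is still a value, so it is not "missing".
people = [
    {"name": "a", "age": 30},
    {"name": "b"},
    {"name": "c", "age": None},
    {"name": "d", "age": 0},
]
missing_result = missing_agg(people, "age")
print(missing_result)
```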

    8. Geo Distance Aggregation (bucketing by geographic distance)

    See the official documentation:

    https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geodistance-aggregation.html
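
    For orientation, a `geo_distance` aggregation takes a `geo_point` field, an origin, and a list of distance ranges, with distances in meters by default. The Python sketch below only builds such a request body. The helper name, the aggregation name `rings`, the field name `location`, and the coordinates are assumptions for illustration; nothing here comes from the indices used earlier in this article.

```python
import json
from typing import List


def geo_distance_request(field: str, origin: dict, boundaries_m: List[float]) -> dict:
    """Build concentric distance rings around `origin` from sorted boundaries in meters."""
    ranges = []
    previous = None
    for boundary in boundaries_m:
        ring = {"to": boundary}
        if previous is not None:
            ring["from"] = previous
        ranges.append(ring)
        previous = boundary
    if previous is not None:
        ranges.append({"from": previous})  # open-ended outer ring
    return {
        "size": 0,
        "aggs": {
            "rings": {  # hypothetical aggregation name
                "geo_distance": {"field": field, "origin": origin, "ranges": ranges}
            }
        },
    }


# Hypothetical field name and origin.
geo_body = geo_distance_request("location", {"lat": 52.376, "lon": 4.894}, [100000, 300000])
print(json.dumps(geo_body, indent=2))
```

    The printed JSON can be sent as the body of a `_search` request against an index whose mapping declares the chosen field as `geo_point`.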

  • Original article: https://www.cnblogs.com/wangzhuxing/p/9581947.html