  • 011 - Elasticsearch 5.4.3 [4] - Aggregations [2] - Bucket aggregations: filter, nested, reverse nested, grouping, ordering, range

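    1. Overview

      Bucketing aggregations define a set of "buckets" and assign each document to the buckets it belongs to. This is very similar to the GROUP BY clause in SQL.

      A metric aggregation can be applied to the whole data set, or it can act as a sub-aggregation of a bucketing aggregation and be applied to the documents in each bucket. The whole data set can itself be seen as one big bucket that holds every document.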
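    The relationship between buckets and metrics can be sketched in plain Java, without any Elasticsearch API. The Person class, its fields, and the sample values are hypothetical and exist only for this illustration; the sketch shows the idea, not how Elasticsearch computes it internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BucketSketch {
    // Hypothetical document type, used only for this illustration.
    static final class Person {
        final String gender;
        final double height;

        Person(String gender, double height) {
            this.gender = gender;
            this.height = height;
        }
    }

    // Bucketing: one bucket per distinct gender, each holding its document count.
    static Map<String, Long> docCountByGender(List<Person> docs) {
        Map<String, Long> buckets = new TreeMap<>();
        for (Person p : docs) {
            buckets.merge(p.gender, 1L, Long::sum);
        }
        return buckets;
    }

    // Metric as a sub-aggregation: the average height inside each bucket.
    static Map<String, Double> avgHeightByGender(List<Person> docs) {
        Map<String, Double> sums = new TreeMap<>();
        for (Person p : docs) {
            sums.merge(p.gender, p.height, Double::sum);
        }
        Map<String, Long> counts = docCountByGender(docs);
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, Double> e : sums.entrySet()) {
            averages.put(e.getKey(), e.getValue() / counts.get(e.getKey()));
        }
        return averages;
    }

    // Metric over the whole data set, i.e. one big bucket that holds everything.
    static double avgHeight(List<Person> docs) {
        double sum = 0;
        for (Person p : docs) {
            sum += p.height;
        }
        return docs.isEmpty() ? Double.NaN : sum / docs.size();
    }

    // Made-up sample data.
    static List<Person> sampleDocs() {
        return Arrays.asList(
                new Person("male", 1.80),
                new Person("female", 1.60),
                new Person("male", 1.70));
    }

    public static void main(String[] args) {
        List<Person> docs = sampleDocs();
        System.out.println(docCountByGender(docs)); // prints {female=1, male=2}
        System.out.println(avgHeightByGender(docs));
        System.out.println(avgHeight(docs));
    }
}
```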

    1.1 Global aggregation

    AggregationBuilders
        .global("agg")
        .subAggregation(AggregationBuilders.terms("genders").field("gender"));

    Usage

    import org.elasticsearch.search.aggregations.bucket.global.Global;
    // sr is your SearchResponse object
    Global agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.2 Filter aggregation

    AggregationBuilders
        .filter("agg", QueryBuilders.termQuery("gender", "male"));

    Usage

    import org.elasticsearch.search.aggregations.bucket.filter.Filter;
    // sr is your SearchResponse object
    Filter agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.3 Filters aggregation (similar to a grouping aggregation, but only for the groups you select)

    AggregationBuilder aggregation =
        AggregationBuilders
            .filters("agg",
                new FiltersAggregator.KeyedFilter("men", QueryBuilders.termQuery("gender", "male")),
                new FiltersAggregator.KeyedFilter("women", QueryBuilders.termQuery("gender", "female")));

    Usage

    import org.elasticsearch.search.aggregations.bucket.filters.Filters;
    // sr is your SearchResponse object
    Filters agg = sr.getAggregations().get("agg");
    
    // For each entry
    for (Filters.Bucket entry : agg.getBuckets()) {
        String key = entry.getKeyAsString();            // bucket key
        long docCount = entry.getDocCount();            // Doc count
        logger.info("key [{}], doc_count [{}]", key, docCount);
    }

    Result

    key [men], doc_count [4982]
    key [women], doc_count [5018]
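    What the keyed filters compute can be sketched in plain Java as a set of named predicates, each counting the documents it matches. This is only an illustration under simplifying assumptions: documents are reduced to their gender value, and the class, method names, and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FiltersSketch {
    // For every named filter, count how many documents match it.
    // A document may land in several buckets, or in none at all.
    static Map<String, Long> countPerFilter(List<String> genders,
                                            Map<String, Predicate<String>> filters) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Map.Entry<String, Predicate<String>> filter : filters.entrySet()) {
            long count = genders.stream().filter(filter.getValue()).count();
            buckets.put(filter.getKey(), count);
        }
        return buckets;
    }

    // Made-up sample: four documents, one of which matches neither filter.
    static Map<String, Long> demo() {
        Map<String, Predicate<String>> filters = new LinkedHashMap<>();
        filters.put("men", g -> "male".equals(g));
        filters.put("women", g -> "female".equals(g));
        List<String> genders = Arrays.asList("male", "female", "female", "unknown");
        return countPerFilter(genders, filters);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {men=1, women=2}
    }
}
```

    Unlike a terms grouping, no bucket is created for values that were not asked for, which is why "unknown" does not appear in the output.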

    1.4 Missing aggregation

    AggregationBuilders.missing("agg").field("gender");

    Usage

    import org.elasticsearch.search.aggregations.bucket.missing.Missing;
    // sr is your SearchResponse object
    Missing agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.5 Nested aggregation

    AggregationBuilders.nested("agg", "resellers");

    Usage

    import org.elasticsearch.search.aggregations.bucket.nested.Nested;
    // sr is your SearchResponse object
    Nested agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.6 Reverse nested aggregation

    AggregationBuilder aggregation =
        AggregationBuilders
            .nested("agg", "resellers")
            .subAggregation(
                    AggregationBuilders
                            .terms("name").field("resellers.name")
                            .subAggregation(
                                    AggregationBuilders
                                            .reverseNested("reseller_to_product")
                            )
            );

    Usage

    import org.elasticsearch.search.aggregations.bucket.nested.Nested;
    import org.elasticsearch.search.aggregations.bucket.nested.ReverseNested;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;
    // sr is your SearchResponse object
    Nested agg = sr.getAggregations().get("agg");
    Terms name = agg.getAggregations().get("name");
    for (Terms.Bucket bucket : name.getBuckets()) {
        ReverseNested resellerToProduct = bucket.getAggregations().get("reseller_to_product");
        resellerToProduct.getDocCount(); // Doc count
    }
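    The difference between the doc count of a terms bucket inside the nested aggregation and the doc count of its reverse nested sub-aggregation can be sketched in plain Java. The Product class and the sample data are hypothetical; the sketch only mirrors the counting semantics (nested entries versus parent documents) and does not use the Elasticsearch API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class ReverseNestedSketch {
    // Hypothetical parent document: a product holding nested reseller names.
    static final class Product {
        final String name;
        final List<String> resellerNames;

        Product(String name, String... resellerNames) {
            this.name = name;
            this.resellerNames = Arrays.asList(resellerNames);
        }
    }

    // nested + terms: count the nested reseller entries per reseller name.
    static Map<String, Long> nestedDocCount(List<Product> products) {
        Map<String, Long> counts = new TreeMap<>();
        for (Product p : products) {
            for (String reseller : p.resellerNames) {
                counts.merge(reseller, 1L, Long::sum);
            }
        }
        return counts;
    }

    // reverse nested inside each terms bucket: count the distinct parent
    // products that contain at least one reseller entry with that name.
    static Map<String, Long> parentDocCount(List<Product> products) {
        Map<String, Set<Product>> parents = new TreeMap<>();
        for (Product p : products) {
            for (String reseller : p.resellerNames) {
                parents.computeIfAbsent(reseller, k -> new HashSet<>()).add(p);
            }
        }
        Map<String, Long> counts = new TreeMap<>();
        for (Map.Entry<String, Set<Product>> e : parents.entrySet()) {
            counts.put(e.getKey(), (long) e.getValue().size());
        }
        return counts;
    }

    // Made-up sample: "laptop" lists the reseller "acme" twice.
    static List<Product> sampleProducts() {
        return Arrays.asList(
                new Product("laptop", "acme", "acme", "bolt"),
                new Product("phone", "acme"));
    }

    public static void main(String[] args) {
        System.out.println(nestedDocCount(sampleProducts())); // prints {acme=3, bolt=1}
        System.out.println(parentDocCount(sampleProducts())); // prints {acme=2, bolt=1}
    }
}
```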

    1.7 Children aggregation

    AggregationBuilder aggregation = AggregationBuilders.children("agg", "reseller");

    Usage

    import org.elasticsearch.search.aggregations.bucket.children.Children;
    // sr is your SearchResponse object
    Children agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.8 Terms aggregation (group by a field)

    AggregationBuilders.terms("genders").field("gender");

    Usage

    import org.elasticsearch.search.aggregations.bucket.terms.Terms;
    // sr is your SearchResponse object
    Terms genders = sr.getAggregations().get("genders");
    
    // For each entry
    for (Terms.Bucket entry : genders.getBuckets()) {
        entry.getKey();      // Term
        entry.getDocCount(); // Doc count
    }

    1.9 Ordering

    Order the buckets by doc_count, ascending:

    AggregationBuilders
        .terms("genders")
        .field("gender")
        .order(Terms.Order.count(true));

    Order the buckets alphabetically by term, ascending:

    AggregationBuilders
        .terms("genders")
        .field("gender")
        .order(Terms.Order.term(true));

    Order the buckets by a single-value metric sub-aggregation (identified by the aggregation name):

    AggregationBuilders
        .terms("genders")
        .field("gender")
        .order(Terms.Order.aggregation("avg_height", false))
        .subAggregation(
            AggregationBuilders.avg("avg_height").field("height")
        );
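    The three orderings correspond to three ways of comparing buckets. The following plain-Java sketch expresses each as a Comparator; the Bucket class, the constant names, and the sample figures are hypothetical and are not part of the Elasticsearch API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class BucketOrderSketch {
    // Hypothetical bucket: a term, its doc count, and one metric value.
    static final class Bucket {
        final String key;
        final long docCount;
        final double avgHeight;

        Bucket(String key, long docCount, double avgHeight) {
            this.key = key;
            this.docCount = docCount;
            this.avgHeight = avgHeight;
        }
    }

    // Counterpart of Terms.Order.count(true): ascending doc_count.
    static final Comparator<Bucket> COUNT_ASC =
            Comparator.comparingLong((Bucket b) -> b.docCount);

    // Counterpart of Terms.Order.term(true): ascending by term.
    static final Comparator<Bucket> TERM_ASC =
            Comparator.comparing((Bucket b) -> b.key);

    // Counterpart of Terms.Order.aggregation("avg_height", false):
    // descending by the value of the metric sub-aggregation.
    static final Comparator<Bucket> AVG_HEIGHT_DESC =
            Comparator.comparingDouble((Bucket b) -> b.avgHeight).reversed();

    // Return the bucket keys in the requested order.
    static List<String> orderedKeys(List<Bucket> buckets, Comparator<Bucket> order) {
        return buckets.stream().sorted(order).map(b -> b.key).collect(Collectors.toList());
    }

    // Made-up sample buckets.
    static List<Bucket> sampleBuckets() {
        return Arrays.asList(
                new Bucket("male", 2, 1.76),
                new Bucket("female", 3, 1.62),
                new Bucket("other", 1, 1.70));
    }

    public static void main(String[] args) {
        List<Bucket> buckets = sampleBuckets();
        System.out.println(orderedKeys(buckets, COUNT_ASC));       // prints [other, male, female]
        System.out.println(orderedKeys(buckets, TERM_ASC));        // prints [female, male, other]
        System.out.println(orderedKeys(buckets, AVG_HEIGHT_DESC)); // prints [male, other, female]
    }
}
```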

    1.10 Range aggregation

    AggregationBuilder aggregation =
            AggregationBuilders
                    .range("agg")
                    .field("height")
                    .addUnboundedTo(1.0f)               // from -infinity to 1.0 (excluded)
                    .addRange(1.0f, 1.5f)               // from 1.0 to 1.5 (excluded)
                    .addUnboundedFrom(1.5f);            // from 1.5 to +infinity

    Usage

    import org.elasticsearch.search.aggregations.bucket.range.Range;
    // sr is your SearchResponse object
    Range agg = sr.getAggregations().get("agg");
    
    // For each entry
    for (Range.Bucket entry : agg.getBuckets()) {
        String key = entry.getKeyAsString();             // Range as key
        Number from = (Number) entry.getFrom();          // Bucket from
        Number to = (Number) entry.getTo();              // Bucket to
        long docCount = entry.getDocCount();    // Doc count
    
        logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, from, to, docCount);
    }

    Result

    key [*-1.0], from [-Infinity], to [1.0], doc_count [9]
    key [1.0-1.5], from [1.0], to [1.5], doc_count [21]
    key [1.5-*], from [1.5], to [Infinity], doc_count [20]
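    The boundary rule behind these buckets (lower bound included, upper bound excluded) can be sketched in plain Java. The Range class here is a hypothetical stand-in, not the Elasticsearch Range interface, and the sample heights are made up to hit the boundaries.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeSketch {
    // Hypothetical half-open range [from, to).
    static final class Range {
        final double from; // inclusive
        final double to;   // exclusive

        Range(double from, double to) {
            this.from = from;
            this.to = to;
        }

        boolean contains(double value) {
            return value >= from && value < to;
        }

        // Build a key such as "*-1.0", "1.0-1.5" or "1.5-*".
        String key() {
            String lower = Double.isInfinite(from) ? "*" : String.valueOf(from);
            String upper = Double.isInfinite(to) ? "*" : String.valueOf(to);
            return lower + "-" + upper;
        }
    }

    // The same three ranges as in the aggregation above.
    static final List<Range> RANGES = Arrays.asList(
            new Range(Double.NEGATIVE_INFINITY, 1.0),
            new Range(1.0, 1.5),
            new Range(1.5, Double.POSITIVE_INFINITY));

    // Count how many values fall into each range.
    static Map<String, Long> countPerRange(double[] values) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Range range : RANGES) {
            long count = 0;
            for (double value : values) {
                if (range.contains(value)) {
                    count++;
                }
            }
            buckets.put(range.key(), count);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // 1.0 lands in "1.0-1.5" and 1.5 lands in "1.5-*".
        double[] heights = {0.9, 1.0, 1.2, 1.5, 1.8};
        System.out.println(countPerRange(heights)); // prints {*-1.0=1, 1.0-1.5=2, 1.5-*=2}
    }
}
```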

    1.11 Date range aggregation

    AggregationBuilder aggregation =
            AggregationBuilders
                    .dateRange("agg")
                    .field("dateOfBirth")
                    .format("yyyy")
                    .addUnboundedTo("1950")    // from -infinity to 1950 (excluded)
                    .addRange("1950", "1960")  // from 1950 to 1960 (excluded)
                    .addUnboundedFrom("1960"); // from 1960 to +infinity

    Usage

    import org.elasticsearch.search.aggregations.bucket.range.Range;
    import org.joda.time.DateTime;
    // sr is your SearchResponse object
    Range agg = sr.getAggregations().get("agg");
    
    // For each entry
    for (Range.Bucket entry : agg.getBuckets()) {
        String key = entry.getKeyAsString();                // Date range as key
        DateTime fromAsDate = (DateTime) entry.getFrom();   // Date bucket from as a Date
        DateTime toAsDate = (DateTime) entry.getTo();       // Date bucket to as a Date
        long docCount = entry.getDocCount();                // Doc count
    
        logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, fromAsDate, toAsDate, docCount);
    }

    Result

    key [*-1950], from [null], to [1950-01-01T00:00:00.000Z], doc_count [8]
    key [1950-1960], from [1950-01-01T00:00:00.000Z], to [1960-01-01T00:00:00.000Z], doc_count [5]
    key [1960-*], from [1960-01-01T00:00:00.000Z], to [null], doc_count [37]
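    As the result shows, a bound such as "1950" given with the format "yyyy" resolves to the first instant of that year. A plain-Java sketch of the bucket assignment, using java.time rather than the Joda types returned by Elasticsearch, might look like this; the class and method names are hypothetical and UTC is assumed.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class DateRangeSketch {
    // Resolve a "yyyy" bound to the first instant of that year (UTC assumed).
    static ZonedDateTime startOfYear(String year) {
        return LocalDate.of(Integer.parseInt(year), 1, 1).atStartOfDay(ZoneOffset.UTC);
    }

    // Pick the bucket key for a date of birth; upper bounds are exclusive.
    static String bucketFor(ZonedDateTime dateOfBirth) {
        ZonedDateTime y1950 = startOfYear("1950");
        ZonedDateTime y1960 = startOfYear("1960");
        if (dateOfBirth.isBefore(y1950)) {
            return "*-1950";
        }
        if (dateOfBirth.isBefore(y1960)) {
            return "1950-1960";
        }
        return "1960-*";
    }

    public static void main(String[] args) {
        ZonedDateTime born = ZonedDateTime.of(1955, 6, 15, 0, 0, 0, 0, ZoneOffset.UTC);
        System.out.println(bucketFor(born)); // prints 1950-1960
    }
}
```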

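    More bucket aggregations (significantTerms, IP range, histogram, date histogram, geo distance, and others) are covered in the Elasticsearch Java API documentation.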

  • Original article: https://www.cnblogs.com/bjlhx/p/8514127.html