  • 011-elasticsearch5.4.3 [4] - Aggregations [2] - Bucket aggregations: filter, nested, reverse nested, grouping, ordering, range

    1. Overview

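      Bucketing aggregations divide the data into separate "buckets" and assign each document to the buckets it matches. This is very similar to GROUP BY in SQL.

      A metric can be applied to the whole data set, or it can run as a sub-aggregation of a bucketing aggregation, on the documents in each bucket. The whole data set can itself be viewed as one big bucket that holds every document.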
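
To make the bucket and metric relationship concrete, here is a minimal plain-Java sketch that does not use Elasticsearch at all. The `Person` type, the sample data, and the class name `BucketSketch` are assumptions made up for this illustration. It only models the idea: group documents into buckets, then compute a metric per bucket or over the whole set.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java model of "bucket + metric"; this is not Elasticsearch code.
class BucketSketch {

    // Hypothetical document type used only for this illustration.
    static final class Person {
        final String gender;
        final double height;

        Person(String gender, double height) {
            this.gender = gender;
            this.height = height;
        }
    }

    // Bucketing: one bucket per distinct gender, holding its doc count.
    static Map<String, Long> docCountByGender(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(p -> p.gender, TreeMap::new,
                        Collectors.counting()));
    }

    // Metric as a sub-aggregation: average height inside each bucket.
    static Map<String, Double> avgHeightByGender(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(p -> p.gender, TreeMap::new,
                        Collectors.averagingDouble(p -> p.height)));
    }

    // Metric over the whole data set, i.e. over one big bucket.
    static double avgHeight(List<Person> people) {
        return people.stream().mapToDouble(p -> p.height).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("male", 1.80),
                new Person("female", 1.60),
                new Person("male", 1.70));
        System.out.println(docCountByGender(people));
        System.out.println(avgHeightByGender(people));
        System.out.println(avgHeight(people));
    }
}
```

The grouping step plays the role of GROUP BY, and the per-group average plays the role of a metric sub-aggregation.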

    1.1 Global aggregation

    AggregationBuilders
        .global("agg")
        .subAggregation(AggregationBuilders.terms("genders").field("gender"));

    Usage

    import org.elasticsearch.search.aggregations.bucket.global.Global;
    // sr is your SearchResponse object
    Global agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count
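
What sets a global bucket apart is its scope: it covers every document in the searched indices and is not narrowed by the search query. The plain-Java sketch below models only that difference. The class name `GlobalScopeSketch`, the method names, and the sample data are assumptions for illustration, not part of any Elasticsearch API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of search scope; this is not Elasticsearch code.
class GlobalScopeSketch {

    // Doc count seen by an ordinary top-level aggregation:
    // only the documents that match the search query.
    static <T> long queryScopedCount(List<T> index, Predicate<T> query) {
        return index.stream().filter(query).count();
    }

    // Doc count of a global bucket: every document in the index,
    // so the query is deliberately ignored here.
    static <T> long globalCount(List<T> index, Predicate<T> query) {
        return index.size();
    }

    public static void main(String[] args) {
        // Hypothetical index in which each document is just its gender value.
        List<String> index = Arrays.asList("male", "female", "female");
        Predicate<String> query = "male"::equals;
        System.out.println("query scope: " + queryScopedCount(index, query));
        System.out.println("global:      " + globalCount(index, query));
    }
}
```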

    1.2 Filter aggregation

    AggregationBuilders
        .filter("agg", QueryBuilders.termQuery("gender", "male"));

    Usage

    import org.elasticsearch.search.aggregations.bucket.filter.Filter;
    // sr is your SearchResponse object
    Filter agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.3 Filters aggregation (similar to a grouping aggregation, but only the buckets of interest are selected)

    AggregationBuilder aggregation =
        AggregationBuilders
            .filters("agg",
                new FiltersAggregator.KeyedFilter("men", QueryBuilders.termQuery("gender", "male")),
                new FiltersAggregator.KeyedFilter("women", QueryBuilders.termQuery("gender", "female")));

    Usage

    import org.elasticsearch.search.aggregations.bucket.filters.Filters;
    // sr is your SearchResponse object
    Filters agg = sr.getAggregations().get("agg");
    
    // For each entry
    for (Filters.Bucket entry : agg.getBuckets()) {
        String key = entry.getKeyAsString();            // bucket key
        long docCount = entry.getDocCount();            // Doc count
        logger.info("key [{}], doc_count [{}]", key, docCount);
    }

    Result

    key [men], doc_count [4982]
    key [women], doc_count [5018]
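
The keyed-filter behaviour can be modelled in a few lines of plain Java. This is a sketch under assumptions: `KeyedFiltersSketch`, its `count` method, and the sample data are invented for illustration and do not call Elasticsearch. Each named filter is evaluated on its own, so a document can land in several buckets or in none.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of keyed filter buckets; this is not Elasticsearch code.
class KeyedFiltersSketch {

    // Counts, for every named filter, how many documents satisfy it.
    // Filters are evaluated independently of each other.
    static <T> Map<String, Long> count(List<T> docs, Map<String, Predicate<T>> filters) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Map.Entry<String, Predicate<T>> filter : filters.entrySet()) {
            long docCount = docs.stream().filter(filter.getValue()).count();
            buckets.put(filter.getKey(), docCount);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: each document is just its gender value.
        List<String> genders = Arrays.asList("male", "female", "female", "unknown");

        Map<String, Predicate<String>> filters = new LinkedHashMap<>();
        filters.put("men", "male"::equals);
        filters.put("women", "female"::equals);

        count(genders, filters).forEach((key, docCount) ->
                System.out.println("key [" + key + "], doc_count [" + docCount + "]"));
    }
}
```

In this sample the "unknown" document matches neither filter, so it is not counted anywhere, which is the sense in which only the buckets of interest are selected.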

    1.4 Missing aggregation

    AggregationBuilders.missing("agg").field("gender");

    Usage

    import org.elasticsearch.search.aggregations.bucket.missing.Missing;
    // sr is your SearchResponse object
    Missing agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count
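
As a rough mental model, the missing bucket collects the documents that have no value for the given field. The sketch below shows that in plain Java, with documents represented as maps. `MissingSketch`, `doc`, and `missingCount` are hypothetical names introduced for this illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a "missing" bucket; this is not Elasticsearch code.
class MissingSketch {

    // Hypothetical helper that builds a document; a null gender means
    // the field is left out of the document entirely.
    static Map<String, Object> doc(String name, String gender) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("name", name);
        if (gender != null) {
            fields.put("gender", gender);
        }
        return fields;
    }

    // Number of documents that carry no value for the given field.
    static long missingCount(List<Map<String, Object>> docs, String field) {
        return docs.stream().filter(d -> d.get(field) == null).count();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                doc("a", "male"),
                doc("b", null),
                doc("c", "female"));
        System.out.println("missing gender: " + missingCount(docs, "gender"));
    }
}
```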

    1.5 Nested aggregation

    AggregationBuilders.nested("agg", "resellers");

    Usage

    import org.elasticsearch.search.aggregations.bucket.nested.Nested;
    // sr is your SearchResponse object
    Nested agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.6 Reverse nested aggregation

    AggregationBuilder aggregation =
        AggregationBuilders
            .nested("agg", "resellers")
            .subAggregation(
                    AggregationBuilders
                            .terms("name").field("resellers.name")
                            .subAggregation(
                                    AggregationBuilders
                                            .reverseNested("reseller_to_product")
                            )
            );

    Usage

    import org.elasticsearch.search.aggregations.bucket.nested.Nested;
    import org.elasticsearch.search.aggregations.bucket.nested.ReverseNested;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;
    // sr is your SearchResponse object
    Nested agg = sr.getAggregations().get("agg");
    Terms name = agg.getAggregations().get("name");
    for (Terms.Bucket bucket : name.getBuckets()) {
        ReverseNested resellerToProduct = bucket.getAggregations().get("reseller_to_product");
        resellerToProduct.getDocCount(); // Doc count
    }
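
The counts returned at each level are easy to confuse, so here is a plain-Java model of them. It assumes a hypothetical `Product` document that holds a list of reseller names as its nested objects. All names and data are invented for illustration and nothing here calls Elasticsearch. The nested level counts reseller objects, while the reverse nested level, read per reseller name, counts the parent products.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of nested vs. reverse nested counts; not Elasticsearch code.
class ReverseNestedSketch {

    // Hypothetical parent document with its nested reseller names.
    static final class Product {
        final String name;
        final List<String> resellerNames;

        Product(String name, String... resellerNames) {
            this.name = name;
            this.resellerNames = Arrays.asList(resellerNames);
        }
    }

    // Nested level: counts the nested reseller objects across all products.
    static long nestedDocCount(List<Product> products) {
        return products.stream().mapToLong(p -> p.resellerNames.size()).sum();
    }

    // Terms on the reseller name followed by a step back to the parent:
    // for each reseller name, the number of products that list it.
    static Map<String, Long> productsPerReseller(List<Product> products) {
        Map<String, Long> counts = new TreeMap<>();
        for (Product product : products) {
            // The set makes a product count once per reseller name.
            for (String reseller : new HashSet<>(product.resellerNames)) {
                counts.merge(reseller, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
                new Product("shirt", "acme", "bolt"),
                new Product("hat", "acme"));
        System.out.println("nested doc count: " + nestedDocCount(products));
        System.out.println("products per reseller: " + productsPerReseller(products));
    }
}
```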

    1.7 Children aggregation

    AggregationBuilder aggregation = AggregationBuilders.children("agg", "reseller");

    Usage

    import org.elasticsearch.search.aggregations.bucket.children.Children;
    // sr is your SearchResponse object
    Children agg = sr.getAggregations().get("agg");
    agg.getDocCount(); // Doc count

    1.8 Terms aggregation (group by a field)

    AggregationBuilders.terms("genders").field("gender");

    Usage

    import org.elasticsearch.search.aggregations.bucket.terms.Terms;
    // sr is your SearchResponse object
    Terms genders = sr.getAggregations().get("genders");
    
    // For each entry
    for (Terms.Bucket entry : genders.getBuckets()) {
        entry.getKey();      // Term
        entry.getDocCount(); // Doc count
    }

    1.9 Ordering (Order)

    Order the buckets by doc_count, ascending:

    AggregationBuilders
        .terms("genders")
        .field("gender")
        .order(Terms.Order.count(true));

    Order the buckets alphabetically by term, ascending:

    AggregationBuilders
        .terms("genders")
        .field("gender")
        .order(Terms.Order.term(true));

    Order the buckets by a single-value metric sub-aggregation (identified by its aggregation name):

    AggregationBuilders
        .terms("genders")
        .field("gender")
        .order(Terms.Order.aggregation("avg_height", false))
        .subAggregation(
            AggregationBuilders.avg("avg_height").field("height")
        );
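
The three orderings above can be pictured as three comparators over the finished buckets. The boolean passed to the `Terms.Order` methods is the ascending flag, so `false` means descending. The sketch below is a plain-Java model only: the `Bucket` type, the comparator constants, and the sample numbers are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java model of bucket ordering; this is not Elasticsearch code.
class BucketOrderSketch {

    // Hypothetical bucket: a term, its doc count, and one metric value.
    static final class Bucket {
        final String key;
        final long docCount;
        final double avgHeight;

        Bucket(String key, long docCount, double avgHeight) {
            this.key = key;
            this.docCount = docCount;
            this.avgHeight = avgHeight;
        }
    }

    // Models Terms.Order.count(true): doc count, ascending.
    static final Comparator<Bucket> COUNT_ASC = Comparator.comparingLong(b -> b.docCount);

    // Models Terms.Order.term(true): term, alphabetically ascending.
    static final Comparator<Bucket> TERM_ASC = Comparator.comparing(b -> b.key);

    // Models Terms.Order.aggregation("avg_height", false): metric, descending.
    static final Comparator<Bucket> AVG_HEIGHT_DESC =
            Comparator.comparingDouble((Bucket b) -> b.avgHeight).reversed();

    // Returns the bucket keys in the order given by the comparator.
    static List<String> keysInOrder(List<Bucket> buckets, Comparator<Bucket> order) {
        return buckets.stream().sorted(order).map(b -> b.key).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Bucket> buckets = Arrays.asList(
                new Bucket("male", 3, 1.76),
                new Bucket("female", 5, 1.63),
                new Bucket("other", 1, 1.70));
        System.out.println(keysInOrder(buckets, COUNT_ASC));
        System.out.println(keysInOrder(buckets, TERM_ASC));
        System.out.println(keysInOrder(buckets, AVG_HEIGHT_DESC));
    }
}
```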

    1.10 Range aggregation

    AggregationBuilder aggregation =
            AggregationBuilders
                    .range("agg")
                    .field("height")
                    .addUnboundedTo(1.0f)               // from -infinity to 1.0 (excluded)
                    .addRange(1.0f, 1.5f)               // from 1.0 to 1.5 (excluded)
                    .addUnboundedFrom(1.5f);            // from 1.5 to +infinity

    Usage

    import org.elasticsearch.search.aggregations.bucket.range.Range;
    // sr is your SearchResponse object
    Range agg = sr.getAggregations().get("agg");
    
    // For each entry
    for (Range.Bucket entry : agg.getBuckets()) {
        String key = entry.getKeyAsString();             // Range as key
        Number from = (Number) entry.getFrom();          // Bucket from
        Number to = (Number) entry.getTo();              // Bucket to
        long docCount = entry.getDocCount();    // Doc count
    
        logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, from, to, docCount);
    }

    Result

    key [*-1.0], from [-Infinity], to [1.0], doc_count [9]
    key [1.0-1.5], from [1.0], to [1.5], doc_count [21]
    key [1.5-*], from [1.5], to [Infinity], doc_count [20]
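
The boundary rule in the builder comments (lower bound included, upper bound excluded) is the part most easily misread. The plain-Java sketch below reproduces it for the three height ranges. `RangeSketch`, its `Range` type, and the sample heights are assumptions made for illustration and do not use the Elasticsearch API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of range buckets; this is not Elasticsearch code.
class RangeSketch {

    // Hypothetical range: "from" is inclusive, "to" is exclusive.
    static final class Range {
        final String key;
        final double from;
        final double to;

        Range(String key, double from, double to) {
            this.key = key;
            this.from = from;
            this.to = to;
        }

        boolean contains(double value) {
            return value >= from && value < to;
        }
    }

    // The same three ranges as in the builder above.
    static final List<Range> HEIGHT_RANGES = Arrays.asList(
            new Range("*-1.0", Double.NEGATIVE_INFINITY, 1.0),
            new Range("1.0-1.5", 1.0, 1.5),
            new Range("1.5-*", 1.5, Double.POSITIVE_INFINITY));

    // Counts how many values fall into each range, keeping range order.
    static Map<String, Long> count(double[] values, List<Range> ranges) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Range range : ranges) {
            long docCount = Arrays.stream(values).filter(range::contains).count();
            buckets.put(range.key, docCount);
        }
        return buckets;
    }

    public static void main(String[] args) {
        double[] heights = {0.9, 1.0, 1.49, 1.5, 2.0};
        count(heights, HEIGHT_RANGES).forEach((key, docCount) ->
                System.out.println("key [" + key + "], doc_count [" + docCount + "]"));
    }
}
```

A height of exactly 1.0 falls into the "1.0-1.5" bucket and a height of exactly 1.5 falls into "1.5-*", because the upper bound of each range is excluded.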

    1.11 Date range aggregation

    AggregationBuilder aggregation =
            AggregationBuilders
                    .dateRange("agg")
                    .field("dateOfBirth")
                    .format("yyyy")
                    .addUnboundedTo("1950")    // from -infinity to 1950 (excluded)
                    .addRange("1950", "1960")  // from 1950 to 1960 (excluded)
                    .addUnboundedFrom("1960"); // from 1960 to +infinity

    Usage

    import org.elasticsearch.search.aggregations.bucket.range.Range;
    import org.joda.time.DateTime;
    // sr is your SearchResponse object
    Range agg = sr.getAggregations().get("agg");
    
    // For each entry
    for (Range.Bucket entry : agg.getBuckets()) {
        String key = entry.getKeyAsString();                // Date range as key
        DateTime fromAsDate = (DateTime) entry.getFrom();   // Date bucket from as a Date
        DateTime toAsDate = (DateTime) entry.getTo();       // Date bucket to as a Date
        long docCount = entry.getDocCount();                // Doc count
    
        logger.info("key [{}], from [{}], to [{}], doc_count [{}]", key, fromAsDate, toAsDate, docCount);
    }

    Result

    key [*-1950], from [null], to [1950-01-01T00:00:00.000Z], doc_count [8]
    key [1950-1960], from [1950-01-01T00:00:00.000Z], to [1960-01-01T00:00:00.000Z], doc_count [5]
    key [1960-*], from [1960-01-01T00:00:00.000Z], to [null], doc_count [37]
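
The result shows that a "yyyy" bound such as "1950" resolves to the first instant of that year. The sketch below models the resulting bucketing with `java.time` at day granularity. It is an illustration under assumptions: `DateRangeSketch`, `yearBound`, and `bucketKey` are hypothetical names, and the code does not use Elasticsearch or Joda-Time.

```java
import java.time.LocalDate;
import java.time.Year;

// Plain-Java model of date range buckets; this is not Elasticsearch code.
class DateRangeSketch {

    // Resolves a "yyyy" bound to the first day of that year.
    static LocalDate yearBound(String yyyy) {
        return Year.parse(yyyy).atDay(1);
    }

    // Returns the key of the bucket a date of birth falls into.
    // Lower bounds are inclusive, upper bounds are exclusive.
    static String bucketKey(LocalDate dateOfBirth) {
        LocalDate lower = yearBound("1950");
        LocalDate upper = yearBound("1960");
        if (dateOfBirth.isBefore(lower)) {
            return "*-1950";
        }
        if (dateOfBirth.isBefore(upper)) {
            return "1950-1960";
        }
        return "1960-*";
    }

    public static void main(String[] args) {
        System.out.println(bucketKey(LocalDate.of(1949, 12, 31)));
        System.out.println(bucketKey(LocalDate.of(1950, 1, 1)));
        System.out.println(bucketKey(LocalDate.of(1960, 1, 1)));
    }
}
```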

    More bucket aggregations are available, such as significantTerms, IP range, histogram, date histogram, and geo distance aggregations.

  • Original article: https://www.cnblogs.com/bjlhx/p/8514127.html