  • Source-code notes on groupingBy in Collectors (Java Stream API)

         * Returns a {@code Collector} implementing a cascaded "group by" operation
         * on input elements of type {@code T}, grouping elements according to a
         * classification function, and then performing a reduction operation on
         * the values associated with a given key using the specified downstream
         * {@code Collector}.  The {@code Map} produced by the Collector is created
         * with the supplied factory function.
         * <p>The classification function maps elements to some key type {@code K}.
         * The downstream collector operates on elements of type {@code T} and
         * produces a result of type {@code D}. The resulting collector produces a
         * {@code Map<K, D>}.
         * <p>For example, to compute the set of last names of people in each city,
         * where the city names are sorted:
         * <pre>{@code
         * Map<City, Set<String>> namesByCity
         *   = people.stream().collect(
         *     groupingBy(Person::getCity,             // classifier: the classification function
         *                TreeMap::new,                // mapFactory: factory for the result container
         *                mapping(Person::getLastName, // downstream: the downstream collector
         *                        toSet())));
         * }</pre>
         * @implNote
         * The returned {@code Collector} is not concurrent.  For parallel stream
         * pipelines, the {@code combiner} function operates by merging the keys
         * from one map into another, which can be an expensive operation.  If
         * preservation of the order in which elements are presented to the downstream
         * collector is not required, using {@link #groupingByConcurrent(Function, Supplier, Collector)}
         * may offer better parallel performance.
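         Note: if the order in which elements reach the downstream collector does not need to be preserved,
         groupingByConcurrent is the recommended choice for parallel streams; it may perform better than
         this collector there because it avoids the expensive map-merging combiner.
         * @param <T> the type of the input elements
         * @param <K> the type of the keys (the key type of the resulting map)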
         * @param <A> the intermediate accumulation type of the downstream collector
         * @param <D> the result type of the downstream reduction
         * @param <M> the type of the resulting {@code Map}
         * @param classifier a classifier function mapping input elements to keys
         * @param downstream a {@code Collector} implementing the downstream reduction
         * @param mapFactory a supplier providing a new empty {@code Map}
         *                   into which the results will be inserted
         * @return a {@code Collector} implementing the cascaded group-by operation
         * @see #groupingBy(Function, Collector)
         * @see #groupingBy(Function)
         * @see #groupingByConcurrent(Function, Supplier, Collector)
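         A: the container type used by the downstream collector's accumulator (the accumulator's first argument).
         D: the result type of the downstream collector. When the downstream collector has no real finisher
            (IDENTITY_FINISH), A and D are the same type and A is simply cast to D.
         M: the type of the final result, a subtype of Map<K, D>.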
        public static <T, K, D, A, M extends Map<K, D>> // note: there are five type parameters here
        Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier,
                                      Supplier<M> mapFactory,
                                      Collector<? super T, A, D> downstream) {
            Supplier<A> downstreamSupplier = downstream.supplier();
            BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
            BiConsumer<Map<K, A>, T> accumulator = (m, t) -> {
                // The value returned by the classifier becomes the key in the final map.
                K key = Objects.requireNonNull(classifier.apply(t), "element cannot be mapped to a null key");
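                // Fetch the container for this key, creating it with the downstream supplier if absent;
                // this container is the value stored in the map.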
                A container = m.computeIfAbsent(key, k -> downstreamSupplier.get());
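                // Feed the element into that container with the downstream accumulator. In effect the
                // downstream accumulation is wrapped so that it becomes the accumulator of the resulting
                // collector and builds the intermediate Map<K, A> that collector needs.
                downstreamAccumulator.accept(container, t);
            };
            // Wrap the downstream combiner in a map-level combiner. It merges the maps produced by the
            // accumulator above, so the merged type is Map<K, A>.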
            BinaryOperator<Map<K, A>> merger = Collectors.<K, A, Map<K, A>>mapMerger(downstream.combiner());
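            // Unchecked cast of mapFactory from a supplier of M (a Map<K, D>) to a supplier of Map<K, A>.
            // The same map instance holds A values during accumulation and D values after finishing.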
            Supplier<Map<K, A>> mangledFactory = (Supplier<Map<K, A>>) mapFactory;
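            // If the downstream characteristics contain IDENTITY_FINISH, the downstream intermediate
            // result is already its final result, so no finisher has to be applied.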
            if (downstream.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)) {
                return new CollectorImpl<>(mangledFactory, accumulator, merger, CH_ID);
            } else {
                Function<A, A> downstreamFinisher = (Function<A, A>) downstream.finisher();
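                // Apply the cast downstream finisher to every value of the merged intermediate map.
                // `intermediate` is that single Map<K, A>; once every value has been replaced in place,
                // the map actually holds D values and is cast to M (a Map<K, D>).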
                Function<Map<K, A>, M> finisher = intermediate -> {
                    // Only the values are replaced; each one is a D afterwards, typed as A because of the cast.
                    intermediate.replaceAll((k, v) -> downstreamFinisher.apply(v));
                    M castResult = (M) intermediate;
                    return castResult;
                };
                return new CollectorImpl<>(mangledFactory, accumulator, merger, finisher, CH_NOID);
            }
        }

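To make the walk-through above concrete, here is a minimal sketch of the same three steps (accumulate per key, merge maps, finish each value) built on the public Collector.of factory. The names GroupingBySketch, simpleGroupingBy, Person and demo are hypothetical and exist only for this illustration. Unlike the JDK code, the sketch avoids unchecked casts by accumulating into a HashMap and copying the finished values into the map supplied by mapFactory, so it is a simplification, not the real implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class GroupingBySketch {

    /** Hypothetical simplified groupingBy; an illustration, not the JDK implementation. */
    static <T, K, A, D, M extends Map<K, D>> Collector<T, ?, M> simpleGroupingBy(
            Function<? super T, ? extends K> classifier,
            Supplier<M> mapFactory,
            Collector<? super T, A, D> downstream) {
        Supplier<A> downstreamSupplier = downstream.supplier();
        BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
        BinaryOperator<A> downstreamCombiner = downstream.combiner();
        Function<A, D> downstreamFinisher = downstream.finisher();

        // Step 1: route each element to the downstream container of its key.
        BiConsumer<Map<K, A>, T> accumulator = (m, t) -> {
            K key = Objects.requireNonNull(classifier.apply(t),
                    "element cannot be mapped to a null key");
            A container = m.computeIfAbsent(key, k -> downstreamSupplier.get());
            downstreamAccumulator.accept(container, t);
        };

        // Step 2: merge the right map into the left one, combining containers on duplicate keys.
        BinaryOperator<Map<K, A>> combiner = (left, right) -> {
            right.forEach((k, v) -> left.merge(k, v, downstreamCombiner));
            return left;
        };

        // Step 3: finish every container. Simplification: copy into a fresh result map
        // instead of replacing values in place and casting, as the JDK does.
        Function<Map<K, A>, M> finisher = intermediate -> {
            M result = mapFactory.get();
            intermediate.forEach((k, v) -> result.put(k, downstreamFinisher.apply(v)));
            return result;
        };

        return Collector.of(HashMap::new, accumulator, combiner, finisher);
    }

    /** Hypothetical element type for the demo. */
    static final class Person {
        private final String lastName;
        private final String city;

        Person(String lastName, String city) {
            this.lastName = lastName;
            this.city = city;
        }

        String getLastName() { return lastName; }

        String getCity() { return city; }
    }

    /** Groups last names by city, with the cities sorted by the TreeMap. */
    static Map<String, Set<String>> demo() {
        List<Person> people = List.of(
                new Person("Smith", "Paris"),
                new Person("Jones", "Berlin"),
                new Person("Brown", "Paris"));
        return people.stream().collect(
                simpleGroupingBy(Person::getCity,
                                 TreeMap::new,
                                 Collectors.mapping(Person::getLastName, Collectors.toSet())));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The copy in step 3 costs one extra map, which is presumably why the JDK version prefers the in-place replaceAll plus cast described above.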
         * {@code BinaryOperator<Map>} that merges the contents of its right
         * argument into its left argument, using the provided merge function to
         * handle duplicate keys.
         * @param <K> type of the map keys
         * @param <V> type of the map values
         * @param <M> type of the map
         * @param mergeFunction A merge function suitable for
         * {@link Map#merge(Object, Object, BiFunction) Map.merge()}
         * @return a merge function for two maps
        private static <K, V, M extends Map<K,V>>
        BinaryOperator<M> mapMerger(BinaryOperator<V> mergeFunction) {
            return (m1, m2) -> {
                for (Map.Entry<K,V> e : m2.entrySet())
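                    // If the left map has no value for a key coming from the right map, the right value
                    // is used as is; otherwise the merge function combines both values, so duplicate
                    // keys never collide.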
                    m1.merge(e.getKey(), e.getValue(), mergeFunction);
                return m1;
            };
        }
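The merge behaviour described above can be tried out in isolation. Collectors.mapMerger is private, so the sketch below uses a hypothetical public copy of the same loop (the class MapMergerSketch and its demo method are assumptions made for this illustration) and merges two small maps with Integer::sum as the merge function.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

final class MapMergerSketch {

    /** Hypothetical stand-in for the private Collectors.mapMerger, using the same merge loop. */
    static <K, V, M extends Map<K, V>> BinaryOperator<M> mapMerger(BinaryOperator<V> mergeFunction) {
        return (m1, m2) -> {
            for (Map.Entry<K, V> e : m2.entrySet()) {
                // Absent key: take the right value. Present key: combine both values.
                m1.merge(e.getKey(), e.getValue(), mergeFunction);
            }
            return m1;
        };
    }

    static Map<String, Integer> demo() {
        Map<String, Integer> left = new HashMap<>(Map.of("a", 1, "b", 2));
        Map<String, Integer> right = new HashMap<>(Map.of("b", 10, "c", 20));
        // Explicit type arguments, in the same style as the call inside groupingBy.
        BinaryOperator<Map<String, Integer>> merger =
                MapMergerSketch.<String, Integer, Map<String, Integer>>mapMerger(Integer::sum);
        return merger.apply(left, right); // left now maps a to 1, b to 12, c to 20
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the key "b" exists on both sides, so it is the only entry that goes through the merge function. In groupingBy the merge function is the downstream combiner, which is why merging many large per-thread maps can be costly in a parallel stream.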
