  • Learning Java 8: Stream internals, part 1 (study notes)

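    How Stream is implemented under the hood

    The Stream interface extends the BaseStream interface, so let's look at the definition of BaseStream first.

    BaseStream

    BaseStream is the parent interface of all stream types.

    Let's read through its Javadoc to learn every method it provides.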

    /**
     * Base interface for streams, which are sequences of elements supporting
     * sequential and parallel aggregate operations.  The following example
     * illustrates an aggregate operation using the stream types {@link Stream}
     * and {@link IntStream}, computing the sum of the weights of the red widgets:
     *
     * <pre>{@code
     *     int sum = widgets.stream()
     *                      .filter(w -> w.getColor() == RED)
     *                      .mapToInt(w -> w.getWeight())
     *                      .sum();
     * }</pre>
     *
     * See the class documentation for {@link Stream} and the package documentation
     * for <a href="package-summary.html">java.util.stream</a> for additional
     * specification of streams, stream operations, stream pipelines, and
     * parallelism, which governs the behavior of all stream types.
     
     
     *
     * @param <T> the type of the stream elements
     * @param <S> the type of of the stream implementing {@code BaseStream}
     (Note: S is the stream type that intermediate operations return.)
     
     * @since 1.8
     * @see Stream
     * @see IntStream
     * @see LongStream
     * @see DoubleStream
     * @see <a href="package-summary.html">java.util.stream</a>
     */
    
    public interface BaseStream<T, S extends BaseStream<T, S>>
            extends AutoCloseable {
        /**
         * Returns an iterator for the elements of this stream.
         *
         * <p>This is a <a href="package-summary.html#StreamOps">terminal
         * operation</a>.
         *
         * @return the element iterator for this stream
         */
        Iterator<T> iterator(); // an iterator over the stream's elements of type T
    
        /**
         * Returns a spliterator for the elements of this stream.
         *
         * <p>This is a <a href="package-summary.html#StreamOps">terminal
         * operation</a>.
         *
         * @return the element spliterator for this stream
         */
        Spliterator<T> spliterator(); // a splittable iterator (spliterator), the core abstraction behind streams
    
        /**
         * Returns whether this stream, if a terminal operation were to be executed,
         * would execute in parallel.  Calling this method after invoking an
         * terminal stream operation method may yield unpredictable results.
         *
         * @return {@code true} if this stream would execute in parallel if executed
         */
        boolean isParallel(); // whether the stream would execute in parallel
    
        /**
         * Returns an equivalent stream that is sequential.  May return
         * itself, either because the stream was already sequential, or because
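         * the underlying stream state was modified to be sequential.
         Returns an equivalent sequential stream. This may be the stream itself, either because it was already sequential or because its state was switched to sequential.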
         *
         * <p>This is an <a href="package-summary.html#StreamOps">intermediate
         * operation</a>.
         *
         * @return a sequential stream
         */
        S sequential();   // returns S, the stream type (possibly this same stream object)
    
        /**
         * Returns an equivalent stream that is parallel.  May return
         * itself, either because the stream was already parallel, or because
         * the underlying stream state was modified to be parallel.
         *
         * <p>This is an <a href="package-summary.html#StreamOps">intermediate
         * operation</a>.
         *
         * @return a parallel stream
         */
        S parallel();
    
        /**
         * Returns an equivalent stream that is
         * <a href="package-summary.html#Ordering">unordered</a>.  May return
         * itself, either because the stream was already unordered, or because
         * the underlying stream state was modified to be unordered.
         *
         * <p>This is an <a href="package-summary.html#StreamOps">intermediate
         * operation</a>.
         *
         * @return an unordered stream
         */
        S unordered();
    
        /**
         * Returns an equivalent stream with an additional close handler.  Close
         * handlers are run when the {@link #close()} method
         * is called on the stream, and are executed in the order they were
         * added.  All close handlers are run, even if earlier close handlers throw
         * exceptions.  If any close handler throws an exception, the first
         * exception thrown will be relayed to the caller of {@code close()}, with
         * any remaining exceptions added to that exception as suppressed exceptions
         * (unless one of the remaining exceptions is the same exception as the
         * first exception, since an exception cannot suppress itself.)  May
         * return itself.
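         Returns a stream that carries an additional close handler. Close handlers run when close() is called on the stream,
         in the order in which they were added.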
         *
         * <p>This is an <a href="package-summary.html#StreamOps">intermediate
         * operation</a>.
         *
         * @param closeHandler A task to execute when the stream is closed
         * @return a stream with a handler that is run if the stream is closed
         */
        S onClose(Runnable closeHandler);
    
        /**
         * Closes this stream, causing all close handlers for this stream pipeline
         * to be called.
         *
         * @see AutoCloseable#close()
         */
        @Override
        void close();
    }
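
To make the interface above more concrete, here is a minimal sketch that exercises `isParallel()`, `parallel()`, `sequential()` and `iterator()` on an ordinary list stream. The class name `BaseStreamDemo` and the helper `drainWithIterator` are made up for illustration; only the `BaseStream` methods themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

public class BaseStreamDemo {

    // Hypothetical helper: collects the elements through the terminal operation iterator().
    static List<String> drainWithIterator(Stream<String> stream) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = stream.iterator();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "hello world");

        Stream<String> sequential = list.stream();
        System.out.println(sequential.isParallel());             // false

        // parallel() and sequential() are intermediate operations;
        // always continue with the stream they return.
        Stream<String> parallel = sequential.parallel();
        System.out.println(parallel.isParallel());               // true
        System.out.println(parallel.sequential().isParallel());  // false

        System.out.println(drainWithIterator(list.stream()));    // [hello, world, hello world]
    }
}
```

Because the Javadoc only says these methods "may return itself", the sketch never relies on whether the returned stream is the same object as the original.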
    

    A closer look at onClose close handlers

    public class StreamTest2 {
        public static void main(String[] args) {
            List<String> list = Arrays.asList("hello", "world", "hello world");
    //        list.stream().onClose(()-> System.out.println("aaa")).onClose(()-> System.out.println("bbb")).forEach(System.out::println);
    
            try (Stream<String> stream = list.stream()){
                stream.onClose(()-> {
                    System.out.println("aaa");
                    throw new NullPointerException("first Exception");
                }).onClose(()->{
                    System.out.println("bbb");
                    throw new ArithmeticException("first Exception");
                }).forEach(System.out::println);
            }
        }
    }
    
    Exception in thread "main" java.lang.NullPointerException: first Exception
    	at com.dawa.jdk8.StreamTest2.lambda$main$0(StreamTest2.java:21)
    	at java.util.stream.Streams$1.run(Streams.java:850)
    	at java.util.stream.AbstractPipeline.close(AbstractPipeline.java:323)
    	at com.dawa.jdk8.StreamTest2.main(StreamTest2.java:26)
    	Suppressed: java.lang.ArithmeticException: first Exception
    		at com.dawa.jdk8.StreamTest2.lambda$main$1(StreamTest2.java:24)
    		at java.util.stream.Streams$1.run(Streams.java:854)
    		... 2 more
    

    Possible outcomes

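    1. The output shows that both close handlers ran: aaa and bbb are printed.
    2. The second exception shows up as a suppressed exception of the first.
    3. If both handlers throw the very same exception object, the second one is not added as suppressed, because an exception cannot suppress itself.
    4. If they throw two distinct objects of the same exception type, the second one is still suppressed.
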
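    Note: when you run into a question, the Javadoc has usually already answered it clearly. What is within everyone's reach is often what gets overlooked most.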

    Stream source code analysis

    Start from an example and step through the source.

    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "hello world");
        list.stream().forEach(System.out::println);
    }
    

    1. stream()

    It is a default method of the Collection interface.

         /**
         * Returns a sequential {@code Stream} with this collection as its source.
         *
         * <p>This method should be overridden when the {@link #spliterator()}
         * method cannot return a spliterator that is {@code IMMUTABLE},
         * {@code CONCURRENT}, or <em>late-binding</em>. (See {@link #spliterator()}
         * for details.)
         When spliterator() cannot return a spliterator that is IMMUTABLE, CONCURRENT, or late-binding,
         this method should be overridden.
         *
         * @implSpec
         * The default implementation creates a sequential {@code Stream} from the
         * collection's {@code Spliterator}.
         The default implementation builds a sequential stream from the collection's Spliterator.
         *
         * @return a sequential {@code Stream} over the elements in this collection
         * @since 1.8
         */
        default Stream<E> stream() {
            return StreamSupport.stream(spliterator(), false);
        }
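
As a rough illustration of what this default method does, the sketch below builds a stream by calling `StreamSupport.stream(spliterator, false)` directly and compares it with `list.stream()`. The class `StreamSupportDemo` and the helper `streamOf` are illustrative names, not part of the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class StreamSupportDemo {

    // Hypothetical helper: builds a sequential stream the same way the
    // default Collection.stream() does, from the source's spliterator.
    static <E> Stream<E> streamOf(Iterable<E> source) {
        return StreamSupport.stream(source.spliterator(), false);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "hello world");

        List<String> viaDefault = list.stream().collect(Collectors.toList());
        List<String> viaSupport = streamOf(list).collect(Collectors.toList());
        System.out.println(viaDefault.equals(viaSupport));   // true

        // The boolean argument selects sequential (false) or parallel (true).
        System.out.println(StreamSupport.stream(list.spliterator(), true).isParallel()); // true
    }
}
```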
    

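    So we first need to understand how spliterator() is implemented.

    The source of spliterator()

    Like stream(), it is a default method in the Collection interface (it overrides the default in Iterable, hence the @Override).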

    /**
         * Creates a {@link Spliterator} over the elements in this collection.
         *
         * Implementations should document characteristic values reported by the
         * spliterator.  Such characteristic values are not required to be reported
         * if the spliterator reports {@link Spliterator#SIZED} and this collection
         * contains no elements.
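         In other words: if the spliterator reports Spliterator#SIZED and the collection is empty, the characteristic values need not be reported.
         
         (Note: compare these with the Characteristics of a Collector.)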
         
         *
         * <p>The default implementation should be overridden by subclasses that
         * can return a more efficient spliterator.  In order to
         * preserve expected laziness behavior for the {@link #stream()} and
         * {@link #parallelStream()}} methods, spliterators should either have the
         * characteristic of {@code IMMUTABLE} or {@code CONCURRENT}, or be
         * <em><a href="Spliterator.html#binding">late-binding</a></em>.
         * If none of these is practical, the overriding class should describe the
         * spliterator's documented policy of binding and structural interference,
         * and should override the {@link #stream()} and {@link #parallelStream()}
         * methods to create streams using a {@code Supplier} of the spliterator,
         * as in:
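         Subclasses that can return a more efficient spliterator should override the default implementation.
         To preserve the expected laziness of stream() and parallelStream(), the spliterator should be
         {@code IMMUTABLE} or {@code CONCURRENT}, or be late-binding.
         If none of these is practical, the overriding class should document the spliterator's policy on binding
         and structural interference, and override stream() and parallelStream() to create streams
         from a Supplier of the spliterator, as follows.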
         
         * <pre>{@code
         *     Stream<E> s = StreamSupport.stream(() -> spliterator(), spliteratorCharacteristics)
         * }</pre>
         * <p>These requirements ensure that streams produced by the
         * {@link #stream()} and {@link #parallelStream()} methods will reflect the
         * contents of the collection as of initiation of the terminal stream
         * operation.
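         These requirements ensure that streams produced by these two methods reflect the contents of the collection as of the moment the terminal operation starts.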
         *
         * @implSpec
         * The default implementation creates a
         * <em><a href="Spliterator.html#binding">late-binding</a></em> spliterator
         * from the collections's {@code Iterator}.  The spliterator inherits the
         * <em>fail-fast</em> properties of the collection's iterator.
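         The default implementation creates a late-binding spliterator from the collection's iterator. The spliterator inherits the iterator's fail-fast behavior.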
         
         * <p>
         * The created {@code Spliterator} reports {@link Spliterator#SIZED}.
         The created spliterator reports the Spliterator#SIZED characteristic (its size is known).
         *
         * @implNote
         * The created {@code Spliterator} additionally reports
         * {@link Spliterator#SUBSIZED}.
         It additionally reports Spliterator#SUBSIZED (spliterators split off from it are also sized).
         
         *
         * <p>If a spliterator covers no elements then the reporting of additional
         * characteristic values, beyond that of {@code SIZED} and {@code SUBSIZED},
         * does not aid clients to control, specialize or simplify computation.
         * However, this does enable shared use of an immutable and empty
         * spliterator instance (see {@link Spliterators#emptySpliterator()}) for
         * empty collections, and enables clients to determine if such a spliterator
         * covers no elements.
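         If a spliterator covers no elements, reporting characteristics beyond {@code SIZED} and {@code SUBSIZED} does not help clients control, specialize or simplify computation.
         It does, however, let empty collections share a single immutable, empty spliterator instance (see Spliterators#emptySpliterator()),
         and it lets clients determine whether such a spliterator covers no elements.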
         
         *
         * @return a {@code Spliterator} over the elements in this collection
         * @since 1.8
         */
        @Override
        default Spliterator<E> spliterator() {
            return Spliterators.spliterator(this, 0);
        }
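
The characteristics described in this Javadoc can be observed on a collection that does not override `spliterator()`. The sketch below uses a made-up `Bag` class built on `AbstractCollection` for that purpose; the expected values follow from the `@implSpec` and `@implNote` quoted above.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

public class DefaultSpliteratorDemo {

    // Hypothetical minimal collection. It does not override spliterator(),
    // so it inherits the default implementation from the Collection interface.
    static class Bag<E> extends AbstractCollection<E> {
        private final List<E> items;

        Bag(List<E> items) { this.items = items; }

        @Override public Iterator<E> iterator() { return items.iterator(); }
        @Override public int size() { return items.size(); }
    }

    public static void main(String[] args) {
        Spliterator<String> sp = new Bag<>(Arrays.asList("a", "b", "c")).spliterator();

        System.out.println(sp.hasCharacteristics(Spliterator.SIZED));     // true
        System.out.println(sp.hasCharacteristics(Spliterator.SUBSIZED));  // true
        System.out.println(sp.hasCharacteristics(Spliterator.ORDERED));   // false
        System.out.println(sp.estimateSize());                            // 3
    }
}
```

Concrete JDK collections usually override `spliterator()` and report more, which is why a custom class is used here instead of, say, `ArrayList`.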
    

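    So what exactly is a spliterator?

    What exactly is a spliterator: the Spliterator interface

    Just as Collector comes as the Collector interface plus the Collectors utility class, here there is a Spliterator interface plus a Spliterators utility class.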

    public final class Spliterators {}
    public interface Spliterator<T> {}
    

    Let's start with the Javadoc of the Spliterator interface.

    
    /**
     * An object for traversing and partitioning elements of a source.  The source
     * of elements covered by a Spliterator could be, for example, an array, a
     * {@link Collection}, an IO channel, or a generator function.
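     A spliterator is an object for traversing and partitioning the elements of a source.
     The source can be, for example, an array, a Collection, an IO channel, or a generator function.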
     *
     * <p>A Spliterator may traverse elements individually ({@link
     * #tryAdvance tryAdvance()}) or sequentially in bulk
     * ({@link #forEachRemaining forEachRemaining()}).
     A spliterator can traverse elements one at a time with tryAdvance(),
     or sequentially in bulk with forEachRemaining().
     
     *
     * <p>A Spliterator may also partition off some of its elements (using
     * {@link #trySplit}) as another Spliterator, to be used in
     * possibly-parallel operations.  Operations using a Spliterator that
     * cannot split, or does so in a highly imbalanced or inefficient
     * manner, are unlikely to benefit from parallelism.  Traversal
     * and splitting exhaust elements; each Spliterator is useful for only a single
     * bulk computation.
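     It can also use trySplit() to split off some of its elements as another Spliterator, for use in possibly-parallel operations.
     Operations on a spliterator that cannot split, or that splits in a highly imbalanced or inefficient way, are unlikely to benefit from parallelism.
     (For example, splitting 100 elements into 2 + 98 is highly imbalanced and throws away the advantage of parallelism.)
     
     Traversal and splitting consume elements, so each spliterator is useful for only a single bulk computation.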
     *
     * <p>A Spliterator also reports a set of {@link #characteristics()} of its
     * structure, source, and elements from among {@link #ORDERED},
     * {@link #DISTINCT}, {@link #SORTED}, {@link #SIZED}, {@link #NONNULL},
     * {@link #IMMUTABLE}, {@link #CONCURRENT}, and {@link #SUBSIZED}. These may
     * be employed by Spliterator clients to control, specialize or simplify
     * computation.  For example, a Spliterator for a {@link Collection} would
     * report {@code SIZED}, a Spliterator for a {@link Set} would report
     * {@code DISTINCT}, and a Spliterator for a {@link SortedSet} would also
     * report {@code SORTED}.  Characteristics are reported as a simple unioned bit
     * set.
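     A spliterator also reports a set of characteristics, chosen from:
      {@link #ORDERED}, 
      {@link #DISTINCT}, 
      {@link #SORTED}, 
      {@link #SIZED}, 
      {@link #NONNULL},
      {@link #IMMUTABLE}, 
      {@link #CONCURRENT}, 
      {@link #SUBSIZED}.
      Clients can use these to control, specialize or simplify computation.
      
      For example, a spliterator for a Collection reports SIZED,
      one for a Set reports DISTINCT, and one for a SortedSet also reports SORTED.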
     
     *
     * Some characteristics additionally constrain method behavior; for example if
     * {@code ORDERED}, traversal methods must conform to their documented ordering.
     * New characteristics may be defined in the future, so implementors should not
     * assign meanings to unlisted values.
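     New characteristics may be defined in the future, so implementors should not assign meanings to values that are not listed.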
     
     *
     * <p><a name="binding">A Spliterator that does not report {@code IMMUTABLE} or
     * {@code CONCURRENT} is expected to have a documented policy concerning:
     * when the spliterator <em>binds</em> to the element source; and detection of
     * structural interference of the element source detected after binding.</a>  A
     * <em>late-binding</em> Spliterator binds to the source of elements at the
     * point of first traversal, first split, or first query for estimated size,
     * rather than at the time the Spliterator is created.  A Spliterator that is
     * not <em>late-binding</em> binds to the source of elements at the point of
     * construction or first invocation of any method.  Modifications made to the
     * source prior to binding are reflected when the Spliterator is traversed.
     * After binding a Spliterator should, on a best-effort basis, throw
     * {@link ConcurrentModificationException} if structural interference is
     * detected.  Spliterators that do this are called <em>fail-fast</em>.  The
     * bulk traversal method ({@link #forEachRemaining forEachRemaining()}) of a
     * Spliterator may optimize traversal and check for structural interference
     * after all elements have been traversed, rather than checking per-element and
     * failing immediately.
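     A late-binding spliterator is not bound to its source when it is created. It binds at the first traversal, the first split, or the first query for estimated size. Modifications made to the source before binding are reflected in the traversal.
    
    After binding, a spliterator should throw ConcurrentModificationException, on a best-effort basis, when it detects structural interference.
    Spliterators that do this are called fail-fast.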
     
     *
     * <p>Spliterators can provide an estimate of the number of remaining elements
     * via the {@link #estimateSize} method.  Ideally, as reflected in characteristic
     * {@link #SIZED}, this value corresponds exactly to the number of elements
     * that would be encountered in a successful traversal.  However, even when not
     * exactly known, an estimated value value may still be useful to operations
     * being performed on the source, such as helping to determine whether it is
     * preferable to split further or traverse the remaining elements sequentially.
     *
     * <p>Despite their obvious utility in parallel algorithms, spliterators are not
     * expected to be thread-safe; instead, implementations of parallel algorithms
     * using spliterators should ensure that the spliterator is only used by one
     * thread at a time.  This is generally easy to attain via <em>serial
     * thread-confinement</em>, which often is a natural consequence of typical
     * parallel algorithms that work by recursive decomposition.  A thread calling
     * {@link #trySplit()} may hand over the returned Spliterator to another thread,
     * which in turn may traverse or further split that Spliterator.  The behaviour
     * of splitting and traversal is undefined if two or more threads operate
     * concurrently on the same spliterator.  If the original thread hands a
     * spliterator off to another thread for processing, it is best if that handoff
     * occurs before any elements are consumed with {@link #tryAdvance(Consumer)
     * tryAdvance()}, as certain guarantees (such as the accuracy of
     * {@link #estimateSize()} for {@code SIZED} spliterators) are only valid before
     * traversal has begun.
     *
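     serial thread-confinement: a given spliterator is used by only one thread at a time, and may be handed over from one thread to another.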
     
     * <p>Primitive subtype specializations of {@code Spliterator} are provided for
     * {@link OfInt int}, {@link OfLong long}, and {@link OfDouble double} values.
     * The subtype default implementations of
     * {@link Spliterator#tryAdvance(java.util.function.Consumer)}
     * and {@link Spliterator#forEachRemaining(java.util.function.Consumer)} box
     * primitive values to instances of their corresponding wrapper class.  Such
     * boxing may undermine any performance advantages gained by using the primitive
     * specializations.  To avoid boxing, the corresponding primitive-based methods
     * should be used.  
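     Primitive specializations of Spliterator are provided for int, long and double.
     In these subtypes, the default tryAdvance(Consumer) and forEachRemaining(Consumer)
     box each primitive value into its wrapper class, which can cancel out the
     performance benefit of the specialization. To avoid boxing, use the
     primitive-based methods instead.
     
     * For example,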
     * {@link Spliterator.OfInt#tryAdvance(java.util.function.IntConsumer)}
     * and {@link Spliterator.OfInt#forEachRemaining(java.util.function.IntConsumer)}
     * should be used in preference to
     * {@link Spliterator.OfInt#tryAdvance(java.util.function.Consumer)} and
     * {@link Spliterator.OfInt#forEachRemaining(java.util.function.Consumer)}.
     * Traversal of primitive values using boxing-based methods
     * {@link #tryAdvance tryAdvance()} and
     * {@link #forEachRemaining(java.util.function.Consumer) forEachRemaining()}
     * does not affect the order in which the values, transformed to boxed values,
     * are encountered.
    
     
     *
     * @apiNote
     * <p>Spliterators, like {@code Iterator}s, are for traversing the elements of
     * a source.  The {@code Spliterator} API was designed to support efficient
     * parallel traversal in addition to sequential traversal, by supporting
     * decomposition as well as single-element iteration.  In addition, the
     * protocol for accessing elements via a Spliterator is designed to impose
     * smaller per-element overhead than {@code Iterator}, and to avoid the inherent
     * race involved in having separate methods for {@code hasNext()} and
     * {@code next()}.
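     The Spliterator API is designed to support efficient parallel traversal in addition to sequential traversal,
     by supporting decomposition as well as single-element iteration.
     Its protocol for accessing elements is also designed to cost less per element than Iterator,
     and to avoid the race inherent in having separate {@code hasNext()} and {@code next()} methods:
     a single tryAdvance() call does the work of both.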
     
     
     *
     * <p>For mutable sources, arbitrary and non-deterministic behavior may occur if
     * the source is structurally interfered with (elements added, replaced, or
     * removed) between the time that the Spliterator binds to its data source and
     * the end of traversal.  For example, such interference will produce arbitrary,
     * non-deterministic results when using the {@code java.util.stream} framework.
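     For a mutable source, if it is structurally interfered with (elements added, replaced, or removed) between the moment the spliterator binds to it and the end of traversal,
     the behavior is arbitrary and non-deterministic.
     So when using the stream framework, the source must not be structurally modified while the pipeline is running.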
     
     *
     * <p>Structural interference of a source can be managed in the following ways
     * (in approximate order of decreasing desirability):
     Structural interference with a source can be managed in the following ways (roughly from most to least desirable):
     * <ul>
     * <li>The source cannot be structurally interfered with.
     That is, the source does not allow structural interference at all.
     * <br>For example, an instance of
     * {@link java.util.concurrent.CopyOnWriteArrayList} is an immutable source.
     * A Spliterator created from the source reports a characteristic of
     * {@code IMMUTABLE}.</li>
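    For example, CopyOnWriteArrayList is an immutable source as far as its spliterator is concerned.
    
    Every write copies the underlying array and then applies the change to the copy, which makes writes more expensive. It suits read-mostly workloads.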
     
     
     * <li>The source manages concurrent modifications.
     The source manages concurrent modification itself.
     * <br>For example, a key set of a {@link java.util.concurrent.ConcurrentHashMap}
     * is a concurrent source.  A Spliterator created from the source reports a
     * characteristic of {@code CONCURRENT}.</li>
     For example, the key set of a ConcurrentHashMap is a concurrent source, and its spliterator reports CONCURRENT.
     
     * <li>The mutable source provides a late-binding and fail-fast Spliterator.
     * <br>Late binding narrows the window during which interference can affect
     * the calculation; fail-fast detects, on a best-effort basis, that structural
     * interference has occurred after traversal has commenced and throws
     * {@link ConcurrentModificationException}.  For example, {@link ArrayList},
     * and many other non-concurrent {@code Collection} classes in the JDK, provide
     * a late-binding, fail-fast spliterator.</li>
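     The mutable source provides a late-binding and fail-fast spliterator.
     Late binding narrows the window during which interference can affect the computation.
     If interference is detected after traversal has begun, ConcurrentModificationException is thrown on a best-effort basis. ArrayList is an example.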
     
     * <li>The mutable source provides a non-late-binding but fail-fast Spliterator.
     * <br>The source increases the likelihood of throwing
     * {@code ConcurrentModificationException} since the window of potential
     * interference is larger.</li>
     
     
     * <li>The mutable source provides a late-binding and non-fail-fast Spliterator.
     * <br>The source risks arbitrary, non-deterministic behavior after traversal
     * has commenced since interference is not detected.
     
     
     * </li>
     * <li>The mutable source provides a non-late-binding and non-fail-fast
     * Spliterator.
     * <br>The source increases the risk of arbitrary, non-deterministic behavior
     * since non-detected interference may occur after construction.
     * </li>
     Summary:
     1. Is the source immutable or concurrent?
     2. If not, is its spliterator late-binding, and is it fail-fast? (2 x 2 = four combinations.)
     
     * </ul>
     *
     
     Sequential example
     * <p><b>Example.</b> Here is a class (not a very useful one, except
     * for illustration) that maintains an array in which the actual data
     * are held in even locations, and unrelated tag data are held in odd
     * locations. Its Spliterator ignores the tags.
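     For example: a class maintains an array in which the actual data sits at even indices and unrelated tag data sits at odd indices. Its Spliterator skips the tags.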
     
     *
     * <pre> {@code
     * class TaggedArray<T> {
     *   private final Object[] elements; // immutable after construction
     *   TaggedArray(T[] data, Object[] tags) {
     *     int size = data.length;
     *     if (tags.length != size) throw new IllegalArgumentException();
     *     this.elements = new Object[2 * size];
     *     for (int i = 0, j = 0; i < size; ++i) {
     *       elements[j++] = data[i];
     *       elements[j++] = tags[i];
     *     }
     *   }
     *
     *   public Spliterator<T> spliterator() {
     *     return new TaggedArraySpliterator<>(elements, 0, elements.length);
     *   }
     *
     *   static class TaggedArraySpliterator<T> implements Spliterator<T> {
     *     private final Object[] array;
     *     private int origin; // current index, advanced on split or traversal
     *     private final int fence; // one past the greatest index
     *
     *     TaggedArraySpliterator(Object[] array, int origin, int fence) {
     *       this.array = array; this.origin = origin; this.fence = fence;
     *     }
     *
     *     public void forEachRemaining(Consumer<? super T> action) {
     *       for (; origin < fence; origin += 2)
     *         action.accept((T) array[origin]);
     *     }
     *
     *     public boolean tryAdvance(Consumer<? super T> action) {
     *       if (origin < fence) {
     *         action.accept((T) array[origin]);
     *         origin += 2;
     *         return true;
     *       }
     *       else // cannot advance
     *         return false;
     *     }
     *
     *     public Spliterator<T> trySplit() {
     *       int lo = origin; // divide range in half
     *       int mid = ((lo + fence) >>> 1) & ~1; // force midpoint to be even
     *       if (lo < mid) { // split out left half
     *         origin = mid; // reset this Spliterator's origin
     *         return new TaggedArraySpliterator<>(array, lo, mid);
     *       }
     *       else       // too small to split
     *         return null;
     *     }
     *
     *     public long estimateSize() {
     *       return (long)((fence - origin) / 2);
     *     }
     *
     *     public int characteristics() {
     *       return ORDERED | SIZED | IMMUTABLE | SUBSIZED;
     *     }
     *   }
     * }}</pre>
     *
     
     Parallel example:
     * <p>As an example how a parallel computation framework, such as the
     * {@code java.util.stream} package, would use Spliterator in a parallel
     * computation, here is one way to implement an associated parallel forEach,
     * that illustrates the primary usage idiom of splitting off subtasks until
     * the estimated amount of work is small enough to perform
     * sequentially. Here we assume that the order of processing across
     * subtasks doesn't matter; different (forked) tasks may further split
     * and process elements concurrently in undetermined order.  This
     * example uses a {@link java.util.concurrent.CountedCompleter};
     * similar usages apply to other parallel task constructions.
     *
     * <pre>{@code
     * static <T> void parEach(TaggedArray<T> a, Consumer<T> action) {
     *   Spliterator<T> s = a.spliterator();
     *   long targetBatchSize = s.estimateSize() / (ForkJoinPool.getCommonPoolParallelism() * 8);
     *   new ParEach(null, s, action, targetBatchSize).invoke();
     * }
     *
     * static class ParEach<T> extends CountedCompleter<Void> {
     *   final Spliterator<T> spliterator;
     *   final Consumer<T> action;
     *   final long targetBatchSize;
     *
     *   ParEach(ParEach<T> parent, Spliterator<T> spliterator,
     *           Consumer<T> action, long targetBatchSize) {
     *     super(parent);
     *     this.spliterator = spliterator; this.action = action;
     *     this.targetBatchSize = targetBatchSize;
     *   }
     *
     *   public void compute() {
     *     Spliterator<T> sub;
     *     while (spliterator.estimateSize() > targetBatchSize &&
     *            (sub = spliterator.trySplit()) != null) {
     *       addToPendingCount(1);
     *       new ParEach<>(this, sub, action, targetBatchSize).fork();
     *     }
     *     spliterator.forEachRemaining(action);
     *     propagateCompletion();
     *   }
     * }}</pre>
     *
     * @implNote
     * If the boolean system property {@code org.openjdk.java.util.stream.tripwire}
     * is set to {@code true} then diagnostic warnings are reported if boxing of
     * primitive values occur when operating on primitive subtype specializations.
     *
     * @param <T> the type of elements returned by this Spliterator
     *
     * @see Collection
     * @since 1.8
     */
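
The late-binding and fail-fast behavior described in the Javadoc above can be observed on an `ArrayList`, which the Javadoc names as a late-binding, fail-fast source. This is only a sketch: `LateBindingDemo` and its two helper methods are hypothetical names, and fail-fast detection is specified as best-effort, so the second result reflects how the reference `ArrayList` spliterator behaves rather than a guarantee for every collection.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Spliterator;

public class LateBindingDemo {

    // Modifies the list BEFORE the first traversal and counts what is seen.
    static int countAfterEarlyAdd() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        Spliterator<String> sp = list.spliterator(); // created, but not bound yet
        list.add("c");                               // happens before binding
        int[] count = {0};
        sp.forEachRemaining(e -> count[0]++);        // binds here and sees all three
        return count[0];
    }

    // Modifies the list AFTER binding and reports whether traversal fails fast.
    static boolean failsAfterLateAdd() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        Spliterator<String> sp = list.spliterator();
        sp.tryAdvance(e -> { });                     // first traversal: binds to the source
        list.add("c");                               // structural interference after binding
        try {
            sp.forEachRemaining(e -> { });
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(countAfterEarlyAdd());  // 3
        System.out.println(failsAfterLateAdd());   // true
    }
}
```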
    
    • serial-thread-confinement

    Now let's look at the methods of the Spliterator interface

    1. tryAdvance(): tries to advance by one element and performs the action on it.
        /**
         * If a remaining element exists, performs the given action on it,
         * returning {@code true}; else returns {@code false}.  If this
         * Spliterator is {@link #ORDERED} the action is performed on the
         * next element in encounter order.  Exceptions thrown by the
         * action are relayed to the caller.
         *
         * @param action The action
         * @return {@code false} if no remaining elements existed
         * upon entry to this method, else {@code true}.
         * @throws NullPointerException if the specified action is null
         */
        boolean tryAdvance(Consumer<? super T> action);
    
    2. forEachRemaining(): the default implementation keeps calling tryAdvance() with the given Consumer until it returns false.
        /**
         * Performs the given action for each remaining element, sequentially in
         * the current thread, until all elements have been processed or the action
         * throws an exception.  If this Spliterator is {@link #ORDERED}, actions
         * are performed in encounter order.  Exceptions thrown by the action
         * are relayed to the caller.
         *
         * @implSpec
         * The default implementation repeatedly invokes {@link #tryAdvance} until
         * it returns {@code false}.  It should be overridden whenever possible.
         *
         * @param action The action
         * @throws NullPointerException if the specified action is null
         */
        default void forEachRemaining(Consumer<? super T> action) {
            do { } while (tryAdvance(action));
        }
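
A small sketch of how the two traversal methods combine: `tryAdvance()` consumes one element, `forEachRemaining()` consumes the rest, and a further `tryAdvance()` then returns false. `TraversalDemo` and `firstThenRest` are illustrative names only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class TraversalDemo {

    // Hypothetical helper: tags the first element and the remaining elements
    // differently, to show which method handled which element.
    static List<String> firstThenRest(List<String> source) {
        Spliterator<String> sp = source.spliterator();
        List<String> out = new ArrayList<>();

        sp.tryAdvance(e -> out.add("first:" + e));       // handles exactly one element
        sp.forEachRemaining(e -> out.add("rest:" + e));  // handles everything that is left

        // The spliterator is now exhausted, so tryAdvance() returns false
        // and does not invoke the action.
        boolean more = sp.tryAdvance(e -> out.add("unexpected:" + e));
        out.add("more=" + more);
        return out;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "hello world");
        System.out.println(firstThenRest(list));
        // [first:hello, rest:world, rest:hello world, more=false]
    }
}
```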
    
    3. trySplit(): tries to split the spliterator.
        /**
         * If this spliterator can be partitioned, returns a Spliterator
         * covering elements, that will, upon return from this method, not
         * be covered by this Spliterator.
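         If this spliterator can be partitioned, returns a new Spliterator covering some of the elements. After the call,
         those elements are no longer covered by this Spliterator, which keeps the rest. The returned Spliterator may itself be split again.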
         *
         * <p>If this Spliterator is {@link #ORDERED}, the returned Spliterator
         * must cover a strict prefix of the elements.
         If this Spliterator is {@link #ORDERED}, the returned Spliterator must cover a strict prefix of the elements (the leading part, in encounter order).
         
         *
         * <p>Unless this Spliterator covers an infinite number of elements,
         * repeated calls to {@code trySplit()} must eventually return {@code null}.
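         Unless this Spliterator covers an infinite number of elements,
         repeated calls to trySplit() must eventually return null.
         In other words, a finite source can only be split a finite number of times.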
         
         * Upon non-null return:
         * <ul>
         * <li>the value reported for {@code estimateSize()} before splitting,
         * must, after splitting, be greater than or equal to {@code estimateSize()}
         * for this and the returned Spliterator; and</li>
         * <li>if this Spliterator is {@code SUBSIZED}, then {@code estimateSize()}
         * for this spliterator before splitting must be equal to the sum of
         * {@code estimateSize()} for this and the returned Spliterator after
         * splitting.</li>
         * </ul>
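         On a non-null return:
         the estimateSize() reported before the split must be greater than or equal to the estimateSize() of this spliterator and of the returned one after the split.
         
         If this Spliterator is {@code SUBSIZED}, estimateSize() before the split must equal the sum of estimateSize() for this and the returned Spliterator after the split (for example, 8 = 4 + 4).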
         
         *
         * <p>This method may return {@code null} for any reason,
         * including emptiness, inability to split after traversal has
         * commenced, data structure constraints, and efficiency
         * considerations.
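         This method may return null for any reason, including:
         1. emptiness
         2. inability to split after traversal has begun
         3. data structure constraints
         4. efficiency considerations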
         
         *
         * @apiNote
         * An ideal {@code trySplit} method efficiently (without
         * traversal) divides its elements exactly in half, allowing
         * balanced parallel computation.  Many departures from this ideal
         * remain highly effective; for example, only approximately
         * splitting an approximately balanced tree, or for a tree in
         * which leaf nodes may contain either one or two elements,
         * failing to further split these nodes.  However, large
         * deviations in balance and/or overly inefficient {@code
         * trySplit} mechanics typically result in poor parallel
         * performance.
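         @apiNote
         An ideal trySplit() divides the elements exactly in half, efficiently and without traversal, which allows balanced parallel computation.
         Many departures from this ideal are still highly effective.
         For example: splitting an approximately balanced tree only approximately, or declining to split leaf nodes that hold just one or two elements.
         However, large imbalances and/or an overly inefficient trySplit() typically result in poor parallel performance.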
         
         *
         * @return a {@code Spliterator} covering some portion of the
         * elements, or {@code null} if this spliterator cannot be split
         Returns a Spliterator covering some portion of the elements, or null if this spliterator cannot be split.
         
         */
        Spliterator<T> trySplit();
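
    The rules above (strict prefix for ORDERED, sizes adding up for SUBSIZED) can be observed on an ordinary `ArrayList`. This is only a minimal sketch: the class name `TrySplitDemo` and the helper `drain` are made up for illustration, and the exact half-and-half split is how OpenJDK's `ArrayList` spliterator behaves, not something the interface promises.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class TrySplitDemo {
    // Hypothetical helper: collects every element the spliterator still covers.
    static List<Integer> drain(Spliterator<Integer> s) {
        List<Integer> out = new ArrayList<>();
        s.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        Spliterator<Integer> rest = source.spliterator();

        long before = rest.estimateSize();
        // trySplit() may return null in general, so the result is checked.
        Spliterator<Integer> prefix = rest.trySplit();

        if (prefix != null) {
            // SUBSIZED: the size before the split equals the sum of both parts.
            System.out.println(before + " = " + prefix.estimateSize() + " + " + rest.estimateSize());
            // ORDERED: the returned spliterator covers a strict prefix of the elements.
            System.out.println(drain(prefix) + " then " + drain(rest));
        }
    }
}
```

    On OpenJDK this should print `8 = 4 + 4` followed by `[1, 2, 3, 4] then [5, 6, 7, 8]`; other sources may split unevenly or refuse to split at all.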
    
    1. estimateSize(): estimates the number of remaining elements.
        /**
         * Returns an estimate of the number of elements that would be
         * encountered by a {@link #forEachRemaining} traversal, or returns {@link
         * Long#MAX_VALUE} if infinite, unknown, or too expensive to compute.
         Returns an estimate of the number of elements that a forEachRemaining traversal would encounter.
         If the count is infinite, unknown, or too expensive to compute, Long.MAX_VALUE is returned.
         
         *
         * <p>If this Spliterator is {@link #SIZED} and has not yet been partially
         * traversed or split, or this Spliterator is {@link #SUBSIZED} and has
         * not yet been partially traversed, this estimate must be an accurate
         * count of elements that would be encountered by a complete traversal.
         * Otherwise, this estimate may be arbitrarily inaccurate, but must decrease
         * as specified across invocations of {@link #trySplit}.
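         If the Spliterator is SIZED and has not yet been partially traversed or split, or is SUBSIZED and has not yet been partially traversed, the estimate must be an exact count.
         Otherwise the estimate may be arbitrarily inaccurate, but it must still decrease across calls to trySplit() as specified there.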
         
         *
         * @apiNote
         * Even an inexact estimate is often useful and inexpensive to compute.
         * For example, a sub-spliterator of an approximately balanced binary tree
         * may return a value that estimates the number of elements to be half of
         * that of its parent; if the root Spliterator does not maintain an
         * accurate count, it could estimate size to be the power of two
         * corresponding to its maximum depth.
         Even an inexact estimate is often useful and cheap to compute.
         
         *
         * @return the estimated size, or {@code Long.MAX_VALUE} if infinite,
         *         unknown, or too expensive to compute.
         */
        long estimateSize();
    
    1. getExactSizeIfKnown(): returns the exact size if it is known.
        /**
         * Convenience method that returns {@link #estimateSize()} if this
         * Spliterator is {@link #SIZED}, else {@code -1}.
         If the Spliterator is SIZED, this returns estimateSize(), which is then the exact size; otherwise it returns -1.
         
         * @implSpec
         * The default implementation returns the result of {@code estimateSize()}
         * if the Spliterator reports a characteristic of {@code SIZED}, and
         * {@code -1} otherwise.
         *
         * @return the exact size, if known, else {@code -1}.
         */
        default long getExactSizeIfKnown() {
            return (characteristics() & SIZED) == 0 ? -1L : estimateSize();
        }
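
    To make the difference between the estimate and the exact size concrete, here is a small sketch comparing a list's spliterator with the spliterator of a filtered stream. The class name `ExactSizeDemo` is invented for this example; the point is that `filter` cannot know how many elements survive, so the pipeline stops reporting SIZED.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class ExactSizeDemo {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);

        // A spliterator taken directly from the list reports SIZED.
        Spliterator<Integer> sized = list.spliterator();
        // After filter() the number of surviving elements is unknown up front.
        Spliterator<Integer> filtered = list.stream().filter(x -> x > 0).spliterator();

        System.out.println(sized.getExactSizeIfKnown());                  // 3
        System.out.println(filtered.hasCharacteristics(Spliterator.SIZED)); // false
        System.out.println(filtered.getExactSizeIfKnown());               // -1
    }
}
```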
    
    1. characteristics(): the characteristic flags.
    /**
         * Returns a set of characteristics of this Spliterator and its
         * elements. The result is represented as ORed values from {@link
         * #ORDERED}, {@link #DISTINCT}, {@link #SORTED}, {@link #SIZED},
         * {@link #NONNULL}, {@link #IMMUTABLE}, {@link #CONCURRENT},
         * {@link #SUBSIZED}.  Repeated calls to {@code characteristics()} on
         * a given spliterator, prior to or in-between calls to {@code trySplit},
         * should always return the same result.
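         Returns the set of characteristics of this Spliterator, encoded as values ORed together:
         {@link #ORDERED},
         {@link #DISTINCT},
         {@link #SORTED},
         {@link #SIZED},
         {@link #NONNULL},
         {@link #IMMUTABLE},
         {@link #CONCURRENT},
         {@link #SUBSIZED}
         These eight constants are defined further down.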
         
         *
         * <p>If a Spliterator reports an inconsistent set of
         * characteristics (either those returned from a single invocation
         * or across multiple invocations), no guarantees can be made
         * about any computation using this Spliterator.
         *
         * @apiNote The characteristics of a given spliterator before splitting
         * may differ from the characteristics after splitting.  For specific
         * examples see the characteristic values {@link #SIZED}, {@link #SUBSIZED}
         * and {@link #CONCURRENT}.
         For concrete examples, see the descriptions of SIZED, SUBSIZED and CONCURRENT below.
         
         *
         * @return a representation of characteristics
         */
        int characteristics();
    
    1. hasCharacteristics(int characteristics): checks whether all of the given characteristics are present.
    /**
         * Returns {@code true} if this Spliterator's {@link
         * #characteristics} contain all of the given characteristics.
         *
         * @implSpec
         * The default implementation returns true if the corresponding bits
         * of the given characteristics are set.
         *
         * @param characteristics the characteristics to check for
         * @return {@code true} if all the specified characteristics are present,
         * else {@code false}
         */
        default boolean hasCharacteristics(int characteristics) {
            return (characteristics() & characteristics) == characteristics;
        }
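
    Since characteristics() is just a bit mask, it helps to print it for a few familiar collections. The sketch below is illustrative only: `CharacteristicsDemo` and `describe` are names made up here, and the flags shown are what OpenJDK's collections report, which other implementations are free to change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

public class CharacteristicsDemo {
    private static final String[] NAMES = {
        "ORDERED", "DISTINCT", "SORTED", "SIZED",
        "NONNULL", "IMMUTABLE", "CONCURRENT", "SUBSIZED"
    };
    private static final int[] FLAGS = {
        Spliterator.ORDERED, Spliterator.DISTINCT, Spliterator.SORTED, Spliterator.SIZED,
        Spliterator.NONNULL, Spliterator.IMMUTABLE, Spliterator.CONCURRENT, Spliterator.SUBSIZED
    };

    // Hypothetical helper: turns the bit mask into a readable list of flag names.
    static String describe(Spliterator<?> s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < FLAGS.length; i++) {
            if (s.hasCharacteristics(FLAGS[i])) {
                if (sb.length() > 0) {
                    sb.append(" | ");
                }
                sb.append(NAMES[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, 1, 2);
        System.out.println("ArrayList: " + describe(new ArrayList<>(data).spliterator()));
        System.out.println("HashSet:   " + describe(new HashSet<>(data).spliterator()));
        System.out.println("TreeSet:   " + describe(new TreeSet<>(data).spliterator()));
    }
}
```

    Note that hasCharacteristics() requires all the bits passed in to be set, so several flags can be checked in one call by ORing them together.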
    
    1. getComparator(): returns the comparator of a SORTED source; the default implementation always throws IllegalStateException.
    /**
         * If this Spliterator's source is {@link #SORTED} by a {@link Comparator},
         * returns that {@code Comparator}. If the source is {@code SORTED} in
         * {@linkplain Comparable natural order}, returns {@code null}.  Otherwise,
         * if the source is not {@code SORTED}, throws {@link IllegalStateException}.
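         If the source is SORTED by a Comparator, that Comparator is returned.
         If the source is sorted in natural order, null is returned (no comparator is needed).
         Otherwise, if the source is not SORTED, an IllegalStateException is thrown.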
         *
         * @implSpec
         * The default implementation always throws {@link IllegalStateException}.
         *
         * @return a Comparator, or {@code null} if the elements are sorted in the
         * natural order.
         * @throws IllegalStateException if the spliterator does not report
         *         a characteristic of {@code SORTED}.
         */
        default Comparator<? super T> getComparator() {
            throw new IllegalStateException();
        }
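
    The three outcomes of getComparator() can be tried out with a `TreeSet` and an `ArrayList`. This is a sketch under the assumption of OpenJDK's collection classes; `GetComparatorDemo` and the `byLength` comparator are names chosen just for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

public class GetComparatorDemo {
    public static void main(String[] args) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);

        TreeSet<String> natural = new TreeSet<>(Arrays.asList("b", "a"));
        TreeSet<String> custom = new TreeSet<>(byLength);
        custom.addAll(Arrays.asList("ccc", "a", "bb"));

        // Sorted in natural order: no comparator is involved, so null comes back.
        System.out.println(natural.spliterator().getComparator() == null);     // true
        // Sorted by an explicit comparator: that same comparator is returned.
        System.out.println(custom.spliterator().getComparator() == byLength);  // true

        try {
            // An ArrayList spliterator is not SORTED and inherits the default method.
            new ArrayList<>(natural).spliterator().getComparator();
        } catch (IllegalStateException e) {
            System.out.println("not SORTED: IllegalStateException");
        }
    }
}
```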
    
    1. The eight characteristic values
    ORDERED
    DISTINCT
    SORTED
    SIZED
    NONNULL
    IMMUTABLE
    CONCURRENT
    SUBSIZED
       // These flags matter most in parallel execution, where they help decide how an operation is carried out.
        
    

    Now let's look at the eight characteristics defined in Spliterator.

    /**
         * Characteristic value signifying that an encounter order is defined for
         * elements. If so, this Spliterator guarantees that method
         * {@link #trySplit} splits a strict prefix of elements, that method
         * {@link #tryAdvance} steps by one element in prefix order, and that
         * {@link #forEachRemaining} performs actions in encounter order.
         *
         * <p>A {@link Collection} has an encounter order if the corresponding
         * {@link Collection#iterator} documents an order. If so, the encounter
         * order is the same as the documented order. Otherwise, a collection does
         * not have an encounter order.
         *
         * @apiNote Encounter order is guaranteed to be ascending index order for
         * any {@link List}. But no order is guaranteed for hash-based collections
         * such as {@link HashSet}. Clients of a Spliterator that reports
         * {@code ORDERED} are expected to preserve ordering constraints in
         * non-commutative parallel computations.
         */
        public static final int ORDERED    = 0x00000010;
    
        /**
         * Characteristic value signifying that, for each pair of
         * encountered elements {@code x, y}, {@code !x.equals(y)}. This
         * applies for example, to a Spliterator based on a {@link Set}.
         */
        public static final int DISTINCT   = 0x00000001;
    
        /**
         * Characteristic value signifying that encounter order follows a defined
         * sort order. If so, method {@link #getComparator()} returns the associated
         * Comparator, or {@code null} if all elements are {@link Comparable} and
         * are sorted by their natural ordering.
         *
         * <p>A Spliterator that reports {@code SORTED} must also report
         * {@code ORDERED}.
         *
         * @apiNote The spliterators for {@code Collection} classes in the JDK that
         * implement {@link NavigableSet} or {@link SortedSet} report {@code SORTED}.
         */
        public static final int SORTED     = 0x00000004;
    
        /**
         * Characteristic value signifying that the value returned from
         * {@code estimateSize()} prior to traversal or splitting represents a
         * finite size that, in the absence of structural source modification,
         * represents an exact count of the number of elements that would be
         * encountered by a complete traversal.
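         The value returned by estimateSize() before traversal or splitting is a finite size.
         As long as the source is not structurally modified, it is the exact number of elements a complete traversal would encounter.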
         *
         * @apiNote Most Spliterators for Collections, that cover all elements of a
         * {@code Collection} report this characteristic. Sub-spliterators, such as
         * those for {@link HashSet}, that cover a sub-set of elements and
         * approximate their reported size do not.
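         Most spliterators that cover all elements of a Collection report this characteristic. Sub-spliterators that cover only part of the elements and approximate their size (such as those of HashSet) do not.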
         */
    
        public static final int SIZED      = 0x00000040;
    
        /**
         * Characteristic value signifying that the source guarantees that
         * encountered elements will not be {@code null}. (This applies,
         * for example, to most concurrent collections, queues, and maps.)
         */
        public static final int NONNULL    = 0x00000100;
    
        /**
         * Characteristic value signifying that the element source cannot be
         * structurally modified; that is, elements cannot be added, replaced, or
         * removed, so such changes cannot occur during traversal. A Spliterator
         * that does not report {@code IMMUTABLE} or {@code CONCURRENT} is expected
         * to have a documented policy (for example throwing
         * {@link ConcurrentModificationException}) concerning structural
         * interference detected during traversal.
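         The element source cannot be structurally modified: elements cannot be added, replaced, or removed, so such changes cannot occur during traversal.
         
         A Spliterator that reports neither IMMUTABLE nor CONCURRENT is expected to have a documented policy (for example, throwing ConcurrentModificationException) for structural interference detected during traversal.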
         */
        public static final int IMMUTABLE  = 0x00000400;
    
        /**
         * Characteristic value signifying that the element source may be safely
         * concurrently modified (allowing additions, replacements, and/or removals)
         * by multiple threads without external synchronization. If so, the
         * Spliterator is expected to have a documented policy concerning the impact
         * of modifications during traversal.
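         The element source may be safely modified concurrently (additions, replacements, and/or removals) by multiple threads,
         without external synchronization. The Spliterator is then expected to document how modifications during traversal affect it.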
         
         *
         * <p>A top-level Spliterator should not report both {@code CONCURRENT} and
         * {@code SIZED}, since the finite size, if known, may change if the source
         * is concurrently modified during traversal. Such a Spliterator is
         * inconsistent and no guarantees can be made about any computation using
         * that Spliterator. Sub-spliterators may report {@code SIZED} if the
         * sub-split size is known and additions or removals to the source are not
         * reflected when traversing.
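         A top-level Spliterator should not report both {@code CONCURRENT} and {@code SIZED}.
         The two contradict each other: a finite size, if known, may change when the source is modified concurrently during traversal.
         Such a Spliterator is inconsistent, and no guarantees can be made about any computation using it.
         Sub-spliterators may still report SIZED if the sub-split size is known and additions or removals to the source are not reflected when traversing.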
         
         
         *
         * @apiNote Most concurrent collections maintain a consistency policy
         * guaranteeing accuracy with respect to elements present at the point of
         * Spliterator construction, but possibly not reflecting subsequent
         * additions or removals.
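         Most concurrent collections maintain a consistency policy:
         accuracy is guaranteed with respect to the elements present when the Spliterator was constructed, but later additions or removals may not be reflected.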
         */
        public static final int CONCURRENT = 0x00001000;
    
        /**
         * Characteristic value signifying that all Spliterators resulting from
         * {@code trySplit()} will be both {@link #SIZED} and {@link #SUBSIZED}.
         * (This means that all child Spliterators, whether direct or indirect, will
         * be {@code SIZED}.)
         
         *
         * <p>A Spliterator that does not report {@code SIZED} as required by
         * {@code SUBSIZED} is inconsistent and no guarantees can be made about any
         * computation using that Spliterator.
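         A Spliterator that reports SUBSIZED without also reporting SIZED is inconsistent, and no guarantees can be made about any computation using it.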
         *
         * @apiNote Some spliterators, such as the top-level spliterator for an
         * approximately balanced binary tree, will report {@code SIZED} but not
         * {@code SUBSIZED}, since it is common to know the size of the entire tree
         * but not the exact sizes of subtrees.
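         Some spliterators, such as the top-level one for an approximately balanced binary tree, report SIZED but not SUBSIZED: the size of the whole tree is usually known, but the exact sizes of its subtrees are not.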
         */
        public static final int SUBSIZED = 0x00004000;
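
    The difference between SIZED and SUBSIZED shows up once a spliterator has been split. The following sketch compares an `ArrayList` (SUBSIZED) with a `HashSet` (SIZED only at the top level). `SubsizedDemo` and `sized` are illustrative names, and the `HashSet` result reflects OpenJDK's implementation as described in the Javadoc above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;

public class SubsizedDemo {
    // Hypothetical helper: true if the spliterator currently reports SIZED.
    static boolean sized(Spliterator<?> s) {
        return s != null && s.hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);

        // ArrayList is SUBSIZED: both halves still know their exact size.
        Spliterator<Integer> listRest = new ArrayList<>(data).spliterator();
        Spliterator<Integer> listPrefix = listRest.trySplit();
        System.out.println("ArrayList halves SIZED: " + sized(listPrefix) + ", " + sized(listRest));

        // HashSet is SIZED only before splitting: the halves cover hash buckets,
        // so their element counts are just estimates afterwards.
        Spliterator<Integer> setRest = new HashSet<>(data).spliterator();
        System.out.println("HashSet SIZED before split: " + sized(setRest));
        Spliterator<Integer> setPrefix = setRest.trySplit();
        System.out.println("HashSet halves SIZED: " + sized(setPrefix) + ", " + sized(setRest));
    }
}
```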
    

    That covers everything in the Spliterator interface itself.

    What can a Spliterator do? Exactly what its eight methods provide.

    OfPrimitive

    A spliterator specialized for primitive values (int, long, double).

    /**
         * A Spliterator specialized for primitive values.
         *
         * @param <T> the type of elements returned by this Spliterator.  The
         * type must be a wrapper type for a primitive type, such as {@code Integer}
         * for the primitive {@code int} type.
         * @param <T_CONS> the type of primitive consumer.  The type must be a
         * primitive specialization of {@link java.util.function.Consumer} for
         * {@code T}, such as {@link java.util.function.IntConsumer} for
         * {@code Integer}.
         * @param <T_SPLITR> the type of primitive Spliterator.  The type must be
         * a primitive specialization of Spliterator for {@code T}, such as
         * {@link Spliterator.OfInt} for {@code Integer}.
         *
         * @see Spliterator.OfInt
         * @see Spliterator.OfLong
         * @see Spliterator.OfDouble
         * @since 1.8
         */
    public interface OfPrimitive<T, T_CONS, T_SPLITR extends Spliterator.OfPrimitive<T, T_CONS, T_SPLITR>>
                extends Spliterator<T> {
            @Override
            T_SPLITR trySplit();
    
            /**
             * If a remaining element exists, performs the given action on it,
             * returning {@code true}; else returns {@code false}.  If this
             * Spliterator is {@link #ORDERED} the action is performed on the
             * next element in encounter order.  Exceptions thrown by the
             * action are relayed to the caller.
             *
             * @param action The action
             * @return {@code false} if no remaining elements existed
             * upon entry to this method, else {@code true}.
             * @throws NullPointerException if the specified action is null
             */
            @SuppressWarnings("overloads")
            boolean tryAdvance(T_CONS action);
    
            /**
             * Performs the given action for each remaining element, sequentially in
             * the current thread, until all elements have been processed or the
             * action throws an exception.  If this Spliterator is {@link #ORDERED},
             * actions are performed in encounter order.  Exceptions thrown by the
             * action are relayed to the caller.
             *
             * @implSpec
             * The default implementation repeatedly invokes {@link #tryAdvance}
             * until it returns {@code false}.  It should be overridden whenever
             * possible.
             *
             * @param action The action
             * @throws NullPointerException if the specified action is null
             */
            @SuppressWarnings("overloads")
            default void forEachRemaining(T_CONS action) {
                do { } while (tryAdvance(action));
            }
        }
    

    Three specialized versions are provided, each extending the OfPrimitive interface.

    1. OfInt
    2. OfLong
    3. OfDouble
    OfInt
    public interface OfInt extends OfPrimitive<Integer, IntConsumer, OfInt> {
    
            @Override
            OfInt trySplit();
    
            @Override
            boolean tryAdvance(IntConsumer action);
    
            @Override
            default void forEachRemaining(IntConsumer action) {
                do { } while (tryAdvance(action));
            }
    
            /**
             * {@inheritDoc}
             * @implSpec
             * If the action is an instance of {@code IntConsumer} then it is cast
             * to {@code IntConsumer} and passed to
             * {@link #tryAdvance(java.util.function.IntConsumer)}; otherwise
             * the action is adapted to an instance of {@code IntConsumer}, by
             * boxing the argument of {@code IntConsumer}, and then passed to
             * {@link #tryAdvance(java.util.function.IntConsumer)}.
             */
            @Override
            default boolean tryAdvance(Consumer<? super Integer> action) {
                if (action instanceof IntConsumer) {
                    return tryAdvance((IntConsumer) action);
                }
                else {
                    if (Tripwire.ENABLED)
                        Tripwire.trip(getClass(),
                                      "{0} calling Spliterator.OfInt.tryAdvance((IntConsumer) action::accept)");
                    return tryAdvance((IntConsumer) action::accept);
                }
            }
    
            /**
             * {@inheritDoc}
             * @implSpec
             * If the action is an instance of {@code IntConsumer} then it is cast
             * to {@code IntConsumer} and passed to
             * {@link #forEachRemaining(java.util.function.IntConsumer)}; otherwise
             * the action is adapted to an instance of {@code IntConsumer}, by
             * boxing the argument of {@code IntConsumer}, and then passed to
             * {@link #forEachRemaining(java.util.function.IntConsumer)}.
             */
            @Override
            default void forEachRemaining(Consumer<? super Integer> action) {
                if (action instanceof IntConsumer) {
                    forEachRemaining((IntConsumer) action);
                }
                else {
                    if (Tripwire.ENABLED)
                        Tripwire.trip(getClass(),
                                      "{0} calling Spliterator.OfInt.forEachRemaining((IntConsumer) action::accept)");
                    forEachRemaining((IntConsumer) action::accept);
                }
            }
        }
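
    A short usage sketch of `Spliterator.OfInt`, obtained here from `Arrays.spliterator(int[])`. The class name `OfIntDemo` and the helper `sum` are invented for the example. The consumers are stored in `IntConsumer` variables on purpose: an untyped lambda passed directly would match both the `IntConsumer` and the `Consumer` overload, which is the situation the `@SuppressWarnings("overloads")` annotations hint at.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.function.IntConsumer;

public class OfIntDemo {
    // Hypothetical helper: sums the remaining elements without boxing them.
    static int sum(Spliterator.OfInt sp) {
        int[] total = {0};
        IntConsumer add = v -> total[0] += v;
        sp.forEachRemaining(add); // resolves to forEachRemaining(IntConsumer)
        return total[0];
    }

    public static void main(String[] args) {
        Spliterator.OfInt sp = Arrays.spliterator(new int[] {1, 2, 3, 4});

        // Consume a single element through the primitive overload.
        IntConsumer print = v -> System.out.println("first: " + v);
        sp.tryAdvance(print);

        // The remaining elements are 2, 3 and 4.
        System.out.println("sum of the rest: " + sum(sp));
    }
}
```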
    

    Question: Consumer and IntConsumer have no inheritance relationship at all, so how can one be cast to the other?

     default boolean tryAdvance(Consumer<? super Integer> action) {
                if (action instanceof IntConsumer) {
                    return tryAdvance((IntConsumer) action);
                }
    

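    From a purely object-oriented point of view this looks impossible at first, because the two interfaces are unrelated.

    With functional programming in the picture, however, it can happen.

    The reasons are:

    1. Java has automatic boxing and unboxing (int <-> Integer).
    2. A cast normally relies on an inheritance relationship, but both types here are interfaces: the cast compiles and is checked at run time, and it succeeds when the object happens to implement both. Where such objects come from is tied to lambdas and functional programming.
    3. Everything about a lambda is inferred from its context. (The very same lambda expression can be given different types in different contexts, which is normal in functional programming.)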

    Let's explain this with code.
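
    The following is a minimal sketch of my own, not the code from the original article. `ConsumerCastDemo` and `DualConsumer` are hypothetical names. It shows the same lambda text typed as two different interfaces, and an object that implements both `Consumer<Integer>` and `IntConsumer`, which is the case where the `instanceof` check inside `OfInt` succeeds.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.IntConsumer;

public class ConsumerCastDemo {
    // Hypothetical class implementing both unrelated interfaces at once.
    static final class DualConsumer implements Consumer<Integer>, IntConsumer {
        int sum;

        @Override
        public void accept(int value) {
            sum += value;
        }

        @Override
        public void accept(Integer value) {
            accept(value.intValue()); // unbox and delegate to the primitive version
        }
    }

    public static void main(String[] args) {
        // The same lambda text gets a different type depending on the context.
        Consumer<Integer> boxed = v -> System.out.println("value " + v);
        IntConsumer primitive = v -> System.out.println("value " + v);
        primitive.accept(1);

        // A lambda typed as Consumer is not an IntConsumer at run time.
        System.out.println(boxed instanceof IntConsumer); // false

        // An object implementing both interfaces passes the instanceof check.
        DualConsumer dual = new DualConsumer();
        Consumer<Integer> asConsumer = dual;
        System.out.println(asConsumer instanceof IntConsumer); // true

        // Static type Consumer selects the Consumer overload, which then casts
        // the action to IntConsumer and avoids boxing.
        Spliterator.OfInt sp = Arrays.spliterator(new int[] {1, 2, 3});
        sp.forEachRemaining(asConsumer);
        System.out.println(dual.sum); // 6

        // Otherwise OfInt adapts the action with a method reference, boxing each int.
        IntConsumer adapted = boxed::accept;
        adapted.accept(2);
    }
}
```

    So the cast in `tryAdvance(Consumer)` is an optimization for objects that are both kinds of consumer; every other `Consumer` goes through the `action::accept` adapter and pays for boxing.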

    
    
  • Original article: https://www.cnblogs.com/bigbaby/p/12159495.html