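  • Java 8 Collections Framework: LinkedHashMap Source Code Analysis

    This article is organized as follows:

    I. The LinkedHashMap Javadoc and a brief overview

      Start with the LinkedHashMap Javadoc, which is impressively thorough: it lists all the main characteristics of the class. Reading it first and then checking it against the source code gets you most of the way to understanding LinkedHashMap, and it also shows where the many summaries found elsewhere come from. The key points from the Javadoc are: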

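    • LinkedHashMap is a hash table and linked list implementation of the Map interface. It maintains a doubly linked list running through all of its entries, so the iteration order is predictable; by default it iterates in insertion order. Re-inserting an existing key does not change the order (re-insertion means calling m.put(k, v) when m.containsKey(k) would return true immediately beforehand). This is useful whenever a Map with a stable order is needed, and it avoids the extra cost of TreeMap.
    • LinkedHashMap also provides a constructor that makes iteration follow access order, from least-recently accessed to most-recently accessed (access-order). This suits LRU (least-recently used) caches: extend LinkedHashMap and override removeEldestEntry(Map.Entry) to decide when the eldest entry is removed.
    • LinkedHashMap extends HashMap. The basic operations (add, contains and remove) can be treated as O(1), assuming the hash function disperses elements properly. Because the doubly linked list must be maintained, performance is likely to be slightly below that of HashMap, with one exception: iterating over a LinkedHashMap takes time proportional to its size (it walks the doubly linked list), whereas iterating over a HashMap takes time proportional to its capacity and is therefore likely to be more expensive.
    • Like HashMap, it is not thread-safe; concurrent use requires external synchronization, for example via Collections.synchronizedMap.
    • Its iterators are fail-fast, but ConcurrentModificationException is thrown only on a best-effort basis and is not guaranteed for every concurrent modification. The behavior should therefore be used only to detect bugs: an exception means something is wrong, but the absence of one does not prove that everything is fine.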
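
The re-insertion rule in the first point is easy to check by hand. The following is a minimal sketch; the class name InsertionOrderDemo and its method are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class InsertionOrderDemo {
    // Returns the keys in iteration order after re-putting an existing key.
    static List<String> keysAfterReput() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        m.put("a", 100); // "a" already exists: the value changes, the position does not
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterReput()); // prints [a, b, c]
    }
}
```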

      As this shows, LinkedHashMap has two main uses:

    • A HashMap with a predictable iteration order
    • An LRU cache

    The LinkedHashMap Javadoc:

    /**
     * <p>Hash table and linked list implementation of the <tt>Map</tt> interface,
     * with predictable iteration order.  This implementation differs from
     * <tt>HashMap</tt> in that it maintains a doubly-linked list running through
     * all of its entries.  This linked list defines the iteration ordering,
     * which is normally the order in which keys were inserted into the map
     * (<i>insertion-order</i>).  Note that insertion order is not affected
     * if a key is <i>re-inserted</i> into the map.  (A key <tt>k</tt> is
     * reinserted into a map <tt>m</tt> if <tt>m.put(k, v)</tt> is invoked when
     * <tt>m.containsKey(k)</tt> would return <tt>true</tt> immediately prior to
     * the invocation.)
     *
     * <p>This implementation spares its clients from the unspecified, generally
     * chaotic ordering provided by {@link HashMap} (and {@link Hashtable}),
     * without incurring the increased cost associated with {@link TreeMap}.  It
     * can be used to produce a copy of a map that has the same order as the
     * original, regardless of the original map's implementation:
     * <pre>
     *     void foo(Map m) {
     *         Map copy = new LinkedHashMap(m);
     *         ...
     *     }
     * </pre>
     * This technique is particularly useful if a module takes a map on input,
     * copies it, and later returns results whose order is determined by that of
     * the copy.  (Clients generally appreciate having things returned in the same
     * order they were presented.)
     *
     * <p>A special {@link #LinkedHashMap(int,float,boolean) constructor} is
     * provided to create a linked hash map whose order of iteration is the order
     * in which its entries were last accessed, from least-recently accessed to
     * most-recently (<i>access-order</i>).  This kind of map is well-suited to
     * building LRU caches.  Invoking the {@code put}, {@code putIfAbsent},
     * {@code get}, {@code getOrDefault}, {@code compute}, {@code computeIfAbsent},
     * {@code computeIfPresent}, or {@code merge} methods results
     * in an access to the corresponding entry (assuming it exists after the
     * invocation completes). The {@code replace} methods only result in an access
     * of the entry if the value is replaced.  The {@code putAll} method generates one
     * entry access for each mapping in the specified map, in the order that
     * key-value mappings are provided by the specified map's entry set iterator.
     * <i>No other methods generate entry accesses.</i>  In particular, operations
     * on collection-views do <i>not</i> affect the order of iteration of the
     * backing map.
     *
     * <p>The {@link #removeEldestEntry(Map.Entry)} method may be overridden to
     * impose a policy for removing stale mappings automatically when new mappings
     * are added to the map.
     *
     * <p>This class provides all of the optional <tt>Map</tt> operations, and
     * permits null elements.  Like <tt>HashMap</tt>, it provides constant-time
     * performance for the basic operations (<tt>add</tt>, <tt>contains</tt> and
     * <tt>remove</tt>), assuming the hash function disperses elements
     * properly among the buckets.  Performance is likely to be just slightly
     * below that of <tt>HashMap</tt>, due to the added expense of maintaining the
     * linked list, with one exception: Iteration over the collection-views
     * of a <tt>LinkedHashMap</tt> requires time proportional to the <i>size</i>
     * of the map, regardless of its capacity.  Iteration over a <tt>HashMap</tt>
     * is likely to be more expensive, requiring time proportional to its
     * <i>capacity</i>.
     *
     * <p>A linked hash map has two parameters that affect its performance:
     * <i>initial capacity</i> and <i>load factor</i>.  They are defined precisely
     * as for <tt>HashMap</tt>.  Note, however, that the penalty for choosing an
     * excessively high value for initial capacity is less severe for this class
     * than for <tt>HashMap</tt>, as iteration times for this class are unaffected
     * by capacity.
     *
     * <p><strong>Note that this implementation is not synchronized.</strong>
     * If multiple threads access a linked hash map concurrently, and at least
     * one of the threads modifies the map structurally, it <em>must</em> be
     * synchronized externally.  This is typically accomplished by
     * synchronizing on some object that naturally encapsulates the map.
     *
     * If no such object exists, the map should be "wrapped" using the
     * {@link Collections#synchronizedMap Collections.synchronizedMap}
     * method.  This is best done at creation time, to prevent accidental
     * unsynchronized access to the map:<pre>
     *   Map m = Collections.synchronizedMap(new LinkedHashMap(...));</pre>
     *
     * A structural modification is any operation that adds or deletes one or more
     * mappings or, in the case of access-ordered linked hash maps, affects
     * iteration order.  In insertion-ordered linked hash maps, merely changing
     * the value associated with a key that is already contained in the map is not
     * a structural modification.  <strong>In access-ordered linked hash maps,
     * merely querying the map with <tt>get</tt> is a structural modification.
     * </strong>)
     *
     * <p>The iterators returned by the <tt>iterator</tt> method of the collections
     * returned by all of this class's collection view methods are
     * <em>fail-fast</em>: if the map is structurally modified at any time after
     * the iterator is created, in any way except through the iterator's own
     * <tt>remove</tt> method, the iterator will throw a {@link
     * ConcurrentModificationException}.  Thus, in the face of concurrent
     * modification, the iterator fails quickly and cleanly, rather than risking
     * arbitrary, non-deterministic behavior at an undetermined time in the future.
     *
     * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
     * as it is, generally speaking, impossible to make any hard guarantees in the
     * presence of unsynchronized concurrent modification.  Fail-fast iterators
     * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
     * Therefore, it would be wrong to write a program that depended on this
     * exception for its correctness:   <i>the fail-fast behavior of iterators
     * should be used only to detect bugs.</i>
     *
     * <p>The spliterators returned by the spliterator method of the collections
     * returned by all of this class's collection view methods are
     * <em><a href="Spliterator.html#binding">late-binding</a></em>,
     * <em>fail-fast</em>, and additionally report {@link Spliterator#ORDERED}.
     *
     * <p>This class is a member of the
     * <a href="{@docRoot}/../technotes/guides/collections/index.html">
     * Java Collections Framework</a>.
     *
     * @implNote
     * The spliterators returned by the spliterator method of the collections
     * returned by all of this class's collection view methods are created from
     * the iterators of the corresponding collections.
     *
     * @param <K> the type of keys maintained by this map
     * @param <V> the type of mapped values
     *
     * @author  Josh Bloch
     * @see     Object#hashCode()
     * @see     Collection
     * @see     Map
     * @see     HashMap
     * @see     TreeMap
     * @see     Hashtable
     * @since   1.4
     */
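
The synchronization advice in the Javadoc can be sketched as follows. This is only an illustration with a hypothetical class name; the explicit synchronized block around the iteration follows the documented contract of Collections.synchronizedMap, which requires manual locking when traversing a collection view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class SynchronizedWrapDemo {
    static List<String> snapshotKeys() {
        // Wrap at creation time so no unsynchronized reference to the map escapes.
        Map<String, Integer> m =
            Collections.synchronizedMap(new LinkedHashMap<String, Integer>());
        m.put("x", 1);
        m.put("y", 2);

        List<String> keys = new ArrayList<>();
        // Single calls are synchronized by the wrapper, but iteration over a
        // view has to be guarded manually on the wrapper object.
        synchronized (m) {
            for (String k : m.keySet()) {
                keys.add(k);
            }
        }
        return keys;
    }
}
```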

      

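    II. LinkedHashMap internals: additional fields and constructors

      LinkedHashMap extends HashMap. This section focuses on what LinkedHashMap adds in terms of fields and constructors.

    1. Additional fields and the inner class

      This gives a first look at the internal changes: the class records a head node and a tail node, and each node gains before and after references. All of these are needed to maintain the doubly linked list. The other addition is accessOrder, which selects access order when set to true (the default, false, is insertion order).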

        /**
         * HashMap.Node subclass for normal LinkedHashMap entries.
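         * The node class used inside LinkedHashMap. It adds before and after references to maintain the doubly-linked list.
         * It extends HashMap.Node, so every node is still a HashMap.Node and works with the inherited HashMap code.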
         */
        static class Entry<K,V> extends HashMap.Node<K,V> {
            Entry<K,V> before, after;
            Entry(int hash, K key, V value, Node<K,V> next) {
                super(hash, key, value, next);
            }
        }
    
        /**
         * The head (eldest) of the doubly linked list.
         * Head node (earliest inserted, or least recently accessed in access-order mode).
         */
        transient LinkedHashMap.Entry<K,V> head;
    
        /**
         * The tail (youngest) of the doubly linked list.
         * Tail node (latest inserted, or most recently accessed in access-order mode).
         */
        transient LinkedHashMap.Entry<K,V> tail;
    
        /**
         * The iteration ordering method for this linked hash map: <tt>true</tt>
         * for access-order, <tt>false</tt> for insertion-order.
         * Controls the iteration order.
         * true: access order
         * false: insertion order (the default)
         * @serial
         */
        final boolean accessOrder;

      

    2. Constructors

      The main difference from the HashMap constructors is the accessOrder setting.

        /**
         * Constructs an empty insertion-ordered <tt>LinkedHashMap</tt> instance
         * with the specified initial capacity and load factor.
         *
         * Takes an initial capacity and a load factor; the ordering defaults to insertion order.
         * @param  initialCapacity the initial capacity
         * @param  loadFactor      the load factor
         * @throws IllegalArgumentException if the initial capacity is negative
         *         or the load factor is nonpositive
         */
        public LinkedHashMap(int initialCapacity, float loadFactor) {
            super(initialCapacity, loadFactor);
            accessOrder = false;
        }
    
        /**
         * Constructs an empty insertion-ordered <tt>LinkedHashMap</tt> instance
         * with the specified initial capacity and a default load factor (0.75).
         *
         * Takes an initial capacity and uses the default load factor 0.75; the ordering defaults to insertion order.
         * @param  initialCapacity the initial capacity
         * @throws IllegalArgumentException if the initial capacity is negative
         */
        public LinkedHashMap(int initialCapacity) {
            super(initialCapacity);
            accessOrder = false;
        }
    
        /**
         * Constructs an empty insertion-ordered <tt>LinkedHashMap</tt> instance
         * with the default initial capacity (16) and load factor (0.75).
         *
         * No-arg constructor: default initial capacity 16, default load factor 0.75, insertion order.
         */
        public LinkedHashMap() {
            super();
            accessOrder = false;
        }
    
        /**
         * Constructs an insertion-ordered <tt>LinkedHashMap</tt> instance with
         * the same mappings as the specified map.  The <tt>LinkedHashMap</tt>
         * instance is created with a default load factor (0.75) and an initial
         * capacity sufficient to hold the mappings in the specified map.
         *
         * Builds an insertion-ordered LinkedHashMap from the given Map.
         * @param  m the map whose mappings are to be placed in this map
         * @throws NullPointerException if the specified map is null
         */
        public LinkedHashMap(Map<? extends K, ? extends V> m) {
            super();
            accessOrder = false;
            putMapEntries(m, false);
        }
    
        /**
         * Constructs an empty <tt>LinkedHashMap</tt> instance with the
         * specified initial capacity, load factor and ordering mode.
         *
         * Takes an initial capacity, a load factor and the ordering mode.
         * @param  initialCapacity the initial capacity
         * @param  loadFactor      the load factor
         * @param  accessOrder     the ordering mode - <tt>true</tt> for
         *         access-order, <tt>false</tt> for insertion-order
         * @throws IllegalArgumentException if the initial capacity is negative
         *         or the load factor is nonpositive
         */
        public LinkedHashMap(int initialCapacity,
                             float loadFactor,
                             boolean accessOrder) {
            super(initialCapacity, loadFactor);
            this.accessOrder = accessOrder;
        }
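
To see what the accessOrder flag changes, the same sequence of operations can be run against both modes. A minimal sketch, with a hypothetical class name; only the three-argument constructor shown above is relied on.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class OrderingModeDemo {
    // Runs the same operations and returns the resulting key order.
    static List<String> keys(boolean accessOrder) {
        Map<String, Integer> m = new LinkedHashMap<>(16, 0.75f, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        m.get("a");     // an access: reorders only in access-order mode
        m.put("b", 20); // re-put of an existing key is also an access
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keys(false)); // prints [a, b, c]
        System.out.println(keys(true));  // prints [c, a, b]
    }
}
```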

      

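    III. LinkedHashMap put and resize

      put is inherited directly from HashMap. Because LinkedHashMap also has to maintain the doubly linked list, a few points and changes are worth noting:

    1. The node factory method Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) is overridden to maintain the doubly linked list

      LinkedHashMap nodes are part of a doubly linked list, so the list is updated here. A newly created node is both the most recently inserted and the most recently accessed, so whichever ordering mode is in effect it belongs at the end of the list, and it is linked in at the tail.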

    // Create the new node and link it at the end of the list
    Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p =
            new LinkedHashMap.Entry<K,V>(hash, key, value, e);
        linkNodeLast(p);        // link the new node at the end
        return p;
    }
    
    // link at the end of list
    // Append the new node after the current tail, or make it the head if the list is empty
    private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
        LinkedHashMap.Entry<K,V> last = tail;
        tail = p;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
    }
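
The tail-linking step can be isolated from the hash table entirely. The sketch below is a simplified stand-in, not the JDK code: MiniLinkedList and its Node class are hypothetical names, and only the before/after bookkeeping of linkNodeLast is reproduced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplified model of the doubly linked list inside LinkedHashMap.
class MiniLinkedList {
    static final class Node {
        final String key;
        Node before, after;
        Node(String key) { this.key = key; }
    }

    Node head, tail;

    // Same steps as linkNodeLast: the new node always becomes the tail.
    Node linkLast(String key) {
        Node p = new Node(key);
        Node last = tail;
        tail = p;
        if (last == null) {
            head = p;          // the list was empty, so p is also the head
        } else {
            p.before = last;   // link p behind the old tail
            last.after = p;
        }
        return p;
    }

    // Walks the after links from head to tail.
    List<String> keys() {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.after) {
            out.add(n.key);
        }
        return out;
    }
}
```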

      

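    2. LinkedHashMap overrides all three callbacks that HashMap leaves as hooks

      The put operation uses afterNodeAccess(Node<K,V> p) and afterNodeInsertion(boolean evict).

    • afterNodeAccess(Node<K,V> p): called when the key already exists. In access-order mode, the accessed node is moved to the end of the list.
    • afterNodeInsertion(boolean evict): called after a new mapping has been added (the key did not exist). An LRU cache performs the actual eviction here.

      In HashMap these callbacks are empty:

    // Callbacks to allow LinkedHashMap post-actions
    void afterNodeAccess(Node<K,V> p) { }            // after a node is accessed; in access-order mode LinkedHashMap moves the node to the end
    void afterNodeInsertion(boolean evict) { }       // after a node is inserted; e.g. an LRU cache evicts the eldest node here
    void afterNodeRemoval(Node<K,V> p) { }           // after a node is removed; LinkedHashMap unlinks it from the doubly linked list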

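      The LinkedHashMap implementations are as follows:

    // Runs after node e has been removed. HashMap's removeNode has already taken the node out of the table; here LinkedHashMap fixes up the before and after links of the doubly linked list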
    void afterNodeRemoval(Node<K,V> e) { // unlink
        LinkedHashMap.Entry<K,V> p =
            (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        // clear e's own links
        p.before = p.after = null;
        if (b == null)        // reset the after link of e's predecessor (or advance head)
            head = a;
        else
            b.after = a;
        if (a == null)        // reset the before link of e's successor (or pull back tail)
            tail = b;
        else
            a.before = b;
    }
    
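    // Possibly remove the eldest (earliest inserted / least recently accessed) node
    void afterNodeInsertion(boolean evict) { // possibly remove eldest
        LinkedHashMap.Entry<K,V> first;
        // The simplest LRU cache just overrides removeEldestEntry to decide when to return true (e.g. when a size limit is exceeded); the eldest node is then removed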
        if (evict && (first = head) != null && removeEldestEntry(first)) {
            K key = first.key;
            removeNode(hash(key), key, null, false, true);
        }
    }
    
    // After a node is accessed, move it to the end if access order is enabled
    void afterNodeAccess(Node<K,V> e) { // move node to last
        LinkedHashMap.Entry<K,V> last;
        if (accessOrder && (last = tail) != e) {
            LinkedHashMap.Entry<K,V> p =
                (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
            p.after = null;
            if (b == null)        // reset the after link of e's predecessor (or advance head)
                head = a;
            else
                b.after = a;
            if (a != null)        // reset the before link of e's successor
                a.before = b;
            else
                last = b;
            if (last == null)        // no other node is left in the list
                head = p;
            else {
                p.before = last;    // move e to the end
                last.after = p;
            }
            tail = p;            // e is now the tail
            ++modCount;
        }
    }
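
Note the ++modCount at the end of afterNodeAccess: in access-order mode, moving a node counts as a structural modification, as the class Javadoc warns. A minimal sketch of the consequence follows (hypothetical class name): calling get while iterating an access-ordered map trips the fail-fast check, whereas the same loop over an insertion-ordered map does not.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class AccessOrderCmeDemo {
    // Returns true if calling get() inside a keySet() loop throws.
    static boolean getDuringIterationThrows(boolean accessOrder) {
        Map<String, Integer> m = new LinkedHashMap<>(16, 0.75f, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        try {
            for (String k : m.keySet()) {
                m.get(k); // in access-order mode this moves the node and bumps modCount
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

If the values are needed while iterating an access-ordered map, iterating entrySet() and reading entry.getValue() avoids the problem, since operations on collection views do not count as accesses.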

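      Now look at removeEldestEntry(Map.Entry<K,V> eldest), the key to implementing an LRU cache. Its Javadoc already sketches a simple use: check whether the map's actual size exceeds the configured limit and return true if it does, so that the eldest node is removed and the map never grows beyond that limit.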

    /**
     * Returns <tt>true</tt> if this map should remove its eldest entry.
     * This method is invoked by <tt>put</tt> and <tt>putAll</tt> after
     * inserting a new entry into the map.  It provides the implementor
     * with the opportunity to remove the eldest entry each time a new one
     * is added.  This is useful if the map represents a cache: it allows
     * the map to reduce memory consumption by deleting stale entries.
     *
     * <p>Sample use: this override will allow the map to grow up to 100
     * entries and then delete the eldest entry each time a new entry is
     * added, maintaining a steady state of 100 entries.
     * <pre>
     *     private static final int MAX_ENTRIES = 100;
     *
     *     protected boolean removeEldestEntry(Map.Entry eldest) {
     *        return size() &gt; MAX_ENTRIES;
     *     }
     * </pre>
     *
     * <p>This method typically does not modify the map in any way,
     * instead allowing the map to modify itself as directed by its
     * return value.  It <i>is</i> permitted for this method to modify
     * the map directly, but if it does so, it <i>must</i> return
     * <tt>false</tt> (indicating that the map should not attempt any
     * further modification).  The effects of returning <tt>true</tt>
     * after modifying the map from within this method are unspecified.
     *
     * <p>This implementation merely returns <tt>false</tt> (so that this
     * map acts like a normal map - the eldest element is never removed).
     *
     * @param    eldest The least recently inserted entry in the map, or if
     *           this is an access-ordered map, the least recently accessed
     *           entry.  This is the entry that will be removed it this
     *           method returns <tt>true</tt>.  If the map was empty prior
     *           to the <tt>put</tt> or <tt>putAll</tt> invocation resulting
     *           in this invocation, this will be the entry that was just
     *           inserted; in other words, if the map contains a single
     *           entry, the eldest entry is also the newest.
     * @return   <tt>true</tt> if the eldest entry should be removed
     *           from the map; <tt>false</tt> if it should be retained.
     */
    protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
        return false;
    }
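
Putting the pieces together, a small LRU cache might look like the sketch below. It follows the sample in the Javadoc above; the class name LruCache and the maxEntries field are illustrative assumptions, and the cache is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal LRU cache built on LinkedHashMap; not thread-safe.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: head is the least recently used entry
        this.maxEntries = maxEntries;
    }

    // Called by afterNodeInsertion after each new mapping is added.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the least recently accessed entry at that point.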

    3. Another neat trick: HashMap's red-black tree node class, TreeNode, directly extends LinkedHashMap.Entry, so treeification, resizing and so on carry over to LinkedHashMap almost seamlessly.

    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {...}

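      Resizing only moves existing nodes between buckets, so the doubly linked list is left untouched. Treeification and untreeification replace node objects, so LinkedHashMap overrides replacementNode and replacementTreeNode (using the private helper transferLinks) to carry the before and after links over to the replacement node.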

    IV. LinkedHashMap get

      get adds a call to afterNodeAccess(Node<K,V> p): in an access-ordered LinkedHashMap, the accessed node must be moved to the end. Everything else is the same as in HashMap.

        /**
         * Returns the value to which the specified key is mapped,
         * or {@code null} if this map contains no mapping for the key.
         *
         * <p>More formally, if this map contains a mapping from a key
         * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
         * key.equals(k))}, then this method returns {@code v}; otherwise
         * it returns {@code null}.  (There can be at most one such mapping.)
         *
         * <p>A return value of {@code null} does not <i>necessarily</i>
         * indicate that the map contains no mapping for the key; it's also
         * possible that the map explicitly maps the key to {@code null}.
         * The {@link #containsKey containsKey} operation may be used to
         * distinguish these two cases.
         */
        public V get(Object key) {
            Node<K,V> e;
            if ((e = getNode(hash(key), key)) == null)
                return null;
            if (accessOrder)
                afterNodeAccess(e);        // post-access hook: in access-order mode, move the node to the end
            return e.value;
        }
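
Not every read counts as an access. According to the class Javadoc quoted earlier, get and getOrDefault do, while other methods such as containsKey and operations on the collection views do not. The sketch below (hypothetical class and method names) compares the resulting key order for each case on an access-ordered map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class AccessCountingDemo {
    // Applies one read operation to key "a" and returns the resulting key order.
    static List<String> keysAfter(String op) {
        Map<String, Integer> m = new LinkedHashMap<>(16, 0.75f, true);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        switch (op) {
            case "get":          m.get("a"); break;
            case "getOrDefault": m.getOrDefault("a", 0); break;
            case "containsKey":  m.containsKey("a"); break;
            case "iterate":      m.values().forEach(v -> { }); break;
            default: throw new IllegalArgumentException(op);
        }
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfter("get"));         // prints [b, c, a]
        System.out.println(keysAfter("containsKey")); // prints [a, b, c]
    }
}
```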

      

    V. LinkedHashMap remove

      Removal uses HashMap's remove(Object key), and the removal itself is identical. LinkedHashMap only needs to fix up the doubly linked list afterwards, which it does in its override of afterNodeRemoval(Node<K,V> p), shown earlier in this article.
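
One visible effect of the unlink step: because remove takes the node out of the doubly linked list, putting the same key again creates a new node at the tail. This differs from re-putting a key that is still present, which keeps its position in insertion-order mode. A minimal sketch with a hypothetical class name:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class RemoveThenPutDemo {
    static List<String> keys() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        m.remove("a"); // afterNodeRemoval unlinks "a" from the list
        m.put("a", 1); // a brand-new node, linked at the tail
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keys()); // prints [b, c, a]
    }
}
```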

  • Original article: https://www.cnblogs.com/wpbxin/p/12185050.html