
How HashMap is implemented: differences between JDK 1.7 and 1.8

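HashMap is one of the most commonly used collection classes, yet its underlying implementation differs considerably between JDK 1.7 and JDK 1.8. Today we will read the source code to see how the two versions differ.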

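A HashMap stores key-value pairs. Underneath, it is an array whose elements are linked lists (in 1.8, linked lists or red-black trees). A hash computed from the key is turned into an array index, and the key, value, and related fields are stored in the linked list or red-black tree at that index. If two different keys end up with the same array index, that is a hash collision. The default array length is 16, and once the number of stored entries passes a threshold (capacity × load factor, 12 by default), the array is resized. In my view, 1.7 and 1.8 differ most in how they handle hash collisions and resizing.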
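To make the index calculation and the idea of a collision concrete, here is a minimal sketch. The class name HashIndexDemo and its two helper methods are made up for illustration; spread follows the style of the 1.8 hash function and indexFor uses the same masking that both versions use, but this is not the JDK source itself.

```java
class HashIndexDemo {
    // Mixes the high 16 bits of hashCode into the low bits (1.8-style spreading).
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Maps a hash to an array index; tableLength is assumed to be a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int tableLength = 16; // the default array length
        // "Aa" and "BB" are different keys with the same hashCode,
        // so they always land on the same index: a hash collision.
        System.out.println("Aa -> " + indexFor(spread("Aa"), tableLength));
        System.out.println("BB -> " + indexFor(spread("BB"), tableLength));
    }
}
```

Because the array length is always a power of two, the mask `tableLength - 1` keeps only the low bits of the hash, which is why both versions can use a bitwise AND instead of a modulo.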

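First, JDK 1.7.

The array that stores the data (the table field, an array of Entry<K,V>)

The source of the put method; I have added comments throughout.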

public V put(K key, V value) {
    // Initialize the array if it is still empty
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    // Compute the hash of the key
    int hash = hash(key);
    // Derive the array index from the hash
    int i = indexFor(hash, table.length);
    // If this index is occupied, walk the linked list; if an entry with an equal key exists, replace its value
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    // The key is new: add an entry for it
    addEntry(hash, key, value, i);
    return null;
}

The addEntry source

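void addEntry(int hash, K key, V value, int bucketIndex) {
    // If the number of entries has reached the threshold AND there is a hash collision
    // (the target index already holds an entry), double the array length
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        // The array length changed, so recompute the hash and the index
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}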

Next, the createEntry source

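void createEntry(int hash, K key, V value, int bucketIndex) {
    // The new entry becomes the head of the list at this index and points to the
    // previous head (head insertion); if the index was empty, e is simply null
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}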

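Here we can see that JDK 1.7 resizes only when the number of entries has reached the threshold and there is a hash collision at the target index. It follows that, with the default length of 16, the array can hold as many as 27 entries before it is resized: put 12 entries at one index (the threshold is 12), then one entry at each of the remaining 15 indexes, for 27 in total. The 28th entry must land on an occupied index, and that triggers the resize. This is of course a very unlikely case. Still, it shows that in JDK 1.7 the threshold is not as decisive as it looks: reaching it does not necessarily cause a resize.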
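The 27-entry case can be checked with a small model of the addEntry condition. This is only a sketch: Jdk7ResizeModel and its methods are hypothetical names, and the model tracks per-index counts rather than real entries, assuming every inserted key is distinct.

```java
class Jdk7ResizeModel {
    // Returns how many entries the table holds at the moment the first resize
    // would be triggered, given the index each successive new key lands on.
    static int entriesBeforeFirstResize(int capacity, float loadFactor, int[] indexOfInsert) {
        int threshold = (int) (capacity * loadFactor);
        int[] bucketSize = new int[capacity];
        int size = 0;
        for (int index : indexOfInsert) {
            // Same test as addEntry: threshold reached AND target index occupied.
            if (size >= threshold && bucketSize[index] > 0) {
                return size;
            }
            bucketSize[index]++;
            size++;
        }
        return size; // this sequence never triggered a resize
    }

    // 12 keys at index 0, one key at each of indexes 1..15, then one more at index 0.
    static int[] worstCaseFor16() {
        int[] seq = new int[28];
        for (int i = 0; i < 15; i++) {
            seq[12 + i] = i + 1;
        }
        return seq; // positions 0..11 and 27 stay 0
    }

    public static void main(String[] args) {
        System.out.println(entriesBeforeFirstResize(16, 0.75f, worstCaseFor16()));
    }
}
```

With keys spread over 12 different indexes followed by a repeat of the first index, the same model reports a resize at 12 entries, which is the common case.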

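Now let's look at the JDK 1.8 source.

The array that stores the data (the table field, an array of Node<K,V>)

The hash function has changed here, but that is not the main point, so let's move on to the source of put.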

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

The putVal source

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // Initialize the array if it is still empty
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // If the target index is empty, insert directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // The first node has an equal key: remember it so its value is overwritten below
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // The bin is a tree: insert into the red-black tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                // The bin is a linked list: append at the tail (tail insertion)
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // If the list is now longer than 8, convert it to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // A node with an equal key exists: stop here so its value is overwritten below
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // If the number of entries now exceeds the threshold, resize
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

Next, the treeifyBin source

final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    // If the array length is still below 64 when a list should become a tree, resize the array instead
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        // Replace each list node with a tree node, then build the tree
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            hd.treeify(tab);
    }
}

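From this we can see that in 1.8 an already initialized array is resized in two situations: when the number of entries exceeds the threshold, and when a linked list is about to be converted to a red-black tree while the array length is still below 64. So in JDK 1.8, with the default length of 16, either the entries keep landing on the same index and the array is resized when the 9th entry is added there, or the array is resized once the number of entries exceeds the threshold of 12, that is, on the 13th entry.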
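The two triggers can be modeled in a few lines. This is a simplified sketch rather than the JDK code: Jdk8ResizeModel is a hypothetical name, every inserted key is assumed to be distinct, and bins are treated as plain lists that are only counted, never actually converted to trees.

```java
class Jdk8ResizeModel {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    // Returns the 1-based number of the insert that triggers the first resize of an
    // already allocated table, or -1 if the given sequence never triggers one.
    static int insertThatTriggersResize(int capacity, float loadFactor, int[] indexOfInsert) {
        int threshold = (int) (capacity * loadFactor);
        int[] binLength = new int[capacity];
        int size = 0;
        for (int i = 0; i < indexOfInsert.length; i++) {
            int index = indexOfInsert[i];
            binLength[index]++;
            // treeifyBin path: the list grew past 8 nodes while the array is shorter than 64.
            if (binLength[index] > TREEIFY_THRESHOLD && capacity < MIN_TREEIFY_CAPACITY) {
                return i + 1;
            }
            // Normal path: ++size > threshold.
            if (++size > threshold) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sameIndex = new int[20];          // every key lands on index 0
        int[] spreadOut = new int[20];
        for (int i = 0; i < spreadOut.length; i++) {
            spreadOut[i] = i % 16;              // keys spread evenly over the array
        }
        System.out.println(insertThatTriggersResize(16, 0.75f, sameIndex));
        System.out.println(insertThatTriggersResize(16, 0.75f, spreadOut));
    }
}
```

The first sequence hits the treeifyBin path and the second hits the threshold path, matching the two cases described above.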

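From the analysis above, the main differences between the JDK 1.7 and 1.8 HashMap implementations are:

1. On a hash collision, 1.7 stores the entries in a linked list, inserting at the head; 1.8 first stores them in a linked list, appending at the tail, and converts the list to a red-black tree once it grows beyond 8 nodes (provided the array length is at least 64).

2. 1.7 resizes when the number of entries has reached the threshold and the target index is already occupied; 1.8 resizes when the number of entries exceeds the threshold, or when a list is about to be converted to a red-black tree while the array length is still below 64.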

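This article only outlines how the HashMap implementation differs between the two JDK versions. Many details, such as how a linked list is converted to a red-black tree and how resizing actually works, are not explained clearly here, mainly because they involve data-structure material that I do not yet understand thoroughly. Still, the reason for converting a linked list to a red-black tree is access efficiency: looking up a key in a long linked list takes time proportional to the list's length, whereas a red-black tree is slower to insert into but finds a key in logarithmic time.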

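When using a HashMap, it is best to specify an initial capacity up front if the expected number of entries is known. Every resize has to work out a new array index for each existing entry and move it, which noticeably hurts performance.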
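One common way to pick that initial capacity is sketched below. The helper initialCapacityFor is a made-up name, and the formula assumes the default load factor of 0.75; the HashMap(int initialCapacity) constructor is the standard one.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMapDemo {
    // An initial capacity large enough that `expected` entries stay at or below
    // the threshold, assuming the default load factor of 0.75.
    static int initialCapacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    public static void main(String[] args) {
        int expected = 100;
        // HashMap rounds the requested capacity up to the next power of two.
        Map<String, Integer> map = new HashMap<>(initialCapacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size());
    }
}
```

Passing the expected entry count directly, as in new HashMap<>(100), may still allow one resize, because the threshold is only 0.75 of the rounded-up capacity; dividing by the load factor first avoids that.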


Original article: https://www.cnblogs.com/fightingting/p/11655875.html