While studying multithreaded programming recently, I learned that HashMap is not thread-safe while ConcurrentHashMap is, so under concurrent access you should use ConcurrentHashMap. That said, hearing is one thing and seeing is another: without reproducing in code a scenario where HashMap actually misbehaves under multiple threads, it is hard to simply accept the conclusion. I searched online for examples that demonstrate HashMap's thread-unsafety, but none of them reproduced the scenario well, so after some thought I decided to try to reproduce the problem in code myself.
1. Prepare an entity class to use as the map key, overriding its equals() and hashCode() methods
import java.util.Objects;

public class Entity {
    private int a;
    private int b;
    private int c;

    public Entity(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Entity entity = (Entity) o;
        return a == entity.a &&
               b == entity.b && c == entity.c;
    }

    // Override hashCode to use only the first two fields, so that
    // objects differing only in c are not equal but still collide
    @Override
    public int hashCode() {
        return Objects.hash(a, b);
    }
}
public static void main(String[] args) throws InterruptedException {
    Entity entity1 = new Entity(1, 1, 2);
    Entity entity2 = new Entity(1, 1, 3);
    System.out.println(entity1.equals(entity2)); // false
    System.out.println(entity1.hashCode() == entity2.hashCode()); // true
}
By designing a simple entity class and overriding its equals() and hashCode() methods, we obtain two objects of the same class that are not equal to each other yet share the same hashCode. This is exactly the setup needed to force hash collisions when we later call put() on a HashMap.
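Before going multithreaded, it is worth confirming that the collision behaves as intended in a single thread. Below is a minimal, self-contained sketch (it duplicates the entity class as a static nested class purely for brevity) showing that HashMap still keeps the two colliding keys apart via equals():

```java
import java.util.HashMap;
import java.util.Objects;

public class CollisionCheck {
    // Stand-in for the Entity class defined above
    static class Entity {
        final int a, b, c;
        Entity(int a, int b, int c) { this.a = a; this.b = b; this.c = c; }
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Entity)) return false;
            Entity e = (Entity) o;
            return a == e.a && b == e.b && c == e.c;
        }
        @Override public int hashCode() { return Objects.hash(a, b); } // ignores c
    }

    public static void main(String[] args) {
        HashMap<Entity, Integer> map = new HashMap<>();
        map.put(new Entity(1, 1, 2), 1);
        map.put(new Entity(1, 1, 3), 2); // same hashCode, different key
        // Both keys land in the same bucket, but equals() keeps them distinct
        System.out.println("size = " + map.size()); // size = 2
    }
}
```

In a single thread everything works: the colliding keys simply chain inside one bucket, which is precisely the state the concurrent test below will stress.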
2. Call put() on HashMap and ConcurrentHashMap from multiple threads concurrently
public static void main(String[] args) throws InterruptedException {
    HashMap<Entity, Integer> map = new HashMap<>();
    ForkJoinPool forkJoinPool = new ForkJoinPool(10);
    forkJoinPool.execute(() -> IntStream.rangeClosed(0, 10).parallel().forEach(i -> {
        for (int j = 0; j < 10000; j++) {
            Entity entity = new Entity(1, 1, j);
            map.put(entity, i);
        }
    }));
    // wait for all tasks to complete
    forkJoinPool.shutdown();
    forkJoinPool.awaitTermination(1, TimeUnit.HOURS);
    System.out.println(map.size());
}
Output:
Exception in thread "ForkJoinPool-1-worker-9" java.lang.ClassCastException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
at java.util.stream.ForEachOps$ForEachOp$OfInt.evaluateParallel(ForEachOps.java:189)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
at java.util.stream.IntPipeline.forEach(IntPipeline.java:405)
at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:562)
at thread.ch3.case01.HashMapTest.lambda$main$1(HashMapTest.java:21)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:2014)
at java.util.HashMap.putVal(HashMap.java:638)
at java.util.HashMap.put(HashMap.java:612)
at thread.ch3.case01.HashMapTest.lambda$null$0(HashMapTest.java:24)
at java.util.stream.ForEachOps$ForEachOp$OfInt.accept(ForEachOps.java:205)
at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
... 4 more
18833
We expected the parallel tasks to create Entity objects that all share the same hashCode and insert them as keys, leaving the HashMap with exactly 10,000 distinct keys. Instead, HashMap threw an exception, and its final size (18833) is not 10,000; clearly it is not thread-safe. (The ClassCastException is likely a symptom of the long collision chain being converted to a red-black tree while other threads are still modifying it, corrupting the bucket's structure.) Next, let's try the same thing with ConcurrentHashMap.
public static void main(String[] args) throws InterruptedException {
    ConcurrentHashMap<Entity, Integer> map = new ConcurrentHashMap<>();
    ForkJoinPool forkJoinPool = new ForkJoinPool(10);
    forkJoinPool.execute(() -> IntStream.rangeClosed(0, 10).parallel().forEach(i -> {
        for (int j = 0; j < 10000; j++) {
            Entity entity = new Entity(1, 1, j);
            map.put(entity, i);
        }
    }));
    // wait for all tasks to complete
    forkJoinPool.shutdown();
    forkJoinPool.awaitTermination(1, TimeUnit.HOURS);
    System.out.println(map.size());
}
Output:
10000
Process finished with exit code 0
With ConcurrentHashMap the run completes without errors, and the final size matches our expectation of 10,000. This confirms the conclusion: HashMap is not thread-safe, while ConcurrentHashMap is. So why is that?
3. Reading the put() source of HashMap and ConcurrentHashMap
- HashMap's put() method
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // Compute the bucket index from the hash and read that slot into p;
    // if the slot is empty, create a new node and place it there
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
In this source, focus on the commented line. In our example every key has the same hashCode, so every node is added at the same slot of the hash table. Suppose that slot is initially empty and threads 1 and 2 both reach this line at the same time: each sees null, so each creates its own new node (node1 and node2), with next pointing to null. If thread 1 writes first, thread 2's node2 then overwrites node1, and the HashMap silently loses an entry.
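That lost-update interleaving can be forced deterministically with two latches. The sketch below races on a plain one-element String array standing in for the bucket slot tab[i] (it is an illustration of the check-then-act race, not HashMap's actual internals):

```java
import java.util.concurrent.CountDownLatch;

public class LostUpdateDemo {
    // One-slot "table"; null means the bucket is empty
    static String[] bucket = new String[1];

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch aRead = new CountDownLatch(1);
        CountDownLatch bWrote = new CountDownLatch(1);

        Thread a = new Thread(() -> {
            if (bucket[0] == null) {                // A sees the slot empty
                aRead.countDown();                  // let B run now
                try { bWrote.await(); } catch (InterruptedException ignored) {}
                bucket[0] = "nodeA";                // A overwrites B's node
            }
        });
        Thread b = new Thread(() -> {
            try { aRead.await(); } catch (InterruptedException ignored) {}
            if (bucket[0] == null) {                // B also sees it empty
                bucket[0] = "nodeB";                // B writes first
            }
            bWrote.countDown();
        });

        a.start(); b.start();
        a.join(); b.join();
        System.out.println(bucket[0]); // nodeA -- nodeB was lost
    }
}
```

Both threads pass the null check before either writes, so the second write clobbers the first; this is exactly why the map's final size was wrong in the test above.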
- ConcurrentHashMap's put() method
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // The new node is installed in the empty bucket via CAS,
            // which guarantees only one thread can succeed
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                              value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
Again look at the commented line. Unlike HashMap, when ConcurrentHashMap installs a new node into an empty bucket it does so with a CAS (compare-and-swap) operation: only one thread can succeed, and a thread whose CAS fails simply retries the loop and chains its node behind the winner, under the synchronized block used for non-empty buckets. This is why put() on ConcurrentHashMap does not lose entries.
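The casTabAt call above boils down to one atomic compare-and-swap on the array slot. A minimal sketch of the same semantics using the public AtomicReferenceArray API (not ConcurrentHashMap's internal Unsafe-based helper) shows that only the first writer to an empty slot succeeds, and the loser can observe the failure and retry instead of silently overwriting:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class CasSlotDemo {
    public static void main(String[] args) {
        // One-slot "table"; null means the bucket is empty
        AtomicReferenceArray<String> tab = new AtomicReferenceArray<>(1);

        // First put wins the empty slot atomically
        boolean bWon = tab.compareAndSet(0, null, "nodeB");
        // Second put sees the slot is no longer null, so its CAS fails;
        // in the real putVal it would loop and append to the bin instead
        boolean aWon = tab.compareAndSet(0, null, "nodeA");

        System.out.println(bWon + " " + aWon + " " + tab.get(0));
        // prints: true false nodeB
    }
}
```

Contrast this with the lost-update race in the HashMap analysis: the failed CAS makes the conflict visible to the losing thread, whereas the plain null-check-then-write loses it.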
This post only tried to reproduce HashMap's thread-safety problem with a simple example; a deeper analysis of the underlying mechanics may come in a follow-up post.