Comparing the JDK's ThreadLocal with Netty's FastThreadLocal: A Summary

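While investigating a potential memory leak recently, I noticed a large number of FastThreadLocalThread instances in the jmap output, so I looked up the Javadoc for FastThreadLocal, which reads: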

A special variant of ThreadLocal that yields higher access performance when accessed from a FastThreadLocalThread.

Internally, a FastThreadLocal uses a constant index in an array, instead of using hash code and hash table, to look for a variable. Although seemingly very subtle, it yields slight performance advantage over using a hash table, and it is useful when accessed frequently.

To take advantage of this thread-local variable, your thread must be a FastThreadLocalThread or its subtype. By default, all threads created by DefaultThreadFactory are FastThreadLocalThread due to this reason.

Note that the fast path is only possible on threads that extend FastThreadLocalThread, because it requires a special field to store the necessary state. An access by any other kind of thread falls back to a regular ThreadLocal.

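In short, FastThreadLocal is a ThreadLocal implementation that is faster to access from within a FastThreadLocalThread. It looks variables up by a constant array index rather than by hash code.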
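To make the "constant index instead of hash lookup" idea concrete, here is a minimal JDK-only sketch. It is not Netty's code: SlotThread and IndexedLocal are hypothetical stand-ins for FastThreadLocalThread and FastThreadLocal, and the initial array size of 8 is an arbitrary assumption. It only illustrates the two paths the Javadoc describes: an array slot on the special thread type, and a fallback to a regular ThreadLocal everywhere else.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

final class IndexedLocalSketch {

    /** Hypothetical stand-in for FastThreadLocalThread: a Thread that carries its own slot array. */
    static class SlotThread extends Thread {
        Object[] slots = new Object[8]; // initial size is an arbitrary assumption

        SlotThread(Runnable task) {
            super(task);
        }
    }

    /** Hypothetical stand-in for FastThreadLocal: each instance owns one fixed array index. */
    static class IndexedLocal<T> {
        private static final AtomicInteger NEXT_INDEX = new AtomicInteger();

        // Assigned once at construction and never changed afterwards.
        private final int index = NEXT_INDEX.getAndIncrement();
        // Slow path for ordinary threads, which have no slot array to index into.
        private final ThreadLocal<T> fallback = new ThreadLocal<>();

        @SuppressWarnings("unchecked")
        T get() {
            Thread current = Thread.currentThread();
            if (current instanceof SlotThread) {
                // Fast path: a plain array read, no hashing and no probing.
                Object[] slots = ((SlotThread) current).slots;
                return index < slots.length ? (T) slots[index] : null;
            }
            return fallback.get();
        }

        void set(T value) {
            Thread current = Thread.currentThread();
            if (current instanceof SlotThread) {
                SlotThread t = (SlotThread) current;
                if (index >= t.slots.length) {
                    // Grow the per-thread array so that this index fits.
                    t.slots = Arrays.copyOf(t.slots, Math.max(index + 1, t.slots.length * 2));
                }
                t.slots[index] = value;
            } else {
                fallback.set(value);
            }
        }
    }

    /** Sets and reads a value on a new SlotThread and returns what the fast path saw. */
    static String roundTripOnSlotThread(IndexedLocal<String> local, String value) {
        String[] seen = new String[1];
        SlotThread t = new SlotThread(() -> {
            local.set(value);
            seen[0] = local.get();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        IndexedLocal<String> local = new IndexedLocal<>();
        local.set("main");                                        // ordinary thread: fallback path
        System.out.println(roundTripOnSlotThread(local, "fast")); // prints "fast"
        System.out.println(local.get());                          // prints "main"
    }
}
```

The point of the design is that the index is fixed per variable, so a read on the special thread is a single bounds-checked array access. The cost is that every thread's array has to be large enough for the highest index in use.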

When the default thread pool is used, Netty creates its threads through DefaultThreadFactory, which produces FastThreadLocalThread instances rather than plain Thread objects.
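This also explains why so many instances of one thread class can show up in a heap dump: every worker in a pool is whatever type the pool's ThreadFactory returns. The JDK-only sketch below shows that mechanism. SpecialThread and SpecialThreadFactory are hypothetical stand-ins for FastThreadLocalThread and DefaultThreadFactory, not Netty's implementation, and the pool size of 2 is arbitrary.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadFactorySketch {

    /** Hypothetical stand-in for FastThreadLocalThread. */
    static class SpecialThread extends Thread {
        SpecialThread(Runnable task, String name) {
            super(task, name);
        }
    }

    /** Hypothetical stand-in for DefaultThreadFactory: every thread it hands out is a SpecialThread. */
    static class SpecialThreadFactory implements ThreadFactory {
        private final AtomicInteger counter = new AtomicInteger();
        private final String prefix;

        SpecialThreadFactory(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public Thread newThread(Runnable task) {
            Thread t = new SpecialThread(task, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        }
    }

    /** Runs a probe task on a pool built with the given factory and reports the worker's class name. */
    static String workerClassName(ThreadFactory factory) {
        ExecutorService pool = Executors.newFixedThreadPool(2, factory);
        try {
            Callable<String> probe = () -> Thread.currentThread().getClass().getSimpleName();
            return pool.submit(probe).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(workerClassName(new SpecialThreadFactory("worker"))); // prints "SpecialThread"
        System.out.println(workerClassName(Executors.defaultThreadFactory()));   // prints "Thread"
    }
}
```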

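From what I remember of earlier tests comparing Java against the various C++ map and unordered_map implementations, the more entries a map holds, the wider the gap between implementations generally becomes (because potential collisions increase, and because the underlying structure may be a B* tree, a linked list, a linear layout, and so on).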

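To get a rough idea of how large the gap is, I searched around and found a post (https://my.oschina.net/andylucc/blog/614359) that benchmarked it. The results from its example are as follows: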

1000 ThreadLocals on a single thread object, 1,000,000 timed read operations:

ThreadLocal: 3767ms | 3636ms | 3595ms | 3610ms | 3719ms

FastThreadLocal: 15ms | 14ms | 13ms | 14ms | 14ms

1000 ThreadLocals on a single thread object, 100,000 timed read operations:

ThreadLocal: 384ms | 378ms | 366ms | 647ms | 372ms

FastThreadLocal: 14ms | 13ms | 13ms | 17ms | 13ms

1000 ThreadLocals on a single thread object, 10,000 timed read operations:

ThreadLocal: 43ms | 42ms | 42ms | 56ms | 45ms

FastThreadLocal: 15ms | 13ms | 11ms | 15ms | 11ms

100 ThreadLocals on a single thread object, 10,000 timed read operations:

ThreadLocal: 16ms | 21ms | 18ms | 16ms | 18ms

FastThreadLocal: 15ms | 15ms | 15ms | 17ms | 18ms

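The data above shows that as the number of ThreadLocals and the frequency of access grow, the traditional ThreadLocal slows down quickly, while Netty's FastThreadLocal stays fairly stable. The scenario this experiment simulates is not very specific, but it is enough to suggest that FastThreadLocal outperforms the traditional ThreadLocal in high-concurrency, high-load environments.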
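The linked post's test harness is not reproduced here. As a rough illustration of how such a read timing could be set up for the JDK's ThreadLocal, here is a minimal sketch. The class and method names are hypothetical, and the loop shape (each round reads every ThreadLocal once) and the warm-up round count are assumptions, not the original benchmark. A hand-rolled timing loop like this is only indicative; JIT compilation and GC can skew it considerably.

```java
import java.util.ArrayList;
import java.util.List;

final class ThreadLocalReadTiming {

    /** Creates `count` ThreadLocals and sets each one on the calling thread. */
    static List<ThreadLocal<Integer>> populate(int count) {
        List<ThreadLocal<Integer>> locals = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            ThreadLocal<Integer> local = new ThreadLocal<>();
            local.set(i);
            locals.add(local);
        }
        return locals;
    }

    /** Reads every ThreadLocal `rounds` times and returns the sum of all values read. */
    static long readAll(List<ThreadLocal<Integer>> locals, int rounds) {
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            for (ThreadLocal<Integer> local : locals) {
                sum += local.get();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<ThreadLocal<Integer>> locals = populate(1000);
        readAll(locals, 1_000); // warm-up so the JIT has a chance to compile the loop

        long start = System.nanoTime();
        long sum = readAll(locals, 10_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Printing the sum keeps the reads from being optimized away.
        System.out.println("sum=" + sum + ", elapsed=" + elapsedMs + "ms");
    }
}
```

Running the same loop against FastThreadLocal on a FastThreadLocalThread would give the other half of the comparison; that requires Netty on the classpath and is left out of this sketch.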

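To sum up: in my experience, 99% of applications will never use thread-local variables by the thousands. So unless the application is highly unusual, the traditional ThreadLocal is sufficient, and given the long-term maintenance cost there is no need to use FastThreadLocal.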

PS: I won't go over the use cases for ThreadLocal again here; see the following posts:

https://my.oschina.net/clopopo/blog/149368

http://blog.csdn.net/lufeng20/article/details/24314381

http://lavasoft.blog.51cto.com/62575/51926/

Original article: https://www.cnblogs.com/zhjh256/p/6367928.html