CDS Class Pre-initialization
Introduction to CDS
Traditional CDS [0] is split into two major phases, Dump and Use:
-Xshare:off -XX:DumpLoadedClassList=test.log
-Xshare:dump -XX:SharedClassListFile=test.log -XX:SharedArchiveFile=test.jsa
-Xshare:on -XX:SharedArchiveFile=test.jsa
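In practice, using CDS still usually takes three steps. The first step produces test.log, a list of class names. The second step generates a CDS archive from that list. The third step runs with the CDS archive. Searching the HotSpot source for the DumpSharedSpaces and UseSharedSpaces flags leads to the code and logic behind the last two steps.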
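The three steps can be captured as three sets of JVM arguments. The sketch below is only an illustration: the class `CdsWorkflow` and its method names are hypothetical, and only the flags themselves come from the commands above. Steps 1 and 3 additionally need the application's own classpath and main class, which are left out here.

```java
import java.util.List;

// Hypothetical helper: composes the JVM arguments for the three CDS steps.
final class CdsWorkflow {
    // Step 1: run with sharing off and record the names of loaded classes.
    static List<String> recordClassList(String classList) {
        return List.of("-Xshare:off", "-XX:DumpLoadedClassList=" + classList);
    }

    // Step 2: build the archive from the recorded class list.
    static List<String> dumpArchive(String classList, String archive) {
        return List.of("-Xshare:dump",
                       "-XX:SharedClassListFile=" + classList,
                       "-XX:SharedArchiveFile=" + archive);
    }

    // Step 3: run against the archive produced in step 2.
    static List<String> useArchive(String archive) {
        return List.of("-Xshare:on", "-XX:SharedArchiveFile=" + archive);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", recordClassList("test.log")));
        System.out.println(String.join(" ", dumpArchive("test.log", "test.jsa")));
        System.out.println(String.join(" ", useArchive("test.jsa")));
    }
}
```

Each list can be prepended to the arguments of a `java` invocation, for example through `ProcessBuilder`.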
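The complete CDS dump flow lives in MetaspaceShared::preload_and_dump. It first does some setup, such as reading the classes named in the classlist and preloading them, and then uses VMThread::execute to run a VM_PopulateDumpSharedSpace operation that does the actual archive dump: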
void MetaspaceShared::preload_and_dump(TRAPS) {
{ TraceTime timer("Dump Shared Spaces", TRACETIME_LOG(Info, startuptime));
ResourceMark rm;
char class_list_path_str[JVM_MAXPATHLEN];
// Preload classes to be shared.
// Should use some os:: method rather than fopen() here. aB.
const char* class_list_path;
if (SharedClassListFile == NULL) {
// Construct the path to the class list (in jre/lib)
// Walk up two directories from the location of the VM and
// optionally tack on "lib" (depending on platform)
os::jvm_path(class_list_path_str, sizeof(class_list_path_str));
for (int i = 0; i < 3; i++) {
char *end = strrchr(class_list_path_str, *os::file_separator());
if (end != NULL) *end = '\0';
}
int class_list_path_len = (int)strlen(class_list_path_str);
if (class_list_path_len >= 3) {
if (strcmp(class_list_path_str + class_list_path_len - 3, "lib") != 0) {
if (class_list_path_len < JVM_MAXPATHLEN - 4) {
jio_snprintf(class_list_path_str + class_list_path_len,
sizeof(class_list_path_str) - class_list_path_len,
"%slib", os::file_separator());
class_list_path_len += 4;
}
}
}
if (class_list_path_len < JVM_MAXPATHLEN - 10) {
jio_snprintf(class_list_path_str + class_list_path_len,
sizeof(class_list_path_str) - class_list_path_len,
"%sclasslist", os::file_separator());
}
class_list_path = class_list_path_str;
} else {
class_list_path = SharedClassListFile;
}
tty->print_cr("Loading classes to share ...");
_has_error_classes = false;
int class_count = preload_classes(class_list_path, THREAD);
if (ExtraSharedClassListFile) {
class_count += preload_classes(ExtraSharedClassListFile, THREAD);
}
tty->print_cr("Loading classes to share: done.");
log_info(cds)("Shared spaces: preloaded %d classes", class_count);
// Rewrite and link classes
tty->print_cr("Rewriting and linking classes ...");
// Link any classes which got missed. This would happen if we have loaded classes that
// were not explicitly specified in the classlist. E.g., if an interface implemented by class K
// fails verification, all other interfaces that were not specified in the classlist but
// are implemented by K are not verified.
link_and_cleanup_shared_classes(CATCH);
tty->print_cr("Rewriting and linking classes: done");
SystemDictionary::clear_invoke_method_table();
HeapShared::init_archivable_static_fields(THREAD);
VM_PopulateDumpSharedSpace op;
VMThread::execute(&op);
}
}
That is the overall flow of a CDS dump.
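The least obvious part of preload_and_dump is how it derives the default class list path when SharedClassListFile is not set. The following is a minimal sketch of that string manipulation, assuming `/` as the file separator and ignoring the JVM_MAXPATHLEN buffer checks. The class name `DefaultClassListPath` and the sample paths are made up for illustration.

```java
// Sketch of the default class list lookup; '/' stands in for os::file_separator().
final class DefaultClassListPath {
    static String resolve(String jvmPath) {
        String path = jvmPath;
        // Drop the last three path components: the library file plus two directories.
        for (int i = 0; i < 3; i++) {
            int end = path.lastIndexOf('/');
            if (end >= 0) {
                path = path.substring(0, end);
            }
        }
        // Append "lib" unless the remaining path already ends with it.
        if (path.length() >= 3 && !path.endsWith("lib")) {
            path = path + "/lib";
        }
        return path + "/classlist";
    }

    public static void main(String[] args) {
        System.out.println(resolve("/jdk/lib/server/libjvm.so"));
        System.out.println(resolve("/jdk/jre/lib/amd64/server/libjvm.so"));
    }
}
```

Both sample layouts end up pointing at a `classlist` file inside a `lib` directory, which is why the "lib" suffix is only added conditionally.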
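CDS Field Pre-initialization
Turning to CDS field pre-initialization: this is an experimental effort, and its prerequisite is the JDK 12 work described at https://wiki.openjdk.java.net/display/HotSpot/Caching+Java+Heap+Objects. Everything below depends on that technique.
The existing JDK already does part of the job. During the CDS dump phase it dumps a handful of hard-coded class fields into the CDS archive, which is the HeapShared::init_archivable_static_fields call shown above: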
struct ArchivableStaticFieldInfo {
const char* klass_name;
const char* field_name;
InstanceKlass* klass;
int offset;
BasicType type;
};
// If you add new entries to this table, you should know what you're doing!
static ArchivableStaticFieldInfo archivable_static_fields[] = {
{"jdk/internal/module/ArchivedModuleGraph", "archivedSystemModules"},
{"jdk/internal/module/ArchivedModuleGraph", "archivedModuleFinder"},
{"jdk/internal/module/ArchivedModuleGraph", "archivedMainModule"},
{"jdk/internal/module/ArchivedModuleGraph", "archivedConfiguration"},
{"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
{"java/util/ImmutableCollections$MapN", "EMPTY_MAP"},
{"java/util/ImmutableCollections$SetN", "EMPTY_SET"},
{"java/lang/Integer$IntegerCache", "archivedCache"},
{"java/lang/module/Configuration", "EMPTY_CONFIGURATION"},
};
const static int num_archivable_static_fields =
sizeof(archivable_static_fields) / sizeof(ArchivableStaticFieldInfo);
class ArchivableStaticFieldFinder: public FieldClosure {
InstanceKlass* _ik;
Symbol* _field_name;
bool _found;
int _offset;
public:
ArchivableStaticFieldFinder(InstanceKlass* ik, Symbol* field_name) :
_ik(ik), _field_name(field_name), _found(false), _offset(-1) {}
virtual void do_field(fieldDescriptor* fd) {
if (fd->name() == _field_name) {
assert(!_found, "fields cannot be overloaded");
assert(fd->field_type() == T_OBJECT || fd->field_type() == T_ARRAY, "can archive only obj or array fields");
_found = true;
_offset = fd->offset();
}
}
bool found() { return _found; }
int offset() { return _offset; }
};
void HeapShared::init_archivable_static_fields(Thread* THREAD) {
for (int i = 0; i < num_archivable_static_fields; i++) {
ArchivableStaticFieldInfo* info = &archivable_static_fields[i];
TempNewSymbol klass_name = SymbolTable::new_symbol(info->klass_name, THREAD);
TempNewSymbol field_name = SymbolTable::new_symbol(info->field_name, THREAD);
Klass* k = SystemDictionary::resolve_or_null(klass_name, THREAD);
assert(k != NULL && !HAS_PENDING_EXCEPTION, "class must exist");
InstanceKlass* ik = InstanceKlass::cast(k);
ArchivableStaticFieldFinder finder(ik, field_name);
ik->do_local_static_fields(&finder);
assert(finder.found(), "field must exist");
info->klass = ik;
info->offset = finder.offset();
}
}
void HeapShared::archive_static_fields(Thread* THREAD) {
// For each class X that has one or more archived fields:
// [1] Dump the subgraph of each archived field
// [2] Create a list of all the class of the objects that can be reached
// by any of these static fields.
// At runtime, these classes are initialized before X's archived fields
// are restored by HeapShared::initialize_from_archived_subgraph().
int i;
for (i = 0; i < num_archivable_static_fields; ) {
ArchivableStaticFieldInfo* info = &archivable_static_fields[i];
const char* klass_name = info->klass_name;
start_recording_subgraph(info->klass, klass_name);
// If you have specified consecutive fields of the same klass in
// archivable_static_fields[], these will be archived in the same
// {start_recording_subgraph ... done_recording_subgraph} pass to
// save time.
for (; i < num_archivable_static_fields; i++) {
ArchivableStaticFieldInfo* f = &archivable_static_fields[i];
if (f->klass_name != klass_name) {
break;
}
archive_reachable_objects_from_static_field(f->klass, f->klass_name,
f->offset, f->field_name, CHECK);
}
done_recording_subgraph(info->klass, klass_name);
}
log_info(cds, heap)("Performed subgraph records = %d times", _num_total_subgraph_recordings);
log_info(cds, heap)("Walked %d objects", _num_total_walked_objs);
log_info(cds, heap)("Archived %d objects", _num_total_archived_objs);
log_info(cds, heap)("Recorded %d klasses", _num_total_recorded_klasses);
#ifndef PRODUCT
for (int i = 0; i < num_archivable_static_fields; i++) {
ArchivableStaticFieldInfo* f = &archivable_static_fields[i];
verify_subgraph_from_static_field(f->klass, f->offset);
}
log_info(cds, heap)("Verified %d references", _num_total_verifications);
#endif
}
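Here HeapShared::init_archivable_static_fields runs during the initialization phase, and HeapShared::archive_static_fields runs while the VMThread executes VM_PopulateDumpSharedSpace. Once both have finished, the hard-coded fields are dumped into the CDS archive together. When the runtime reaches jdk.internal.misc.VM.initializeFromArchive, that method calls HeapShared::initialize_from_archived_subgraph, which loads the field data straight from the CDS archive and so avoids having the interpreter execute the code that initializes those fields.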
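The loop in archive_static_fields records one subgraph per run of consecutive table entries that share a class name. A rough Java rendering of that grouping is shown below. `SubgraphGrouping` and `Entry` are hypothetical names, and string equality is used where the C++ code compares `klass_name` pointers.

```java
import java.util.ArrayList;
import java.util.List;

final class SubgraphGrouping {
    // One {class, field} row of a table such as archivable_static_fields[].
    static final class Entry {
        final String klassName;
        final String fieldName;

        Entry(String klassName, String fieldName) {
            this.klassName = klassName;
            this.fieldName = fieldName;
        }
    }

    // Groups consecutive entries that name the same class. Each inner list
    // corresponds to one start_recording/done_recording pass.
    static List<List<Entry>> group(List<Entry> table) {
        List<List<Entry>> passes = new ArrayList<>();
        int i = 0;
        while (i < table.size()) {
            String klassName = table.get(i).klassName;
            List<Entry> pass = new ArrayList<>();
            // Consume every following entry that belongs to the same class.
            for (; i < table.size() && table.get(i).klassName.equals(klassName); i++) {
                pass.add(table.get(i));
            }
            passes.add(pass);
        }
        return passes;
    }

    public static void main(String[] args) {
        List<Entry> table = List.of(
            new Entry("A", "f1"), new Entry("A", "f2"),
            new Entry("B", "g"), new Entry("A", "f3"));
        System.out.println(group(table).size() + " recording passes");
    }
}
```

Only adjacent entries are merged, so listing fields of the same class in separate places in the table costs an extra recording pass.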
As an example, the hard-coded table above dumps the archivedCache field of java/lang/Integer$IntegerCache into the CDS archive. The IntegerCache code looks like this:
private static class IntegerCache {
static final int low = -128;
static final int high;
static final Integer[] cache;
static Integer[] archivedCache;
static {
// high value may be configured by property
int h = 127;
String integerCacheHighPropValue =
VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
if (integerCacheHighPropValue != null) {
try {
int i = parseInt(integerCacheHighPropValue);
i = Math.max(i, 127);
// Maximum array size is Integer.MAX_VALUE
h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
} catch( NumberFormatException nfe) {
// If the property cannot be parsed into an int, ignore it.
}
}
high = h;
// Load IntegerCache.archivedCache from archive, if possible
VM.initializeFromArchive(IntegerCache.class);
int size = (high - low) + 1;
// Use the archived cache if it exists and is large enough
if (archivedCache == null || size > archivedCache.length) {
Integer[] c = new Integer[size];
int j = low;
for(int k = 0; k < c.length; k++)
c[k] = new Integer(j++);
archivedCache = c;
}
cache = archivedCache;
// range [-128, 127] must be interned (JLS7 5.1.7)
assert IntegerCache.high >= 127;
}
private IntegerCache() {}
}
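The static block then calls VM.initializeFromArchive(IntegerCache.class) to load this field from the CDS archive, which saves the cost of interpreting the code below it that builds the archivedCache array.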
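The pattern in that static block is "reuse the archived value if it is present and large enough, otherwise build it". It is isolated below as a sketch. `ArchivedOrBuilt` is a hypothetical class, and the `archived` parameter stands in for whatever VM.initializeFromArchive would have restored.

```java
final class ArchivedOrBuilt {
    // Returns the archived array when it can serve [low, high], else a fresh one.
    static Integer[] resolve(Integer[] archived, int low, int high) {
        int size = (high - low) + 1;
        if (archived != null && size <= archived.length) {
            return archived;            // fast path: nothing to compute
        }
        // Slow path: the work an archive hit would have avoided.
        Integer[] c = new Integer[size];
        int j = low;
        for (int k = 0; k < c.length; k++) {
            c[k] = Integer.valueOf(j++);
        }
        return c;
    }

    public static void main(String[] args) {
        Integer[] built = resolve(null, -128, 127);
        Integer[] reused = resolve(built, -128, 127);
        System.out.println(built == reused);
    }
}
```

The size check matters because the upper bound is configurable: an archive dumped with a smaller bound cannot serve a run that asks for a larger one.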
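The drawback is just as clear. Fields that may go into the CDS archive have to be hard-coded in the HotSpot VM implementation, and the JDK code has to call a dedicated function. The approach is inflexible, and application code cannot use it.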
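CDS Class Pre-initialization
Hard-coding pre-initialization for a few fields is too narrow. The technique can be extended further, which is the subject of this chapter: class pre-initialization. The proposal documents by zhoujiangli [1], [2] cover it in more detail.
The proposal introduces a new annotation, @Preserve. A class annotated with @Preserve is declared eligible for class pre-initialization. The effect is shown in the figure.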
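The existing approach needs both VM.initializeFromArchive and the full cooperation of the HotSpot VM source code. With @Preserve, the VM recognizes the annotation itself, and there is no need to call VM.initializeFromArchive.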
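To make the idea concrete, here is a sketch of what an annotated class might look like from the application side. The `Preserve` annotation declared here is a stand-in written for this example, not the one from the proposal, and the reflective lookup only imitates a check that the VM would perform internally.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for the proposed annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Preserve {}

// A class whose static state is a candidate for pre-initialization:
// the table is computed once and never depends on the environment.
@Preserve
final class Squares {
    static final int[] TABLE = new int[16];

    static {
        for (int i = 0; i < TABLE.length; i++) {
            TABLE[i] = i * i;
        }
    }
}

final class PreserveScan {
    // Imitates the dump-time question: is this class marked for preservation?
    static boolean isPreserved(Class<?> c) {
        return c.isAnnotationPresent(Preserve.class);
    }

    public static void main(String[] args) {
        System.out.println(isPreserved(Squares.class));
        System.out.println(isPreserved(String.class));
    }
}
```

Note that `Squares` contains no call into the VM at all, which is the contrast with the IntegerCache code.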
With the @Preserve approach, the combined CDS dump and runtime flow is as follows:
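- Class initialization phase
At CDS dump time, the classes to be dumped are already linked. This happens when MetaspaceShared::preload_and_dump calls preload_classes. Some of those classes are already initialized and need no further handling. Any class that is annotated with @Preserve but not yet initialized is initialized explicitly.
- Subgraph check phase
Once Java heap object archiving is complete, all objects directly or indirectly reachable from the static fields form a subgraph. This phase walks the subgraph, and the subgraph cannot be archived if it contains any of the following:
• non-mirror java.lang.Class objects
• ClassLoader objects
• java.security.ProtectionDomain objects
• java.lang.Thread objects
• Runnable objects
• java.io.File objects
• TBD
- Static field value preservation phase
After the check, the static fields are archived into the archived mirror object, so the runtime no longer needs to call VM.initializeFromArchive.
- Runtime handling of classes annotated with @Preserve
There are several cases at run time, but none of them requires running clinit again.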
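The subgraph check phase is essentially a reachability walk with a deny list. The sketch below imitates it with reflection, under several simplifying assumptions: `SubgraphCheck` and `Holder` are hypothetical, every Class object is rejected (stricter than the "non-mirror" wording), and objects of classes loaded by the boot loader are treated as leaves instead of being traversed. The real check would run inside the VM over heap objects.

```java
import java.io.File;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.security.ProtectionDomain;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Small fixture type used to build object graphs for the example.
final class Holder {
    final Object payload;
    Holder(Object payload) { this.payload = payload; }
}

final class SubgraphCheck {
    // Deny list taken from the bullet list above (simplified for Class).
    static boolean isForbidden(Object o) {
        return o instanceof Class || o instanceof ClassLoader
            || o instanceof ProtectionDomain || o instanceof Thread
            || o instanceof Runnable || o instanceof File;
    }

    // Returns true if no forbidden object is reachable from root.
    static boolean isArchivable(Object root) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> work = new ArrayDeque<>();
        if (root != null) work.push(root);
        while (!work.isEmpty()) {
            Object o = work.pop();
            if (!seen.add(o)) continue;           // already visited
            if (isForbidden(o)) return false;
            Class<?> c = o.getClass();
            if (c.isArray()) {
                if (!c.getComponentType().isPrimitive()) {
                    for (Object e : (Object[]) o) {
                        if (e != null) work.push(e);
                    }
                }
                continue;
            }
            // Walk instance reference fields, stopping at boot-loader classes.
            for (; c != null && c.getClassLoader() != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                        continue;
                    }
                    try {
                        f.setAccessible(true);
                        Object v = f.get(o);
                        if (v != null) work.push(v);
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isArchivable(new Holder(new Integer[] {1, 2, 3})));
        System.out.println(isArchivable(new Holder(new Thread())));
    }
}
```

The identity-based visited set keeps the walk from looping on cyclic graphs, which any real subgraph walk also has to handle.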
References
[0] https://openjdk.java.net/jeps/310
[1] http://cr.openjdk.java.net/~jiangli/Leyden/Java Class Pre-resolution and Pre-initialization (OpenJDK).pdf
[2] http://cr.openjdk.java.net/~jiangli/Leyden/Selectively Pre-initializing and Preserving Java Classes (OpenJDK).pdf