  • JVM source walkthrough: the Java object header

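      Anyone who has studied Java seriously knows that a Java object consists of three parts: the object header, the instance data, and alignment padding. The instance data is simply the object's fields. Alignment padding exists so that memory can be allocated in regular units: with the default settings the size of a Java object is always a multiple of 8 bytes, and since programmers cannot be expected to manage that by hand, the JVM pads any object whose size is not a multiple of 8 bytes up to the next multiple. The object header records the hash code, the GC age, the lock state (a biased lock also records the owning thread), the GC mark state and so on, and it also holds the pointer to the object's class, which makes it the core of the core. For more background, see my earlier introduction to objects: https://www.cnblogs.com/gmt-hao/p/13817564.html. Next we dissect how the object header is implemented inside the JVM. As usual, code first.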
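The padding rule can be written as a one-line rounding step. The sketch below is illustrative only: the class and method names are made up for this example, and it assumes the default alignment of 8 bytes (the value of -XX:ObjectAlignmentInBytes unless changed). It uses a 12-byte header (8-byte mark word plus 4-byte compressed class pointer), which is the case shown in the JOL dumps later in this article.

```java
// Minimal sketch: round an object size up to the JVM's object alignment.
final class ObjectAlignmentSketch {
    // Assumed default; corresponds to -XX:ObjectAlignmentInBytes=8.
    static final int OBJECT_ALIGNMENT = 8;

    // Rounds sizeInBytes up to the next multiple of OBJECT_ALIGNMENT (a power of two).
    static long alignUp(long sizeInBytes) {
        return (sizeInBytes + OBJECT_ALIGNMENT - 1) & -OBJECT_ALIGNMENT;
    }

    public static void main(String[] args) {
        long header = 8 + 4; // mark word + compressed class pointer, no fields
        System.out.println(alignUp(header)); // 16: 12 bytes of header + 4 bytes of padding
    }
}
```

This matches the "Instance size: 16 bytes" and "4 bytes external" loss that JOL reports for the empty Response class below.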

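      Java is an object-oriented language, and the C++ type HotSpot uses to represent an object has an equally telling name: oop (ordinary object pointer). Let's look into oop.hpp: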

    // oopDesc is the top baseclass for objects classes.  The {name}Desc classes describe
    // the format of Java objects so the fields can be accessed from C++.
    // oopDesc is abstract.
    // (see oopHierarchy for complete oop class hierarchy)
    //
    // no virtual functions allowed
    ... (omitted)
    class oopDesc {
      friend class VMStructs;
     private:
      volatile markOop  _mark;
      union _metadata {
        Klass*      _klass;
        narrowKlass _compressed_klass;
      } _metadata;

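    Start with the comment: oopDesc is the top base class of all object classes. As I read the next sentence, the {name}Desc classes describe the layout of Java objects as C++ fields so that the VM can access them from C++. Now look at the fields. _mark is the mark word. _metadata is a union with two members, _klass and _compressed_klass: the former is a full-width pointer, the latter a compressed class pointer. Compressed class pointers are on by default in JDK 1.8 on 64-bit VMs and can be turned off with -XX:-UseCompressedClassPointers (in JDK 1.8 they also depend on -XX:+UseCompressedOops, so disabling that turns them off as well). I won't go into detail here; just remember that either way this is the class pointer and it points to the object's Klass. Now the comment on Klass: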

    // A Klass provides:                                         
    // 1: language level class object (method dictionary etc.)
    // 2: provide vm dispatch behavior for the object
    // Both functions are combined into one C++ class.
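    In other words, a Klass provides both the language-level class object (method dictionary and so on) and the VM's dispatch behavior for the object, and both roles are combined in one C++ class.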

    // One reason for the oop/klass dichotomy in the implementation is
    // that we don't want a C++ vtbl pointer in every object. Thus,
    // normal oops don't have any virtual functions. Instead, they
    // forward all "virtual" functions to their klass, which does have
    // a vtbl and does the C++ dispatch depending on the object's
    // actual type. (See oop.inline.hpp for some of the forwarding code.)
    // ALL FUNCTIONS IMPLEMENTING THIS DISPATCH ARE PREFIXED WITH "oop_"!
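    This passage explains why the klass and the object itself are implemented as two separate things: the authors do not want a C++ vtbl pointer stored in every object, so ordinary oops have no virtual functions. Instead they forward every "virtual" call to their klass, which does have a vtbl and performs the C++ dispatch according to the object's actual type.
    Now it clicks: this is polymorphism. At compile time the object does not know which concrete function will run, and it carries no dispatch table of its own; at run time the VM follows the klass pointer to the actual type and calls the matching implementation.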
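The split can be modeled in a few lines of Java. This is a toy model written for this article, not HotSpot code: ToyKlass, ToyOop and the describe function are hypothetical names. It only shows the shape of the idea, an object that holds a header word, a klass reference and data, and forwards behavior to its klass.

```java
import java.util.function.Function;

// Toy model only: ToyKlass and ToyOop are hypothetical names, not HotSpot types.
final class OopKlassSketch {

    // Plays the role of Klass: one instance per type, owns the behaviour.
    static final class ToyKlass {
        final String name;
        final Function<ToyOop, String> describe; // stands in for an "oop_" function

        ToyKlass(String name, Function<ToyOop, String> describe) {
            this.name = name;
            this.describe = describe;
        }
    }

    // Plays the role of oopDesc: a mark word, a klass reference and data,
    // but no per-object dispatch table.
    static final class ToyOop {
        final long mark = 0b001;
        final ToyKlass klass;
        final int payload;

        ToyOop(ToyKlass klass, int payload) {
            this.klass = klass;
            this.payload = payload;
        }

        // Forward the call to the klass, which picks the behaviour for the actual type.
        String describe() {
            return klass.describe.apply(this);
        }
    }

    static final ToyKlass INSTANCE_KLASS =
            new ToyKlass("instance", o -> "instance with field " + o.payload);
    static final ToyKlass ARRAY_KLASS =
            new ToyKlass("array", o -> "array of length " + o.payload);

    public static void main(String[] args) {
        System.out.println(new ToyOop(INSTANCE_KLASS, 42).describe());
        System.out.println(new ToyOop(ARRAY_KLASS, 3).describe());
    }
}
```

Every ToyOop pays for one klass reference, and the table of behaviors exists once per type rather than once per object, which is the saving the HotSpot comment describes.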
    Next, the Klass subclass that class loading actually creates for an ordinary class, InstanceKlass:
    class InstanceKlass: public Klass {
      friend class VMStructs;
      friend class ClassFileParser;
      friend class CompileReplay;
    
     protected:
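      // Constructor
      InstanceKlass(int vtable_len,             // length of the virtual method table
                    int itable_len,             // length of the interface method table
                    int static_field_size,      // size of the static fields, in words
                    int nonstatic_oop_map_size, // size of the nonstatic oop map blocks, in words
                    ReferenceType rt,           // reference type
                    AccessFlags access_flags,   // access flags of the class (public, private, ...)
                    bool is_anonymous);         // whether the class is anonymous
    ...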
    
     // See "The Java Virtual Machine Specification" section 2.16.2-5 for a detailed description
      // of the class loading & initialization procedure, and the use of the states.
      enum ClassState {
        allocated,                          // allocated (but not yet linked)
        loaded,                             // loaded and inserted in class hierarchy (but not linked yet)
        linked,                             // successfully linked/verified (but not initialized yet)
        being_initialized,                  // currently running class initializer
        fully_initialized,                  // initialized (successfull final state)
        initialization_error                // error happened during initialization
      };
    
     protected:
      // Annotations for this class
      Annotations*    _annotations;
      // Array classes holding elements of this class.
      Klass*          _array_klasses;
      // Constant pool for this class.
      ConstantPool*   _constants;
      // The InnerClasses attribute and EnclosingMethod attribute. The
      // _inner_classes is an array of shorts. If the class has InnerClasses
      // attribute, then the _inner_classes array begins with 4-tuples of shorts
      // [inner_class_info_index, outer_class_info_index,
      // inner_name_index, inner_class_access_flags] for the InnerClasses
      // attribute. If the EnclosingMethod attribute exists, it occupies the
      // last two shorts [class_index, method_index] of the array. If only
      // the InnerClasses attribute exists, the _inner_classes array length is
      // number_of_inner_classes * 4. If the class has both InnerClasses
      // and EnclosingMethod attributes the _inner_classes array length is
      // number_of_inner_classes * 4 + enclosing_method_attribute_size.
      Array<jushort>* _inner_classes;

      // the source debug extension for this klass, NULL if not specified.
      // Specified as UTF-8 string without terminating zero byte in the classfile,
      // it is stored in the instanceklass as a NULL-terminated UTF-8 string
      char*           _source_debug_extension;
      // Array name derived from this class which needs unreferencing
      // if this class is unloaded.
      Symbol*         _array_name;

      // Number of heapOopSize words used by non-static fields in this klass
      // (including inherited fields but after header_size()).
      int             _nonstatic_field_size;
      int             _static_field_size;     // number words used by static fields (oop and non-oop) in this klass
      // Constant pool index to the utf8 entry of the Generic signature,
      // or 0 if none.
      u2              _generic_signature_index;
      // Constant pool index to the utf8 entry for the name of source file
      // containing this klass, 0 if not specified.
      u2              _source_file_name_index;
      u2              _static_oop_field_count; // number of static oop fields in this klass
      u2              _java_fields_count;      // The number of declared Java fields
      int             _nonstatic_oop_map_size; // size in words of nonstatic oop map blocks

      // _is_marked_dependent can be set concurrently, thus cannot be part of the
      // _misc_flags.
      bool            _is_marked_dependent;    // used for marking during flushing and deoptimization

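    The InstanceKlass constructor takes basic information such as the vtable length and the reference type. Further down, the fields add the class's annotations, the constant pool for this class, the inner-class table, and so on.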

      That is enough about klass. Now for today's main event, the mark word. As before, start with the authors' comment:

    The markOop describes the header of an object.

    //
    // Note that the mark is not a real oop but just a word.
    // It is placed in the oop hierarchy for historical reasons.
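    Note that the mark is just one machine word (32 bits, i.e. 4 bytes, on a 32-bit VM; 64 bits, i.e. 8 bytes, on a 64-bit VM) and not a real object. It sits in the oop hierarchy only for historical reasons.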

    //
    // Bit-format of an object header (most significant first, big endian layout below):
    // (the bit layout below is written big-endian: most significant bits first)
    // 32 bits:
    // --------
    // hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
    // JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
    // size:32 ------------------------------------------>| (CMS free block)
    // PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
    //
    // 64 bits:
    // --------
    // unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
    // JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
    // PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
    // size:64 ----------------------------------------------------->| (CMS free block)

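      The very first sentence establishes the mark word as the protagonist of this article: markOop describes the header of an object. So in HotSpot's own terms, this is the real object header. Most articles online use "object header" for the mark word plus the klass pointer, but that is fine; it is only a difference in definition.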
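The 64-bit layout from the comment translates directly into shifts and masks. The sketch below is a reading aid written for this article, not HotSpot code; the class and method names are hypothetical, and the bit positions are derived only from the field widths in the comment (lock:2, biased_lock:1, age:4, unused:1, then hash:31 for a normal object or epoch:2 and JavaThread*:54 for a biased one).

```java
// Minimal sketch: field extraction for the 64-bit mark word layout quoted above.
final class MarkWordFields {
    static int lock(long mark)       { return (int) (mark & 0b11); }            // bits 0-1
    static int biasedLock(long mark) { return (int) ((mark >>> 2) & 0b1); }     // bit 2
    static int age(long mark)        { return (int) ((mark >>> 3) & 0b1111); }  // bits 3-6

    // Normal object: 31-bit hash in bits 8-38.
    static int hash(long mark)       { return (int) ((mark >>> 8) & 0x7FFF_FFFFL); }

    // Biased object: 2-bit epoch in bits 8-9, then the 54 thread bits in bits 10-63.
    static int epoch(long mark)      { return (int) ((mark >>> 8) & 0b11); }
    static long threadBits(long mark) { return mark >>> 10; }

    public static void main(String[] args) {
        // Hand-built example value: hash 0x12345678, age 5, not biased, lock 01.
        long mark = (0x12345678L << 8) | (5L << 3) | 0b001;
        System.out.println(Integer.toHexString(hash(mark))); // 12345678
        System.out.println(age(mark));                        // 5
    }
}
```

The same 8 bits (bits 8 and up) are read as a hash in one state and as epoch plus thread pointer in the other, which is why the two cannot coexist in one mark word.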

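      Now the bit layout. Focusing on the 64-bit VM, the comment above gives four cases (the original diagrams are not available here, so the layouts are restated from that comment):

      1. Unlocked, with the hash already computed: unused:25 | hash:31 | unused:1 | age:4 | biased_lock:1 (0) | lock:2 (01)

      2. Biased toward a particular thread: JavaThread*:54 | epoch:2 | unused:1 | age:4 | biased_lock:1 (1) | lock:2 (01)

      3. CMS promoted object: PromotedObject*:61 | promo_bits:3

      4. CMS free block: size:64. This describes reclaimed memory rather than a live object, so I will not discuss it.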

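      This raises a question. In the second case, the biased layout leaves no room for the hash. Does that mean the hash can no longer be obtained once an object is biased? Of course not. To analyze this, start with some code: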

    // Response.java
    public class Response {
    }

    // TestHeader.java (needs Lombok, SLF4J and JOL on the classpath)
    import lombok.SneakyThrows;
    import lombok.extern.slf4j.Slf4j;
    import org.openjdk.jol.info.ClassLayout;

    @Slf4j
    public class TestHeader {

        static Response response = new Response();

        public static void aaa(Response response) throws InterruptedException {
            // Header before taking the lock
            log.info(Thread.currentThread().getName() + "out" + ClassLayout.parseInstance(response).toPrintable());

            synchronized (response) {
                // Header while holding the lock
                log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
                Thread.sleep(5000);
                log.info(Thread.currentThread().getName());
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Thread t1 = new Thread("t1") {
                @SneakyThrows
                @Override
                public void run() {
                    // Arrive 2 seconds late, while t2 still holds the lock
                    Thread.sleep(2000);
                    aaa(response);
                }
            };
            Thread t2 = new Thread("t2") {
                @SneakyThrows
                @Override
                public void run() {
                    aaa(response);
                }
            };
            t1.start();
            t2.start();
            t1.join();
            t2.join();
        }
    }

    Response is an empty class, and no hash has been computed for the object. Here is the output:

    16:03:40.326 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:03:40.330 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 b0 59 1f (00000101 10110000 01011001 00011111) (525971461)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:03:42.368 [t1] INFO com.example.demo.TestHeader - t1outcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 b0 59 1f (00000101 10110000 01011001 00011111) (525971461)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:03:45.331 [t2] INFO com.example.demo.TestHeader - t2
    16:03:45.331 [t1] INFO com.example.demo.TestHeader - t1com.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           ba 16 ee 1c (10111010 00010110 11101110 00011100) (485365434)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)  // klass pointer
         12     4        (loss due to the next object alignment)    // alignment padding

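    From the header layout above we know that the lock flag is the last two bits, and the third bit from the end is the biased_lock flag.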

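      A word on the other fields. age records the GC age; with only 4 bits it tops out at 15, which is why the maximum GC age is 15. biased_lock is the biased-locking flag: 0 means not biased, 1 means biased or biasable. lock is the lock state: 01 means unlocked or, together with biased_lock = 1, biased; 00 is a lightweight lock; 10 is a heavyweight lock; and 11 means the object is marked by the GC (for a CMS promoted object the low three bits are the promo_bits).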
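The low three bits are all that is needed to name the state. The sketch below is a small helper written for this article (the class name, method name and state labels are made up); it is checked against the first header bytes that appear in the dumps in this article: 05, 01, f0 and ba.

```java
// Minimal sketch: name the lock state encoded in the low three bits of a mark word.
final class LockStateSketch {
    static String lockState(long mark) {
        int lock = (int) (mark & 0b11);               // last two bits
        boolean biased = ((mark >>> 2) & 0b1) == 1;   // third bit from the end
        switch (lock) {
            case 0b01: return biased ? "biased/biasable" : "unlocked";
            case 0b00: return "lightweight";
            case 0b10: return "heavyweight";
            default:   return "gc-marked";            // 0b11
        }
    }

    public static void main(String[] args) {
        System.out.println(lockState(0x05)); // biased/biasable
        System.out.println(lockState(0x01)); // unlocked
        System.out.println(lockState(0xf0)); // lightweight
        System.out.println(lockState(0xba)); // heavyweight
    }
}
```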

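     One more thing makes the dump look different from the layout in the comment. JOL prints the header bytes in memory order, and on a little-endian machine such as x86 the least significant byte is stored first. Apart from the klass pointer and the alignment padding, the dump is the mark word: 64 bits, 8 bytes, shown in reverse byte order relative to the big-endian layout in the comment, so the first byte printed is the last byte of that layout. To read the lock flag, look at the last three bits of the first byte printed at offset 0.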
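To make the byte order concrete, the sketch below reassembles the mark word from the eight bytes JOL printed for the biased header in the first run (05 b0 59 1f 00 00 00 00). It is an illustration written for this article with made-up names, and it assumes the dump came from a little-endian machine.

```java
// Minimal sketch: rebuild the 64-bit mark word from header bytes in memory order.
final class HeaderBytesSketch {
    // bytesInMemoryOrder must hold the 8 header bytes as printed at offsets 0..7.
    static long markWord(int... bytesInMemoryOrder) {
        long mark = 0;
        // On a little-endian machine the last byte in memory is the most significant.
        for (int i = 7; i >= 0; i--) {
            mark = (mark << 8) | (bytesInMemoryOrder[i] & 0xFF);
        }
        return mark;
    }

    public static void main(String[] args) {
        long mark = markWord(0x05, 0xb0, 0x59, 0x1f, 0x00, 0x00, 0x00, 0x00);
        System.out.println(Long.toHexString(mark));          // 1f59b005
        System.out.println(Long.toBinaryString(mark & 0b111)); // 101
    }
}
```

The low three bits come out as 101, the biased pattern, and they sit in the byte that JOL prints first.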

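      Now the code in detail. There are two threads, t1 and t2. t1 waits 2 seconds after starting, so t2 runs first, takes the lock, and then sleeps for 5 seconds while holding it. t1 arrives after 2 seconds and has to compete for the lock. The output shows that when t2 first took the lock the header recorded the owning thread (biased, 101), and that once t1 contended for it the lock went straight from biased to heavyweight (10).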

      Now remove the 5-second sleep and run it again:

    16:45:17.873 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:45:17.876 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 48 27 1f (00000101 01001000 00100111 00011111) (522668037)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:45:17.876 [t2] INFO com.example.demo.TestHeader - t2
    16:45:19.843 [t1] INFO com.example.demo.TestHeader - t1outcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 48 27 1f (00000101 01001000 00100111 00011111) (522668037)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:45:19.844 [t1] INFO com.example.demo.TestHeader - t1com.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           f0 f3 ac 1f (11110000 11110011 10101100 00011111) (531428336)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)

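    The first three dumps look the same as before. Because t2 no longer sleeps, it releases the lock right after taking it. When t1 arrives 2 seconds later and takes the lock, the bias is revoked and the lock becomes a lightweight lock (00).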

    Now the hashCode case mentioned earlier:

    @Slf4j
    public class TestHeader {

        static Response response = new Response();

        public static void aaa(Response response) throws InterruptedException {
            // Header before the hash is computed
            log.info(Thread.currentThread().getName() + "out" + ClassLayout.parseInstance(response).toPrintable());
            response.hashCode();
            // Header after the hash is computed
            log.info(Thread.currentThread().getName() + "hash" + ClassLayout.parseInstance(response).toPrintable());
            synchronized (response) {
                // Header while holding the lock
                log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
    //            Thread.sleep(5000);
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Thread t2 = new Thread("t2") {
                @SneakyThrows
                @Override
                public void run() {
                    aaa(response);
                }
            };
            t2.start();
            t2.join();
        }
    }

    Only one thread is started here, and the header is printed before the hash is computed, after it is computed, and after locking:

    16:50:19.440 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:50:19.443 [t2] INFO com.example.demo.TestHeader - t2hashcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           01 63 bb 3f (00000001 01100011 10111011 00111111) (1069245185)
          4     4        (object header)                           50 00 00 00 (01010000 00000000 00000000 00000000) (80)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:50:19.444 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           10 ee 1e 1f (00010000 11101110 00011110 00011111) (522120720)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
         12     4        (loss due to the next object alignment)

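    The first dump is the usual anonymously biasable state (101 with no thread recorded). After hashCode() the object is no longer biasable (001) and the hash is stored in the mark word. Locking it afterwards does not bias it; it goes straight to a lightweight lock (00), and the mark word then holds a pointer to the lock record on the owning thread's stack rather than a thread id. Next, what happens if hashCode() is called after the object is already biased toward a thread: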
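The stored hash can be read back out of the second dump by hand. The sketch below parses the two hex groups JOL printed for the mark word after hashCode() ("01 63 bb 3f" at offset 0 and "50 00 00 00" at offset 4) and extracts the 31 hash bits. The class and method names are made up for this example, and it assumes a little-endian dump; since Response does not override hashCode(), the result should correspond to the value response.hashCode() returned in that run, although the run did not print it.

```java
// Minimal sketch: recover the hash from the hex bytes JOL printed for the mark word.
final class HashFromDumpSketch {
    // low4 and high4 are the hex groups printed at offsets 0 and 4, in memory order.
    static long parseMark(String low4, String high4) {
        String[] bytes = (low4 + " " + high4).trim().split("\\s+");
        long mark = 0;
        for (int i = bytes.length - 1; i >= 0; i--) {
            mark = (mark << 8) | Integer.parseInt(bytes[i], 16);
        }
        return mark;
    }

    // Normal (unlocked) layout: the hash occupies the 31 bits above the low 8 bits.
    static int hashOf(long mark) {
        return (int) ((mark >>> 8) & 0x7FFF_FFFFL);
    }

    public static void main(String[] args) {
        long mark = parseMark("01 63 bb 3f", "50 00 00 00");
        System.out.println(Long.toHexString(mark));            // 503fbb6301
        System.out.println(Integer.toHexString(hashOf(mark))); // 503fbb63
    }
}
```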

    @Slf4j
    public class TestHeader {
    
        static Response response = new Response();
        public static void aaa(Response response) throws InterruptedException {
            log.info(Thread.currentThread().getName() + "out" +ClassLayout.parseInstance(response).toPrintable());
    
            synchronized (response){
                log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
                response.hashCode();
                log.info(Thread.currentThread().getName() + "hash" +ClassLayout.parseInstance(response).toPrintable());
    //            sleep(5000);
            }
        }
    
        public static void main(String[] args) throws InterruptedException {
            Thread t2 = new Thread("t2"){
                @SneakyThrows
                @Override
                public void run(){
                    aaa(response);
                }
            };
            t2.start();
            t2.join();
        }
    }

    Output:

    16:59:12.601 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:59:12.604 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           05 68 40 1f (00000101 01101000 01000000 00011111) (524314629)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
         12     4        (loss due to the next object alignment)
    Instance size: 16 bytes
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
    
    16:59:12.604 [t2] INFO com.example.demo.TestHeader - t2hashcom.example.demo.Response object internals:
     OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
          0     4        (object header)                           2a 12 d4 1c (00101010 00010010 11010100 00011100) (483660330)
          4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
          8     4        (object header)                           9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
         12     4        (loss due to the next object alignment)

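    Here the biased lock is upgraded straight to a heavyweight lock (10). A biased mark word has no room for the hash, so the JVM revokes the bias and inflates the lock, and the hash is then kept with the monitor instead of in the header.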

    Summary:

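      To me the object header is a static concept by itself; what matters is how GC, locking, and possibly other mechanisms in the future make use of it. Both the JDK sources and the JVM often split a single int or another multi-byte field into parts. The read-write lock in the JDK, for example, uses the high and low halves of one int for its two counts, and here a single word expresses all of these states. I had not planned to write this article, but when I started on a source analysis of synchronized I got stuck after a short section and found there was no way around the object header, which says something about how important it is.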
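The read-write lock packing mentioned above looks like this in outline. The sketch mirrors the arithmetic used inside java.util.concurrent.locks.ReentrantReadWriteLock (shared hold count in the high 16 bits of the state, exclusive hold count in the low 16 bits), but it is a standalone illustration with its own class name, not the JDK class.

```java
// Minimal sketch: two 16-bit counters packed into one int, as in the JDK read-write lock.
final class PackedStateSketch {
    static final int SHARED_SHIFT = 16;
    static final int SHARED_UNIT = 1 << SHARED_SHIFT;          // adds one reader
    static final int EXCLUSIVE_MASK = (1 << SHARED_SHIFT) - 1; // low 16 bits

    static int sharedCount(int state)    { return state >>> SHARED_SHIFT; }
    static int exclusiveCount(int state) { return state & EXCLUSIVE_MASK; }

    public static void main(String[] args) {
        int state = 0;
        state += SHARED_UNIT; // first read hold
        state += SHARED_UNIT; // second read hold
        System.out.println(sharedCount(state));    // 2
        System.out.println(exclusiveCount(state)); // 0
    }
}
```

The mark word applies the same idea to one machine word: several small fields share the storage, and shifts and masks pull them apart.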

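      As for lock upgrades, the examples show that an object starts out anonymously biasable (this assumes the biased-locking startup delay is disabled, which you can do with -XX:BiasedLockingStartupDelay=0). When a thread locks it, the object becomes biased toward that thread. When several threads lock it in turn (one finishes before the next starts, so two threads never want the lock at the same time), it is upgraded to a lightweight lock. When several threads contend (a thread asks for the lock while another still holds it), it is upgraded to a heavyweight lock. As for the hash: an anonymously biasable object that has its hash computed and is then locked gets a lightweight lock, while computing the hash after the object is already biased and locked upgrades it straight to a heavyweight lock.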

  • Original article: https://www.cnblogs.com/gmt-hao/p/14151951.html