  • Java Performance Notes: Array Copying with System.arraycopy

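    When I was young and willful, I copied arrays the willful way too: write a for loop and shuffle the elements across one by one. Later I grew up and discovered the benefits of System.arraycopy.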

    To compare the two, I wrote a simple program that copies an int[100000] both ways and uses nanoTime to measure the elapsed time of each:

    The program:

            int[] a = new int[100000];
            for(int i=0;i<a.length;i++){
                a[i] = i;
            }
            
            int[] b = new int[100000];
            
            int[] c = new int[100000];
            for(int i=0;i<c.length;i++){
                c[i] = i;
            }
            
            int[] d = new int[100000];
            
            for(int k=0;k<10;k++){
                long start1 = System.nanoTime();
                for(int i=0;i<a.length;i++){
                    b[i] = a[i];
                }
                long end1 = System.nanoTime();
                System.out.println("end1 - start1 = "+(end1-start1));
                
                
                long start2 = System.nanoTime();
                System.arraycopy(c, 0, d, 0, 100000);
                long end2 = System.nanoTime();
                System.out.println("end2 - start2 = "+(end2-start2));
                
                System.out.println();
            }
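
    The snippet above is a fragment. A self-contained version might look like the sketch below. The class name ArrayCopyTiming and its helper methods are illustrative names, not part of the original program, and the final line simply checks that both approaches produce the same contents. No timings are shown here because they depend on the JVM and the machine.

```java
import java.util.Arrays;

// Hypothetical wrapper class; the name and helpers are for illustration only.
final class ArrayCopyTiming {
    static final int SIZE = 100000;

    // Copies src into dst element by element and returns the elapsed nanoseconds.
    static long timeLoopCopy(int[] src, int[] dst) {
        long start = System.nanoTime();
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return System.nanoTime() - start;
    }

    // Copies src into dst with System.arraycopy and returns the elapsed nanoseconds.
    static long timeArraycopy(int[] src, int[] dst) {
        long start = System.nanoTime();
        System.arraycopy(src, 0, dst, 0, src.length);
        return System.nanoTime() - start;
    }

    // Builds an array holding 0, 1, ..., n - 1.
    static int[] filled(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    public static void main(String[] args) {
        // Allocate everything before timing, as in the original program.
        int[] a = filled(SIZE), b = new int[SIZE];
        int[] c = filled(SIZE), d = new int[SIZE];
        for (int k = 0; k < 10; k++) {
            System.out.println("loop      = " + timeLoopCopy(a, b));
            System.out.println("arraycopy = " + timeArraycopy(c, d));
            System.out.println();
        }
        System.out.println("same contents: " + Arrays.equals(b, d));
    }
}
```

    Early iterations may still run in the interpreter before the JIT compiles the loop, so for anything beyond a rough comparison a benchmark harness such as JMH is usually a safer choice.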

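    To reduce interference from memory allocation and from one-off fluctuations, I allocated all the arrays up front and then ran the measurement 10 times in a loop. The results (in nanoseconds) were: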

    end1 - start1 = 366806
    end2 - start2 = 109154
    
    end1 - start1 = 380529
    end2 - start2 = 79849
    
    end1 - start1 = 421422
    end2 - start2 = 68769
    
    end1 - start1 = 344463
    end2 - start2 = 72020
    
    end1 - start1 = 333174
    end2 - start2 = 77277
    
    end1 - start1 = 377335
    end2 - start2 = 82285
    
    end1 - start1 = 370608
    end2 - start2 = 66937
    
    end1 - start1 = 349067
    end2 - start2 = 86532
    
    end1 - start1 = 389974
    end2 - start2 = 83362
    
    end1 - start1 = 347937
    end2 - start2 = 63638

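    As you can see, System.arraycopy performs well: in these runs it is roughly 3 to 6 times faster than the hand-written loop. To see how it is handled underneath, I browsed through some of the OpenJDK source: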

    System.arraycopy is a native method, so we need to look at the native-level code:

        public static native void arraycopy(Object src,  int  srcPos,
                                            Object dest, int destPos,
                                            int length);
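
    As a quick reminder of what the five parameters mean, here is a small sketch. The class name ArraycopyParams and the method demo are made up for illustration. The call copies length elements starting at src[srcPos] into dest starting at dest[destPos].

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
final class ArraycopyParams {
    // Copies three elements, src[1..3], into dest[2..4].
    static int[] demo() {
        int[] src = {1, 2, 3, 4, 5};
        int[] dest = new int[5];
        System.arraycopy(src, 1, dest, 2, 3);
        return dest;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [0, 0, 2, 3, 4]
    }
}
```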

    The corresponding file is openjdk6-src/hotspot/src/share/vm/prims/jvm.cpp, which contains the JVM_ArrayCopy entry point:

    JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                                   jobject dst, jint dst_pos, jint length))
      JVMWrapper("JVM_ArrayCopy");
      // Check if we have null pointers
      if (src == NULL || dst == NULL) {
        THROW(vmSymbols::java_lang_NullPointerException());
      }
      arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
      arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
      assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
      assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
      // Do copy
      Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
    JVM_END
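
    Seen from the Java side, the null check at the top of this function surfaces as a NullPointerException. The sketch below only observes that behaviour; it does not prove which internal path a given JVM takes. The class and method names are hypothetical.

```java
// Hypothetical demo class, not part of the JDK.
final class ArraycopyNullCheck {
    // Returns true if a null source array is rejected with a NullPointerException.
    static boolean nullSourceThrows() {
        try {
            System.arraycopy(null, 0, new int[1], 0, 1);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Returns true if a null destination array is rejected with a NullPointerException.
    static boolean nullDestinationThrows() {
        try {
            System.arraycopy(new int[1], 0, null, 0, 1);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullSourceThrows());      // prints true
        System.out.println(nullDestinationThrows()); // prints true
    }
}
```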

    The statements at the top are all checks; the real copy does not happen until the final call, copy_array(s, src_pos, d, dst_pos, length, thread). Following that call leads to openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp:

    void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
      assert(s->is_typeArray(), "must be type array");
    
      // Check destination
      if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
        THROW(vmSymbols::java_lang_ArrayStoreException());
      }
    
      // Check is all offsets and lengths are non negative
      if (src_pos < 0 || dst_pos < 0 || length < 0) {
        THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
      }
      // Check if the ranges are valid
      if  ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
         || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
        THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
      }
      // Check zero copy
      if (length == 0)
        return;
    
      // This is an attempt to make the copy_array fast.
      int l2es = log2_element_size();
      int ihs = array_header_in_bytes() / wordSize;
      char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
      char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
      Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es); // the copy itself is delegated from here
    }

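    In typeArrayKlass::copy_array, too, everything before the last statement is a series of checks; only the final statement performs the actual copy.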
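
    The checks in copy_array map onto exceptions that can be observed from Java. The following sketch exercises them one at a time; ArraycopyChecks and outcome are illustrative names. It catches IndexOutOfBoundsException, the supertype of the ArrayIndexOutOfBoundsException thrown in the C++ code.

```java
// Hypothetical demo class, not part of the JDK.
final class ArraycopyChecks {
    // Runs the copy and reports which exception, if any, it raised.
    static String outcome(Object src, int srcPos, Object dst, int dstPos, int length) {
        try {
            System.arraycopy(src, srcPos, dst, dstPos, length);
            return "ok";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        } catch (IndexOutOfBoundsException e) {
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        int[] ints = new int[4];
        // Destination element type differs from the source element type.
        System.out.println(outcome(ints, 0, new long[4], 0, 4)); // prints ArrayStoreException
        // Negative source position.
        System.out.println(outcome(ints, -1, new int[4], 0, 1)); // prints IndexOutOfBoundsException
        // srcPos + length (2 + 3) exceeds the source length (4).
        System.out.println(outcome(ints, 2, new int[4], 0, 3));  // prints IndexOutOfBoundsException
        // Zero-length copy with both positions at the end of the arrays.
        System.out.println(outcome(ints, 4, new int[4], 4, 0));  // prints ok
    }
}
```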

    The function it calls, Copy::conjoint_memory_atomic, is in openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp:

    // Copy bytes; larger units are filled atomically if everything is aligned.
    void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
      address src = (address) from;
      address dst = (address) to;
      uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;
    
      // (Note:  We could improve performance by ignoring the low bits of size,
      // and putting a short cleanup loop after each bulk copy loop.
      // There are plenty of other ways to make this faster also,
      // and it's a slippery slope.  For now, let's keep this code simple
      // since the simplicity helps clarify the atomicity semantics of
      // this operation.  There are also CPU-specific assembly versions
      // which may or may not want to include such optimizations.)
    
      if (bits % sizeof(jlong) == 0) {
        Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
      } else if (bits % sizeof(jint) == 0) {
        Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
      } else if (bits % sizeof(jshort) == 0) {
        Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
      } else {
        // Not aligned, so no need to be atomic.
        Copy::conjoint_jbytes((void*) src, (void*) dst, size);
      }
    }
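
    The dispatch relies on a small trick: OR-ing the two addresses and the byte count together yields a value whose low bits are zero only if all three are aligned to the unit in question. A minimal Java sketch of that selection logic follows. CopyUnitChooser and unitSize are hypothetical names, and the addresses are plain made-up numbers rather than real pointers.

```java
// Hypothetical demo class that mirrors only the unit selection, not the copy.
final class CopyUnitChooser {
    // Returns the largest unit (8, 4, 2 or 1 bytes) to which the source address,
    // the destination address and the byte count are all aligned.
    static int unitSize(long src, long dst, long sizeInBytes) {
        long bits = src | dst | sizeInBytes;
        if (bits % 8 == 0) return 8; // would use conjoint_jlongs_atomic
        if (bits % 4 == 0) return 4; // would use conjoint_jints_atomic
        if (bits % 2 == 0) return 2; // would use conjoint_jshorts_atomic
        return 1;                    // would use conjoint_jbytes
    }

    public static void main(String[] args) {
        System.out.println(unitSize(0x1000, 0x2000, 400000)); // prints 8
        System.out.println(unitSize(0x1004, 0x2000, 400000)); // prints 4
        System.out.println(unitSize(0x1002, 0x2000, 6));      // prints 2
        System.out.println(unitSize(0x1001, 0x2000, 8));      // prints 1
    }
}
```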

    The code above chooses which copy function to use based on alignment. Taking conjoint_jints_atomic as the example, we look further in openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp:

    // jints,                 conjoint, atomic on each jint
      static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
        assert_params_ok(from, to, LogBytesPerInt);
        pd_conjoint_jints_atomic(from, to, count);
      }

    Going one level further down, in openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp:

    static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      _Copy_conjoint_jints_atomic(from, to, count);
    }

    And one level further still, in openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp:

    void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      if (from > to) {
        jint *end = from + count;
        while (from < end)
          *(to++) = *(from++);
      }
      else if (from < to) {
        jint *end = from;
        from += count - 1;
        to   += count - 1;
        while (from >= end)
          *(to--) = *(from--);
      }
    }
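
    The two branches exist because the regions may overlap ("conjoint"): copying forward is safe when the source lies after the destination, and copying backward is safe when it lies before. The sketch below is an index-based Java analogue, with made-up names, that compares the direction-aware loop against a naive forward loop and against System.arraycopy on an overlapping range.

```java
import java.util.Arrays;

// Hypothetical demo class; an index-based analogue of the pointer loops.
final class ConjointCopySketch {
    // Copies forward when the source starts after the destination,
    // backward when it starts before it, and does nothing when they coincide.
    static void conjointCopy(int[] a, int from, int to, int count) {
        if (from > to) {
            for (int i = 0; i < count; i++) a[to + i] = a[from + i];
        } else if (from < to) {
            for (int i = count - 1; i >= 0; i--) a[to + i] = a[from + i];
        }
    }

    // Always copies forward, which overwrites source elements before they
    // are read when the destination overlaps the source from behind.
    static void naiveForwardCopy(int[] a, int from, int to, int count) {
        for (int i = 0; i < count; i++) a[to + i] = a[from + i];
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3, 4, 5};
        conjointCopy(x, 0, 1, 4);
        System.out.println(Arrays.toString(x)); // prints [1, 1, 2, 3, 4]

        int[] y = {1, 2, 3, 4, 5};
        naiveForwardCopy(y, 0, 1, 4);
        System.out.println(Arrays.toString(y)); // prints [1, 1, 1, 1, 1]

        int[] z = {1, 2, 3, 4, 5};
        System.arraycopy(z, 0, z, 1, 4);
        System.out.println(Arrays.toString(z)); // prints [1, 1, 2, 3, 4]
    }
}
```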

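    As you can see, at this level it is simply a loop that assigns one block of memory to another. That avoids the per-element overhead of going back and forth through array references in Java code, so it is naturally faster. Note that this file belongs to the Zero port; as the comment in copy.cpp says, there are also CPU-specific assembly versions.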

  • Original article: https://www.cnblogs.com/yakovchang/p/java_system_arraycopy.html