zoukankan      html  css  js  c++  java
  • Java面试-List中的sort详细解读

    最近看了一些排序相关的文章,因此比较好奇,Java中的排序是如何做的。本片文章介绍的是JDK1.8,List中的sort方法。

    先来看看List中的sort是怎么写的:

        @SuppressWarnings({"unchecked", "rawtypes"})
        default void sort(Comparator<? super E> c) {
            Object[] a = this.toArray();
            Arrays.sort(a, (Comparator) c);
            ListIterator<E> i = this.listIterator();
            for (Object e : a) {
                i.next();
                i.set((E) e);
            }
        }
    

    首先,你需要传入一个比较器作为参数,这个好理解,毕竟你肯定要定一个比较标准。然后就是将list转换成一个数组,再对这个数组进行排序,排序完之后,再利用iterator重新改变list。

    接着,我们再来看看Arrays.sort:

        public static <T> void sort(T[] a, Comparator<? super T> c) {
            if (c == null) {
                sort(a);
            } else {
                if (LegacyMergeSort.userRequested)
                    legacyMergeSort(a, c);
                else
                    TimSort.sort(a, 0, a.length, c, null, 0, 0);
            }
        }
    
        public static void sort(Object[] a) {
            if (LegacyMergeSort.userRequested)
                legacyMergeSort(a);
            else
                ComparableTimSort.sort(a, 0, a.length, null, 0, 0);
        }
    
        static final class LegacyMergeSort {
            private static final boolean userRequested =
                java.security.AccessController.doPrivileged(
                    new sun.security.action.GetBooleanAction(
                        "java.util.Arrays.useLegacyMergeSort")).booleanValue();
        }
    

    这样可以看出,其实排序的核心就是TimSort,LegacyMergeSort大致意思是表明如果版本很旧的话,就用这个,新版本是不会采用这种排序方式的。

    我们再来看看TimSort的实现:

        private static final int MIN_MERGE = 32;
        static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
                             T[] work, int workBase, int workLen) {
            assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;
    
            int nRemaining  = hi - lo;
            if (nRemaining < 2)
                return;  // Arrays of size 0 and 1 are always sorted
    
            // If array is small, do a "mini-TimSort" with no merges
            if (nRemaining < MIN_MERGE) {
                // 获得最长的递增序列
                int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
                binarySort(a, lo, hi, lo + initRunLen, c);
                return;
            }
    
            /**
             * March over the array once, left to right, finding natural runs,
             * extending short natural runs to minRun elements, and merging runs
             * to maintain stack invariant.
             */
            TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
            int minRun = minRunLength(nRemaining);
            do {
                // Identify next run
                int runLen = countRunAndMakeAscending(a, lo, hi, c);
    
                // If run is short, extend to min(minRun, nRemaining)
                if (runLen < minRun) {
                    int force = nRemaining <= minRun ? nRemaining : minRun;
                    binarySort(a, lo, lo + force, lo + runLen, c);
                    runLen = force;
                }
    
                // Push run onto pending-run stack, and maybe merge
                ts.pushRun(lo, runLen);
                ts.mergeCollapse();
    
                // Advance to find next run
                lo += runLen;
                nRemaining -= runLen;
            } while (nRemaining != 0);
    
            // Merge all remaining runs to complete sort
            assert lo == hi;
            ts.mergeForceCollapse();
            assert ts.stackSize == 1;
        }
    

    如果小于2个,代表不再不需要排序;如果小于32个,则采用优化的二分排序。怎么优化的呢?首先获得最长的递增序列:

        private static <T> int countRunAndMakeAscending(T[] a, int lo, int hi,
                                                        Comparator<? super T> c) {
            assert lo < hi;
            int runHi = lo + 1;
            if (runHi == hi)
                return 1;
    
            // Find end of run, and reverse range if descending
            if (c.compare(a[runHi++], a[lo]) < 0) { // Descending
                // 一开始是递减序列,就找出最长递减序列的最后一个下标
                while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)
                    runHi++;
                // 逆转前面的递减序列
                reverseRange(a, lo, runHi);
            } else {                              // Ascending
                while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)
                    runHi++;
            }
    
            return runHi - lo;
        }
    

    接着进行二分排序:

        private static <T> void binarySort(T[] a, int lo, int hi, int start,
                                           Comparator<? super T> c) {
            assert lo <= start && start <= hi;
            if (start == lo)
                start++;
            for ( ; start < hi; start++) {
                T pivot = a[start];
    
                // Set left (and right) to the index where a[start] (pivot) belongs
                int left = lo;
                int right = start;
                assert left <= right;
                /*
                 * Invariants:
                 *   pivot >= all in [lo, left).
                 *   pivot <  all in [right, start).
                 */
                // start位置是递增序列后的第一个数的位置
                // 从前面的递增序列中找出start位置的数应该处于的位置
                while (left < right) {
                    // >>> 无符号右移
                    int mid = (left + right) >>> 1;
                    if (c.compare(pivot, a[mid]) < 0)
                        right = mid;
                    else
                        left = mid + 1;
                }
                assert left == right;
    
                /*
                 * The invariants still hold: pivot >= all in [lo, left) and
                 * pivot < all in [left, start), so pivot belongs at left.  Note
                 * that if there are elements equal to pivot, left points to the
                 * first slot after them -- that's why this sort is stable.
                 * Slide elements over to make room for pivot.
                 */
                int n = start - left;  // The number of elements to move
                // Switch is just an optimization for arraycopy in default case
                // 比pivot大的数往后移动一位
                switch (n) {
                    case 2:  a[left + 2] = a[left + 1];
                    case 1:  a[left + 1] = a[left];
                             break;
                    default: System.arraycopy(a, left, a, left + 1, n);
                }
                a[left] = pivot;
            }
        }
    

    好了,待排序数量小于32个的讲完了,现在来说说大于等于32个情况。首先,获得一个叫minRun的东西,这是个啥含义呢:

        int minRun = minRunLength(nRemaining);
        private static int minRunLength(int n) {
            assert n >= 0;
            int r = 0;      // Becomes 1 if any 1 bits are shifted off
            while (n >= MIN_MERGE) {
                // 这里我没搞懂的是为什么不直接将(n & 1)赋值给r,而要做一次逻辑或。
                r |= (n & 1);
                n >>= 1;
            }
            return n + r;
        }
    

    各种位运算符,MIN_MERGE默认为32,如果n小于此值,那么返回n本身。否则会将n不断地右移,直到小于MIN_MERGE,同时记录一个r值,r代表最后一次移位n时,n最低位是0还是1。
    其实看注释比较容易理解:

    Returns the minimum acceptable run length for an array of the specified length. Natural runs shorter than this will be extended with binarySort.
    Roughly speaking, the computation is: If n < MIN_MERGE, return n (it's too small to bother with fancy stuff).
    Else if n is an exact power of 2, return MIN_MERGE/2.
    Else return an int k, MIN_MERGE/2 <= k <= MIN_MERGE, such that n/k is close to, but strictly less than, an exact power of 2. For the rationale, see listsort.txt.
    

    返回结果其实就是用于接下来的合并排序中。

    接下来就是一个while循环

            do {
                // Identify next run
                // 获得一个最长递增序列
                int runLen = countRunAndMakeAscending(a, lo, hi, c);
    
                // If run is short, extend to min(minRun, nRemaining)
                // 如果最长递增序列
                if (runLen < minRun) {
                    int force = nRemaining <= minRun ? nRemaining : minRun;
                    binarySort(a, lo, lo + force, lo + runLen, c);
                    runLen = force;
                }
    
                // Push run onto pending-run stack, and maybe merge
                // lo——runLen为将要被归并的范围
                ts.pushRun(lo, runLen);
                // 归并
                ts.mergeCollapse();
    
                // Advance to find next run
                lo += runLen;
                nRemaining -= runLen;
            } while (nRemaining != 0);
    

    这样,假设你的每次归并排序的两个序列为r1和r2,r1肯定是有序的,r2也已经被排成递增序列了,因此这样的归并排序就比较特殊了。

    为什么要用归并排序呢,因为归并排序的时间复杂度永远为O(nlogn),空间复杂度为O(n),以空间换取时间。

    好了,以上就是针对Java中的排序做的一次总结,但具体的归并代码还没有分析,其实我自己也没有完全研究透,为什么minRun的取值是这样的,这也和TimSort中的stackLen有关,有兴趣的小伙伴可以在下方留言,我们可以一起探讨。
    有兴趣的话可以关注我的公众号,说不定会有意外的惊喜。

    在这里插入图片描述

  • 相关阅读:
    【C#学习笔记】 IDisposable 接口
    【C#学习笔记】 List.AddRange 方法
    Request a certificate from a certificate vendor
    How to install your SSL Certificate to your Windows Server
    How to generate a CSR in Microsoft IIS 7
    我爱你 肖厦
    OAuth2.0协议之新浪微博接口演示
    重写mouseEvent 事件 怎么实现自定义的无边框窗口移动
    原文地址:Qt数据库总结 作者:ImmenseeT
    QT+MySQL图片插入数据库并显示 2013-03-12 13:58:52
  • 原文地址:https://www.cnblogs.com/death00/p/11495434.html
Copyright © 2011-2022 走看看