zoukankan      html  css  js  c++  java
  • .NET 排序 Array.Sort<T> 实现分析

    System.Array.Sort<T> 是.NET内置的排序方法, 灵活且高效, 大家都学过一些排序算法,比如冒泡排序,插入排序,堆排序等,不过你知道这个方法背后使用了什么排序算法吗?

    先说结果, 实际上 Array.Sort 不止使用了一种排序算法, 为了保证不同的数据量的排序场景,都能有一个高性能的表现,实现中包括了插入排序,堆排序和快速排序, 接下来从通过源码看看它都做了哪些事情。

    Array.Sort

    https://source.dot.net/#System.Private.CoreLib/Array.cs,ec5718fae85b7640

    public static void Sort<T>(T[] array)
    {
        if (array == null)
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
    
        if (array.Length > 1)
        {
            var span = new Span<T>(ref MemoryMarshal.GetArrayDataReference(array), array.Length);
            ArraySortHelper<T>.Default.Sort(span, null);
        }
    }
    

    这里我们对 int 数组进行排序, 先看一下这个Sort方法, 当数组的长度大于1时, 会先把数组转成 Span 列表, 然后调用了内部的ArraySortHelper的Default对象的Sort方法。

    ArraySortHelper

    [TypeDependency("System.Collections.Generic.GenericArraySortHelper`1")]
    internal sealed partial class ArraySortHelper<T>
        : IArraySortHelper<T>
    {
        private static readonly IArraySortHelper<T> s_defaultArraySortHelper = CreateArraySortHelper();
    
        public static IArraySortHelper<T> Default => s_defaultArraySortHelper;
    
        [DynamicDependency("#ctor", typeof(GenericArraySortHelper<>))]
        private static IArraySortHelper<T> CreateArraySortHelper()
        {
            IArraySortHelper<T> defaultArraySortHelper;
    
            if (typeof(IComparable<T>).IsAssignableFrom(typeof(T)))
            {
                defaultArraySortHelper = (IArraySortHelper<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(GenericArraySortHelper<string>), (RuntimeType)typeof(T));
            }
            else
            {
                defaultArraySortHelper = new ArraySortHelper<T>();
            }
            return defaultArraySortHelper;
        }
    }
    

    Default 会根据是否实现了 IComparable<T> 接口来创建不同的 ArraySortHelper, 因为上面我对int数组进行排序, 所以调用的是 GenericArraySortHelper 的Sort方法。

    GenericArraySortHelper

    https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,280

    internal sealed partial class GenericArraySortHelper<T>
            where T : IComparable<T>
        {
        // Do not add a constructor to this class because ArraySortHelper<T>.CreateSortHelper will not execute it
    
        #region IArraySortHelper<T> Members
    
        public void Sort(Span<T> keys, IComparer<T>? comparer)
        {
            try
            {
                if (comparer == null || comparer == Comparer<T>.Default)
                {
                    if (keys.Length > 1)
                    {
                        // For floating-point, do a pre-pass to move all NaNs to the beginning
                        // so that we can do an optimized comparison as part of the actual sort
                        // on the remainder of the values.
                        if (typeof(T) == typeof(double) ||
                            typeof(T) == typeof(float) ||
                            typeof(T) == typeof(Half))
                        {
                            int nanLeft = SortUtils.MoveNansToFront(keys, default(Span<byte>));
                            if (nanLeft == keys.Length)
                            {
                                return;
                            }
                            keys = keys.Slice(nanLeft);
                        }
    
                        IntroSort(keys, 2 * (BitOperations.Log2((uint)keys.Length) + 1));
                    }
                }
                else
                {
                    ArraySortHelper<T>.IntrospectiveSort(keys, comparer.Compare);
                }
            }
            catch (IndexOutOfRangeException)
            {
                ThrowHelper.ThrowArgumentException_BadComparer(comparer);
            }
            catch (Exception e)
            {
                ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_IComparerFailed, e);
            }
        }
    
    

    首先会判断排序的类型是否是浮点型, 如果是的会做一些排序的调整优化,然后调用了 IntroSort 方法,并传入了两个参数,第一个Keys就是数组的Span列表,那第二个是什么呢? 它是一个int类型的depthLimit参数,这里简单点理解就是算出数组的深度,因为后边会根据这个值进行递归操作,然后进入到 IntroSort 方法。

    IntroSort

    到这个方法这里就清晰很多了, 这是Array.Sort<T> 排序的主要内容,接着往下看

    https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,404

     private static void IntroSort(Span<T> keys, int depthLimit)
    {
        Debug.Assert(!keys.IsEmpty);
        Debug.Assert(depthLimit >= 0);
    
        int partitionSize = keys.Length;
        while (partitionSize > 1)
        {
            if (partitionSize <= Array.IntrosortSizeThreshold)
            {
                if (partitionSize == 2)
                {
                    SwapIfGreater(ref keys[0], ref keys[1]);
                    return;
                }
    
                if (partitionSize == 3)
                {
                    ref T hiRef = ref keys[2];
                    ref T him1Ref = ref keys[1];
                    ref T loRef = ref keys[0];
    
                    SwapIfGreater(ref loRef, ref him1Ref);
                    SwapIfGreater(ref loRef, ref hiRef);
                    SwapIfGreater(ref him1Ref, ref hiRef);
                    return;
                }
    
                InsertionSort(keys.Slice(0, partitionSize));
                return;
            }
    
            if (depthLimit == 0)
            {
                HeapSort(keys.Slice(0, partitionSize));
                return;
            }
            depthLimit--;
    
            int p = PickPivotAndPartition(keys.Slice(0, partitionSize));
    
            // Note we've already partitioned around the pivot and do not have to move the pivot again.
            IntroSort(keys[(p+1)..partitionSize], depthLimit);
            partitionSize = p;
        }
    }
    
    

    第一次进入方法时,partitionSize 就是数组的长度, 这里有一个判断条件,如下, IntrosortSizeThreshold 是一个值为16的常量,它是一个阈值, 如果数组的长度小于等于16, 那么使用的就是插入排序(InsertionSort), 为什么是16呢?这里通过注释了解到, 从经验上来看, 16及以下得数组长度使用插入排序的效率是比较高的。

    if (partitionSize <= Array.IntrosortSizeThreshold)
    {
        if (partitionSize == 2)
        {
            SwapIfGreater(ref keys[0], ref keys[1]);
            return;
        }
    
        if (partitionSize == 3)
        {
            ref T hiRef = ref keys[2];
            ref T him1Ref = ref keys[1];
            ref T loRef = ref keys[0];
    
            SwapIfGreater(ref loRef, ref him1Ref);
            SwapIfGreater(ref loRef, ref hiRef);
            SwapIfGreater(ref him1Ref, ref hiRef);
            return;
        }
    
        InsertionSort(keys.Slice(0, partitionSize));
        return;
    }
    

    InsertionSort

    如果数组的长度小于等于3时, 直接进行对比交换, 如果长度大约3并且小于等于16的话, 使用插入排序(InsertionSort), 方法内容如下:

    https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,537

    private static void InsertionSort(Span<T> keys)
    {
        for (int i = 0; i < keys.Length - 1; i++)
        {
            T t = Unsafe.Add(ref MemoryMarshal.GetReference(keys), i + 1);
    
            int j = i;
            while (j >= 0 && (t == null || LessThan(ref t, ref Unsafe.Add(ref MemoryMarshal.GetReference(keys), j))))
            {
                Unsafe.Add(ref MemoryMarshal.GetReference(keys), j + 1) = Unsafe.Add(ref MemoryMarshal.GetReference(keys), j);
                j--;
            }
    
            Unsafe.Add(ref MemoryMarshal.GetReference(keys), j + 1) = t!;
        }
    }
    

    HeapSort

    if (depthLimit == 0)
    {
        HeapSort(keys.Slice(0, partitionSize));
        return;
    }
    depthLimit--;
    

    因为后边是递归操作,所以每次 depthLimit 都会减1, 当深度为0排序还没有完成的时候,就会直接使用堆排序(HeapSort),方法内容如下:

    https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,990

    
    private static void HeapSort(Span<TKey> keys, Span<TValue> values)
    {
        Debug.Assert(!keys.IsEmpty);
    
        int n = keys.Length;
        for (int i = n >> 1; i >= 1; i--)
        {
            DownHeap(keys, values, i, n);
        }
    
        for (int i = n; i > 1; i--)
        {
            Swap(keys, values, 0, i - 1);
            DownHeap(keys, values, 1, i - 1);
        }
    }
    
    private static void DownHeap(Span<TKey> keys, Span<TValue> values, int i, int n)
    {
        TKey d = keys[i - 1];
        TValue dValue = values[i - 1];
    
        while (i <= n >> 1)
        {
            int child = 2 * i;
            if (child < n && (keys[child - 1] == null || LessThan(ref keys[child - 1], ref keys[child])))
            {
                child++;
            }
    
            if (keys[child - 1] == null || !LessThan(ref d, ref keys[child - 1]))
                break;
    
            keys[i - 1] = keys[child - 1];
            values[i - 1] = values[child - 1];
            i = child;
        }
    
        keys[i - 1] = d;
        values[i - 1] = dValue;
    }
    

    QuickSort

    int p = PickPivotAndPartition(keys.Slice(0, partitionSize), values.Slice(0, partitionSize));
     
    IntroSort(keys[(p+1)..partitionSize], values[(p+1)..partitionSize], depthLimit);
    partitionSize = p;
    

    这里调用了另外一个方法 PickPivotAndPartition,
    Pivot 基准, Partition 分区, 这就是快速排序呀!而且还是使用了尾递归的快速排序,其中也使用了三数取中法,方法内容如下

    https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,945

    private static int PickPivotAndPartition(Span<TKey> keys, Span<TValue> values)
    {
        Debug.Assert(keys.Length >= Array.IntrosortSizeThreshold);
    
        int hi = keys.Length - 1;
    
        // Compute median-of-three.  But also partition them, since we've done the comparison.
        int middle = hi >> 1;
    
        // Sort lo, mid and hi appropriately, then pick mid as the pivot.
        SwapIfGreaterWithValues(keys, values, 0, middle);  // swap the low with the mid point
        SwapIfGreaterWithValues(keys, values, 0, hi);   // swap the low with the high
        SwapIfGreaterWithValues(keys, values, middle, hi); // swap the middle with the high
    
        TKey pivot = keys[middle];
        Swap(keys, values, middle, hi - 1);
        int left = 0, right = hi - 1;  // We already partitioned lo and hi and put the pivot in hi - 1.  And we pre-increment & decrement below.
    
        while (left < right)
        {
            if (pivot == null)
            {
                while (left < (hi - 1) && keys[++left] == null) ;
                while (right > 0 && keys[--right] != null) ;
            }
            else
            {
                while (GreaterThan(ref pivot, ref keys[++left])) ;
                while (LessThan(ref pivot, ref keys[--right])) ;
            }
    
            if (left >= right)
                break;
    
            Swap(keys, values, left, right);
        }
    
        // Put pivot in the right location.
        if (left != hi - 1)
        {
            Swap(keys, values, left, hi - 1);
        }
        return left;
    }
    

    总结

    本文主要介绍了System.Array.Sort<T> 排序的内部实现, 发现它使用了插入排序,堆排序和快速排序,大家有兴趣可以看一下Java或者Golang的排序实现,希望对您有用。

  • 相关阅读:
    python中a = a+b与a += b的不同
    python中的全局变量global
    python中星号(*)和双星号(**)的用法
    python循环语句
    python逻辑运算符
    python内置函数 print()
    python 解析迅雷下载链接
    python 正则表达式
    python 读写文件
    python selenium操作cookie
  • 原文地址:https://www.cnblogs.com/myshowtime/p/15355161.html
Copyright © 2011-2022 走看看