  • C# Dictionary, SortedDictionary, SortedList

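    Personally, I find Dictionary, SortedDictionary, and SortedList fairly simple to use: spend a little time looking up material online, then read the source code, and everything becomes clear. So why write this article? Take a look at this code: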

    Dictionary<int, object> dict = new Dictionary<int, object>();
    //load data to dict
    int key = 1;
    object obj = null;
    if (dict.ContainsKey(key))
    {
        obj = dict[key];
    }

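    The program builds a Dictionary at startup and then reads from it in many places. A colleague originally wrote the lookup as shown above. He later claimed that the dictionary's ContainsKey lookup was slow and switched to SortedDictionary, a dictionary ordered by key. I normally just use the plain Dictionary with dict.TryGetValue(key, out obj). That disagreement is what prompted this article. First, the conclusions: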
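    To make the difference between the two call patterns concrete, here is a minimal sketch. The class and method names (LookupDemo, GetWithContainsKey, GetWithTryGetValue) are made up for illustration. In the reference source, ContainsKey and the indexer getter each run FindEntry, so the first pattern searches twice on a hit, while TryGetValue searches once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class used only for this illustration.
public static class LookupDemo
{
    // Two searches on a hit: ContainsKey runs FindEntry, then the indexer runs it again.
    public static object GetWithContainsKey(Dictionary<int, object> dict, int key)
    {
        object obj = null;
        if (dict.ContainsKey(key))
        {
            obj = dict[key];
        }
        return obj;
    }

    // One search: TryGetValue locates the entry and copies the value out in the same call.
    public static object GetWithTryGetValue(Dictionary<int, object> dict, int key)
    {
        object obj;
        return dict.TryGetValue(key, out obj) ? obj : null;
    }

    public static void Main()
    {
        var dict = new Dictionary<int, object> { { 1, "one" }, { 2, "two" } };
        Console.WriteLine(GetWithContainsKey(dict, 1));          // one
        Console.WriteLine(GetWithTryGetValue(dict, 1));          // one
        Console.WriteLine(GetWithTryGetValue(dict, 42) == null); // True
    }
}
```

    Both helpers return the same result; the second simply avoids the repeated hash computation and bucket walk.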

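    The Dictionary<TKey,TValue> generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by its key is very fast, close to O(1), because Dictionary<TKey,TValue> is implemented as a hash table. The retrieval speed depends on the quality of the hashing algorithm of the type specified for TKey.
    The SortedDictionary<TKey, TValue> generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this respect it is similar to the SortedList<TKey, TValue> generic class. The two classes have similar object models, and both have O(log n) retrieval. They differ in memory use and in the speed of insertion and removal:
    SortedList<TKey, TValue> uses less memory than SortedDictionary<TKey, TValue>. SortedDictionary<TKey, TValue> has faster insertion and removal for unsorted data: O(log n), as opposed to O(n) for SortedList<TKey, TValue>. If the list is populated all at once from sorted data, SortedList<TKey, TValue> is faster than SortedDictionary<TKey, TValue>.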
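    The "similar object model" of the two sorted classes can be seen in a short sketch. The names OrderingDemo and KeysOf are invented for this example. It only shows ordering and SortedList's positional access and makes no claim about speed or memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; not part of the original article.
public static class OrderingDemo
{
    // Collects keys in enumeration order.
    public static List<int> KeysOf(IDictionary<int, string> map)
    {
        var keys = new List<int>();
        foreach (KeyValuePair<int, string> pair in map) keys.Add(pair.Key);
        return keys;
    }

    public static void Main()
    {
        int[] input = { 5, 1, 4, 2, 3 };
        var sortedDict = new SortedDictionary<int, string>();
        var sortedList = new SortedList<int, string>();
        foreach (int k in input)
        {
            sortedDict[k] = "v" + k;
            sortedList[k] = "v" + k;
        }

        // Both enumerate in ascending key order regardless of insertion order.
        Console.WriteLine(string.Join(",", KeysOf(sortedDict))); // 1,2,3,4,5
        Console.WriteLine(string.Join(",", KeysOf(sortedList))); // 1,2,3,4,5

        // SortedList is array-backed, so it also offers access by position.
        Console.WriteLine(sortedList.Keys[0]);                   // 1
    }
}
```

    A plain Dictionary is left out of the comparison because its enumeration order is unspecified.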

    First, the Dictionary implementation:

     public class Dictionary<TKey,TValue>: IDictionary<TKey,TValue>, IDictionary, IReadOnlyDictionary<TKey, TValue>, ISerializable, IDeserializationCallback {
            private struct Entry {
                public int hashCode;    // Lower 31 bits of hash code, -1 if unused
                public int next;        // Index of next entry, -1 if last
                public TKey key;           // Key of entry
                public TValue value;         // Value of entry
            }
    
            private int[] buckets;
            private Entry[] entries;
            private IEqualityComparer<TKey> comparer;
            
            public Dictionary(int capacity): this(capacity, null) {}
    
            public Dictionary(IEqualityComparer<TKey> comparer): this(0, comparer) {}
    
            public Dictionary(int capacity, IEqualityComparer<TKey> comparer) {
                if (capacity < 0) ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity);
                if (capacity > 0) Initialize(capacity);
                this.comparer = comparer ?? EqualityComparer<TKey>.Default;
            } 
            
             private void Initialize(int capacity) {
                int size = HashHelpers.GetPrime(capacity);
                buckets = new int[size];
                for (int i = 0; i < buckets.Length; i++) buckets[i] = -1;
                entries = new Entry[size];
                freeList = -1;
            }
            
            public TValue this[TKey key] {
                get {
                    int i = FindEntry(key);
                    if (i >= 0) return entries[i].value;
                    ThrowHelper.ThrowKeyNotFoundException();
                    return default(TValue);
                }
                set {
                    Insert(key, value, false);
                }
            }
            
            private int FindEntry(TKey key) {
                if( key == null) {
                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
                }
    
                if (buckets != null) {
                    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
                    for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) {
                        if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;
                    }
                }
                return -1;
            }
            
            private void Insert(TKey key, TValue value, bool add) {
            
                if( key == null ) {
                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
                }
    
                if (buckets == null) Initialize(0);
                int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
                int targetBucket = hashCode % buckets.Length;
    
                for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
                    if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
                        if (add) { 
                            ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
                        }
                        entries[i].value = value;
                        version++;
                        return;
                    } 
                }
                int index;
                if (freeCount > 0) {
                    index = freeList;
                    freeList = entries[index].next;
                    freeCount--;
                }
                else {
                    if (count == entries.Length)
                    {
                        Resize();
                        targetBucket = hashCode % buckets.Length;
                    }
                    index = count;
                    count++;
                }
    
                entries[index].hashCode = hashCode;
                entries[index].next = buckets[targetBucket];
                entries[index].key = key;
                entries[index].value = value;
                buckets[targetBucket] = index;
                version++;
    
                if(collisionCount > HashHelpers.HashCollisionThreshold && HashHelpers.IsWellKnownEqualityComparer(comparer)) 
                {
                    comparer = (IEqualityComparer<TKey>) HashHelpers.GetRandomizedEqualityComparer(comparer);
                    Resize(entries.Length, true);
                }
            }
        }

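    Each key/value pair in Dictionary<TKey,TValue> is stored as an Entry struct, and the data actually lives in the Entry[] entries array: the first element is at index 0, the second at index 1, and so on. Lookups and inserts, however, go through the key, so something has to map a key to an index in entries. That is the job of the int[] buckets array. As FindEntry shows, the key's hash code, taken modulo buckets.Length, selects a slot in buckets (say, slot 50 in a dictionary initialized for 100 elements), and the value stored in that slot is an index into entries (if buckets[50] = 0, the lookup starts at entries[0]). When several keys land in the same bucket, their entries are chained through the next field, and FindEntry walks the chain until both hashCode and key match. If the number of elements is known in advance, specify capacity in the constructor so the arrays do not have to be resized as the dictionary grows. The key lines of Insert are: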

    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
    int targetBucket = hashCode % buckets.Length;

    entries[index].hashCode = hashCode;
    entries[index].next = buckets[targetBucket];
    entries[index].key = key;
    entries[index].value = value;
    buckets[targetBucket] = index;
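    The buckets/entries layout can be tried out in isolation. What follows is a teaching sketch, not the BCL class: MiniDictionary, Set, and MiniDictionaryDemo are hypothetical names, and the sketch assumes a fixed capacity with no resizing, no removal, no prime-sized bucket array, and the default equality comparer only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching class: fixed capacity, no Resize, no Remove.
public class MiniDictionary<TKey, TValue>
{
    private struct Entry
    {
        public int hashCode; // lower 31 bits of the key's hash code
        public int next;     // index of the next entry in the same bucket, -1 if last
        public TKey key;
        public TValue value;
    }

    private readonly int[] buckets;   // bucket -> index of the first entry in its chain
    private readonly Entry[] entries; // entries in insertion order
    private int count;

    public MiniDictionary(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        buckets = new int[capacity];
        for (int i = 0; i < buckets.Length; i++) buckets[i] = -1; // -1 marks an empty bucket
        entries = new Entry[capacity];
    }

    // Returns the index into entries, or -1 when the key is absent.
    public int FindEntry(TKey key)
    {
        int hashCode = EqualityComparer<TKey>.Default.GetHashCode(key) & 0x7FFFFFFF;
        for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next)
        {
            if (entries[i].hashCode == hashCode &&
                EqualityComparer<TKey>.Default.Equals(entries[i].key, key)) return i;
        }
        return -1;
    }

    public void Set(TKey key, TValue value)
    {
        int existing = FindEntry(key);
        if (existing >= 0) { entries[existing].value = value; return; }
        if (count == entries.Length) throw new InvalidOperationException("MiniDictionary is full");

        int hashCode = EqualityComparer<TKey>.Default.GetHashCode(key) & 0x7FFFFFFF;
        int targetBucket = hashCode % buckets.Length;
        int index = count++;
        entries[index].hashCode = hashCode;
        entries[index].next = buckets[targetBucket]; // old chain head becomes the successor
        entries[index].key = key;
        entries[index].value = value;
        buckets[targetBucket] = index;               // new entry becomes the chain head
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        int i = FindEntry(key);
        value = i >= 0 ? entries[i].value : default(TValue);
        return i >= 0;
    }
}

public static class MiniDictionaryDemo
{
    public static void Main()
    {
        var map = new MiniDictionary<int, string>(4);
        map.Set(1, "a");
        map.Set(5, "b"); // 1 % 4 == 5 % 4, so both keys share bucket 1 and are chained via next

        string v;
        bool found = map.TryGetValue(5, out v);
        Console.WriteLine(found + " " + v);      // True b
        Console.WriteLine(map.FindEntry(1));     // 0
        Console.WriteLine(map.FindEntry(5));     // 1
        Console.WriteLine(map.FindEntry(9));     // -1 (same bucket, but no matching entry)
    }
}
```

    Keys 1 and 5 collide on purpose: bucket 1 points at the entry for 5, whose next field points back at the entry for 1. That chain is what FindEntry walks.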

    Now let's look at the SortedList implementation:

        public class SortedList<TKey, TValue> : IDictionary<TKey, TValue>, System.Collections.IDictionary, IReadOnlyDictionary<TKey, TValue>
        {
            private TKey[] keys;
            private TValue[] values;
            private IComparer<TKey> comparer;
            public SortedList(int capacity) {
                if (capacity < 0)
                    ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity, ExceptionResource.ArgumentOutOfRange_NeedNonNegNumRequired);
                keys = new TKey[capacity];
                values = new TValue[capacity];
                comparer = Comparer<TKey>.Default;
            }
            
            public SortedList(IDictionary<TKey, TValue> dictionary, IComparer<TKey> comparer) 
                : this((dictionary != null ? dictionary.Count : 0), comparer) {
                if (dictionary==null)
                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.dictionary);
    
                dictionary.Keys.CopyTo(keys, 0);
                dictionary.Values.CopyTo(values, 0);
                Array.Sort<TKey, TValue>(keys, values, comparer);
                _size = dictionary.Count;            
            }
            
            public TValue this[TKey key] {
                get {
                    int i = IndexOfKey(key);
                    if (i >= 0)
                        return values[i];
    
                    ThrowHelper.ThrowKeyNotFoundException();
                    return default(TValue);
                }
                set {
                    if (((Object) key) == null) ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
                    int i = Array.BinarySearch<TKey>(keys, 0, _size, key, comparer);
                    if (i >= 0) {
                        values[i] = value;
                        version++;
                        return;
                    }
                    Insert(~i, key, value);
                }
            }
            
             public int IndexOfKey(TKey key) {
                if (key == null) 
                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
                int ret = Array.BinarySearch<TKey>(keys, 0, _size, key, comparer);
                return ret >=0 ? ret : -1;
            }
            
            private void Insert(int index, TKey key, TValue value) {
                if (_size == keys.Length) EnsureCapacity(_size + 1);
                if (index < _size) {
                    Array.Copy(keys, index, keys, index + 1, _size - index);
                    Array.Copy(values, index, values, index + 1, _size - index);
                }
                keys[index] = key;
                values[index] = value;
                _size++;
                version++;
            }
        }

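    SortedList<TKey, TValue> stores its keys and values in two parallel arrays, TKey[] keys and TValue[] values. Key lookup does not use hashing; it uses binary search, Array.BinarySearch<TKey>(keys, 0, _size, key, comparer). Insertion, however, contains this code:

    if (index < _size) {
        Array.Copy(keys, index, keys, index + 1, _size - index);
        Array.Copy(values, index, values, index + 1, _size - index);
    }

    This means that if the SortedList already holds 10 elements and the new key belongs in the first position, all 10 existing elements must be shifted one slot to the right. Removing an element involves a similar shift.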
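    The binary-search-then-shift pattern can be reproduced on a bare int array. This is a simplified sketch: SortedArrayDemo and InsertSorted are hypothetical names, only keys are stored, and the caller is assumed to keep spare room in the array. Array.BinarySearch returns the bitwise complement of the insertion point when the key is missing, which is why the SortedList setter calls Insert(~i, key, value).

```csharp
using System;
using System.Linq;

// Hypothetical demo class; keys only, no growth logic.
public static class SortedArrayDemo
{
    // Inserts key into the first `size` sorted slots of keys and returns the index used.
    // Assumes keys.Length > size; throws if the key is already present.
    public static int InsertSorted(int[] keys, int size, int key)
    {
        int i = Array.BinarySearch<int>(keys, 0, size, key);
        if (i >= 0) throw new ArgumentException("duplicate key");
        int index = ~i; // complement of a negative result is the insertion point
        if (index < size)
        {
            // Shift the tail one slot to the right: up to `size` element moves.
            Array.Copy(keys, index, keys, index + 1, size - index);
        }
        keys[index] = key;
        return index;
    }

    public static void Main()
    {
        int[] keys = new int[8];
        int size = 0;
        foreach (int k in new[] { 30, 10, 20, 5 })
        {
            int at = InsertSorted(keys, size, k);
            size++;
            Console.WriteLine("inserted " + k + " at index " + at);
        }
        Console.WriteLine(string.Join(",", keys.Take(size))); // 5,10,20,30
    }
}
```

    Inserting 5 last lands at index 0 and moves all three existing keys, which is the O(n) case described above. Inserting keys in ascending order always appends and moves nothing.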

    Finally, the SortedDictionary implementation:

        public class SortedDictionary<TKey, TValue> : IDictionary<TKey, TValue>, IDictionary, IReadOnlyDictionary<TKey, TValue> 
         {
           public SortedDictionary(IDictionary<TKey,TValue> dictionary, IComparer<TKey> comparer) {
                if( dictionary == null) {
                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.dictionary);
                }
    
                _set = new TreeSet<KeyValuePair<TKey, TValue>>(new KeyValuePairComparer(comparer));
    
                foreach(KeyValuePair<TKey, TValue> pair in dictionary) {
                    _set.Add(pair);
                }            
            }
    
            public SortedDictionary(IComparer<TKey> comparer) {
                _set = new TreeSet<KeyValuePair<TKey, TValue>>(new KeyValuePairComparer(comparer));
            }
            
           public TValue this[TKey key] {
                get {
                    if ( key == null) {
                        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);                    
                    }
    
                    TreeSet<KeyValuePair<TKey, TValue>>.Node node = _set.FindNode(new KeyValuePair<TKey, TValue>(key, default(TValue)));
                    if ( node == null) {
                        ThrowHelper.ThrowKeyNotFoundException();                    
                    }
    
                    return node.Item.Value;
                }
                set {
                    if( key == null) {
                        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
                    }
                
                    TreeSet<KeyValuePair<TKey, TValue>>.Node node = _set.FindNode(new KeyValuePair<TKey, TValue>(key, default(TValue)));
                    if ( node == null) {
                        _set.Add(new KeyValuePair<TKey, TValue>(key, value));                        
                    } else {
                        node.Item = new KeyValuePair<TKey, TValue>( node.Item.Key, value);
                        _set.UpdateVersion();
                    }
                }
            }
         internal class TreeSet<T> : SortedSet<T> {}
             
          }
          
        public class SortedSet<T> : ISet<T>, ICollection<T>, ICollection, ISerializable, IDeserializationCallback, IReadOnlyCollection<T> 
        {
            internal virtual Node FindNode(T item) {
                Node current = root;
                while (current != null) {
                    int order = comparer.Compare(item, current.Item);
                    if (order == 0) {
                        return current;
                    } else {
                        current = (order < 0) ? current.Left : current.Right;
                    }
                }
    
                return null;
            }
            
            public bool Add(T item) {
                return AddIfNotPresent(item);
            }
            
            internal virtual bool AddIfNotPresent(T item) {
                if (root == null) {   // empty tree
                    root = new Node(item, false);
                    count = 1;
                    version++;
                    return true;
                }
    
                //
                // Search for a node at bottom to insert the new node. 
                // If we can guarantee the node we found is not a 4-node, it would be easy to do insertion.
                // We split 4-nodes along the search path.
                // 
                Node current = root;
                Node parent = null;
                Node grandParent = null;
                Node greatGrandParent = null;
    
                //even if we don't actually add to the set, we may be altering its structure (by doing rotations
                //and such). so update version to disable any enumerators/subsets working on it
                version++;
    
    
                int order = 0;
                while (current != null) {
                    order = comparer.Compare(item, current.Item);
                    if (order == 0) {
                        // We could have changed root node to red during the search process.
                        // We need to set it to black before we return.
                        root.IsRed = false;
                        return false;
                    }
    
                    // split a 4-node into two 2-nodes                
                    if (Is4Node(current)) {
                        Split4Node(current);
                        // We could have introduced two consecutive red nodes after split. Fix that by rotation.
                        if (IsRed(parent)) {
                            InsertionBalance(current, ref parent, grandParent, greatGrandParent);
                        }
                    }
                    greatGrandParent = grandParent;
                    grandParent = parent;
                    parent = current;
                    current = (order < 0) ? current.Left : current.Right;
                }
    
                Debug.Assert(parent != null, "Parent node cannot be null here!");
                // ready to insert the new node
                Node node = new Node(item);
                if (order > 0) {
                    parent.Right = node;
                } else {
                    parent.Left = node;
                }
    
                // the new node will be red, so we will need to adjust the colors if parent node is also red
                if (parent.IsRed) {
                    InsertionBalance(node, ref parent, grandParent, greatGrandParent);
                }
    
                // Root node is always black
                root.IsRed = false;
                ++count;
                return true;
            }
    
        }

    SortedDictionary is implemented almost entirely on top of TreeSet<T> (that is, SortedSet<T>); both lookup and insertion are carried out on a red-black tree.
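    One detail in the code above is easy to miss: the indexer probes the set with new KeyValuePair<TKey, TValue>(key, default(TValue)), which only works because the set's comparer looks at keys and ignores values. The sketch below imitates that idea on a public SortedSet<T>. KeyOnlyComparer and KeyOnlyComparerDemo are hypothetical names standing in for the internal KeyValuePairComparer, whose exact code is not shown in the article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the internal KeyValuePairComparer: compares pairs by key only.
public sealed class KeyOnlyComparer<TKey, TValue> : IComparer<KeyValuePair<TKey, TValue>>
{
    private readonly IComparer<TKey> keyComparer;

    public KeyOnlyComparer(IComparer<TKey> keyComparer)
    {
        this.keyComparer = keyComparer ?? Comparer<TKey>.Default;
    }

    public int Compare(KeyValuePair<TKey, TValue> x, KeyValuePair<TKey, TValue> y)
    {
        return keyComparer.Compare(x.Key, y.Key);
    }
}

public static class KeyOnlyComparerDemo
{
    public static SortedSet<KeyValuePair<int, string>> Build()
    {
        var set = new SortedSet<KeyValuePair<int, string>>(new KeyOnlyComparer<int, string>(null));
        set.Add(new KeyValuePair<int, string>(2, "two"));
        set.Add(new KeyValuePair<int, string>(1, "one"));
        return set;
    }

    public static void Main()
    {
        SortedSet<KeyValuePair<int, string>> set = Build();

        // The probe carries default(string) as its value; only the key is compared.
        var probe = new KeyValuePair<int, string>(2, default(string));
        Console.WriteLine(set.Contains(probe));                              // True

        // A pair whose key already exists is rejected, whatever its value.
        Console.WriteLine(set.Add(new KeyValuePair<int, string>(1, "uno"))); // False

        // The tree keeps pairs ordered by key.
        Console.WriteLine(set.Min.Key);                                      // 1
    }
}
```

    This also explains why the SortedDictionary setter replaces node.Item on an existing node instead of calling Add: with a key-only comparer, Add would treat the new pair as a duplicate and leave the old value in place.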

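    All three classes have constructors that accept a comparer: IComparer<TKey> for SortedDictionary and SortedList, IEqualityComparer<TKey> for Dictionary. Dictionary and SortedList store their data in arrays, so both also accept an int capacity. SortedDictionary is built on a tree and has no such parameter. In summary: Dictionary lookup, insertion, and update are O(1) on average (the cost is dominated by hashing, and it works best when each hash bucket holds a single element); SortedList lookup is O(log n), but insertion and removal must shift the elements that follow, so they are O(n); SortedDictionary relies on a red-black tree, so lookup, insertion, and update are all O(log n).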
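    Readers who want to check these complexity claims on their own machine could start from a rough timing sketch like the one below. It is not a rigorous benchmark: LookupTiming, TimeLookups, and Fill are hypothetical names, the sizes are arbitrary, all calls go through the IDictionary interface, and the numbers will vary with hardware, runtime, and JIT warm-up.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical timing harness; results are indicative only.
public static class LookupTiming
{
    // Inserts keys 0..n-1 in ascending order, each mapped to itself.
    public static void Fill(IDictionary<int, int> map, int n)
    {
        for (int key = 0; key < n; key++) map[key] = key;
    }

    // Looks up every key in [0, n) `rounds` times and returns the elapsed milliseconds.
    public static long TimeLookups(IDictionary<int, int> map, int n, int rounds)
    {
        long checksum = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
        {
            for (int key = 0; key < n; key++)
            {
                int value;
                if (map.TryGetValue(key, out value)) checksum += value;
            }
        }
        sw.Stop();
        // Use the checksum so the lookups cannot be treated as dead code.
        if (checksum < 0) throw new InvalidOperationException("unexpected checksum");
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int n = 100000;
        const int rounds = 20;
        var maps = new IDictionary<int, int>[]
        {
            new Dictionary<int, int>(n),
            new SortedList<int, int>(n),
            new SortedDictionary<int, int>()
        };
        foreach (IDictionary<int, int> map in maps)
        {
            Fill(map, n); // ascending keys, so SortedList appends without shifting
            Console.WriteLine(map.GetType().Name + ": " + TimeLookups(map, n, rounds) + " ms");
        }
    }
}
```

    To see SortedList's O(n) insertion cost instead, one option is to fill it in descending key order and time Fill itself.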

  • Original article: https://www.cnblogs.com/majiang/p/7878013.html