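I just read an article, 《Linq的Distinct太不给力了》 ("LINQ's Distinct is too weak"), which offers a workaround that feels a bit complicated.
Imagine being able to write it like this instead. Wouldn't that be simpler and more elegant?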
    var p1 = products.Distinct(p => p.ID);
    var p2 = products.Distinct(p => p.Name);
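Taking a simple lambda as the parameter also fits LINQ's usual style.
This can be done with an extension method:
Distinct extension method
First, create a general-purpose comparer class that implements the IEqualityComparer<T> interface: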
    using System;
    using System.Collections.Generic;
    using System.Runtime.CompilerServices;
    using System.Linq;

    public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
    {
        private Func<T, V> keySelector;

        public CommonEqualityComparer(Func<T, V> keySelector)
        {
            this.keySelector = keySelector;
        }

        public bool Equals(T x, T y)
        {
            return EqualityComparer<V>.Default.Equals(keySelector(x), keySelector(y));
        }

        public int GetHashCode(T obj)
        {
            return EqualityComparer<V>.Default.GetHashCode(keySelector(obj));
        }
    }
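The Equals method (line 17 of the listing) uses the EqualityComparer<T> class, which is briefly described at the end of this post.
With this class in place, the Distinct extension method is easy to write: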
    public static class DistinctExtensions
    {
        public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector)
        {
            return source.Distinct(new CommonEqualityComparer<T, V>(keySelector));
        }
    }
Simple, isn't it?
Distinct usage examples
By ID:
    var data1 = new Person[] {
        new Person{ ID = 1, Name = "鹤冲天"},
        new Person{ ID = 1, Name = "ldp"}
    };
    var ps1 = data1
        .Distinct(p => p.ID)
        .ToArray();
By Name:
    var data2 = new Person[] {
        new Person{ ID = 1, Name = "鹤冲天"},
        new Person{ ID = 2, Name = "鹤冲天"}
    };
    var ps2 = data2
        .Distinct(p => p.Name)
        .ToArray();
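The examples above use a Person type that the post never shows. A minimal definition, assumed here purely for illustration (only the ID and Name members are implied by the examples), might look like this:

```csharp
// Hypothetical Person type; the original post does not show its definition.
// Only the two members used by the examples are included.
public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}
```

With a type like this in scope, each of the two arrays above should reduce to a single element, since both entries share the selected key.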
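After reading the replies, I made some improvements. I recommend the approach below:
Improvements
One reply asked about removing duplicate strings case-insensitively. That is not hard either; the code above only needs a small change:
The CommonEqualityComparer<T, V> class: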
    using System;
    using System.Collections.Generic;
    using System.Runtime.CompilerServices;
    using System.Linq;

    public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
    {
        private Func<T, V> keySelector;
        private IEqualityComparer<V> comparer;

        public CommonEqualityComparer(Func<T, V> keySelector, IEqualityComparer<V> comparer)
        {
            this.keySelector = keySelector;
            this.comparer = comparer;
        }

        public CommonEqualityComparer(Func<T, V> keySelector)
            : this(keySelector, EqualityComparer<V>.Default)
        { }

        public bool Equals(T x, T y)
        {
            return comparer.Equals(keySelector(x), keySelector(y));
        }

        public int GetHashCode(T obj)
        {
            return comparer.GetHashCode(keySelector(obj));
        }
    }
The Distinct extension methods:
    public static class DistinctExtensions
    {
        public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector)
        {
            return source.Distinct(new CommonEqualityComparer<T, V>(keySelector));
        }

        public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector, IEqualityComparer<V> comparer)
        {
            return source.Distinct(new CommonEqualityComparer<T, V>(keySelector, comparer));
        }
    }
With an optional parameter, the two extension methods can also be merged into one:
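    // A default parameter value must be a compile-time constant, so
    // EqualityComparer<V>.Default cannot be used directly as the default.
    // null is used instead and replaced with the default comparer in the body.
    public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector,
        IEqualityComparer<V> comparer = null)
    {
        return source.Distinct(new CommonEqualityComparer<T, V>(keySelector, comparer ?? EqualityComparer<V>.Default));
    }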
(Likewise, the two constructors of the CommonEqualityComparer<T, V> class can be merged into one.)
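As a rough sketch of what that merged constructor could look like: C# requires default parameter values to be compile-time constants, so this version assumes null as the default and falls back to EqualityComparer<V>.Default inside the constructor. The null check on keySelector and the readonly fields are additions of this sketch, not part of the original class.

```csharp
using System;
using System.Collections.Generic;

public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly Func<T, V> keySelector;
    private readonly IEqualityComparer<V> comparer;

    // Single constructor: a null comparer means "use EqualityComparer<V>.Default".
    public CommonEqualityComparer(Func<T, V> keySelector, IEqualityComparer<V> comparer = null)
    {
        if (keySelector == null)
        {
            throw new ArgumentNullException("keySelector");
        }
        this.keySelector = keySelector;
        this.comparer = comparer ?? EqualityComparer<V>.Default;
    }

    // Two items are equal when their selected keys are equal.
    public bool Equals(T x, T y)
    {
        return comparer.Equals(keySelector(x), keySelector(y));
    }

    // The hash code must come from the same key, using the same comparer.
    public int GetHashCode(T obj)
    {
        return comparer.GetHashCode(keySelector(obj));
    }
}
```

Callers that pass only a key selector get the default comparer, and callers that pass a comparer such as StringComparer.OrdinalIgnoreCase get that one, so existing call sites should not need to change.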
Usage example:
    var data3 = new Person[] {
        new Person{ ID = 1, Name = "LDP"},
        new Person{ ID = 2, Name = "ldp"}
    };
    var ps3 = data3
        .Distinct(p => p.Name, StringComparer.CurrentCultureIgnoreCase)
        .ToArray();
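A brief note on the EqualityComparer<T> class
EqualityComparer<T> provides a base class for implementations of the generic IEqualityComparer<T> interface. In .NET 4 it has five important subclasses; see the figure below:
Each of the five subclasses handles equality comparison for a different kind of data, as the class names suggest.
All five are internal classes and cannot be accessed directly. Instead, EqualityComparer<T> exposes a simple property, Default. Depending on the type T supplied, EqualityComparer<T> loads the appropriate subclass and caches it to improve performance.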
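The behavior of Default can be seen in a short sketch. The DefaultComparerDemo class and its method names are made up for illustration; the only real API used is EqualityComparer<T>.Default with its Equals and GetHashCode methods.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; not part of the original post.
public static class DefaultComparerDemo
{
    // Default returns the same cached comparer instance on every access.
    public static bool DefaultIsCached()
    {
        return ReferenceEquals(EqualityComparer<int>.Default, EqualityComparer<int>.Default);
    }

    // For strings, the default comparer is case-sensitive.
    public static bool StringsEqual(string a, string b)
    {
        return EqualityComparer<string>.Default.Equals(a, b);
    }

    // The default comparer accepts null, unlike calling obj.GetHashCode() directly.
    public static int HashOfNullString()
    {
        return EqualityComparer<string>.Default.GetHashCode(null);
    }
}
```

The null handling is one practical reason to go through EqualityComparer<V>.Default in CommonEqualityComparer<T, V> rather than calling Equals or GetHashCode on the selected key directly: a key selector that returns null would not throw.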