Translation: http://www.cnblogs.com/tianfan/
LINQ provides a convenient syntax and many useful methods for working with collections of objects. However, for LINQ methods that compare elements, such as Distinct or Intersect, to process a type correctly, the type must satisfy certain requirements.
Let’s take a look at the Distinct method, which returns the distinct elements of a collection.
    List<int> numbers = new List<int> { 1, 1, 2, 3 };
    var distinctNumbers = numbers.Distinct();
    foreach (var number in distinctNumbers)
        Console.WriteLine(number);
The output is:
1
2
3
But what if you want to use the Distinct method for a collection of objects of your own type? For example, like this:
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Number
    {
        public int Digital { get; set; }
        public string Textual { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<Number> numbers = new List<Number> {
                new Number { Digital = 1, Textual = "one" },
                new Number { Digital = 1, Textual = "one" },
                new Number { Digital = 2, Textual = "two" },
                new Number { Digital = 3, Textual = "three" },
            };
            var distinctNumbers = numbers.Distinct();
            foreach (var number in distinctNumbers)
                Console.WriteLine(number.Digital);
        }
    }
The code compiles, but this time the duplicate is not removed:
1
1
2
3
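Why did that happen? When no comparer is supplied, Distinct compares elements with the default equality comparer, EqualityComparer<T>.Default. For a class that does not define its own equality, that comparer falls back to Object.Equals and Object.GetHashCode, which compare references, so two separate Number instances with identical property values count as different objects. To be processed correctly by Distinct, a type should implement the IEquatable<T> interface and provide its own Equals and GetHashCode methods.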
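To see the default behavior directly, here is a minimal sketch that compares two Number instances holding the same values. It uses EqualityComparer<Number>.Default, the comparer Distinct uses when none is passed in. The DefaultEqualityDemo class and its method names are only illustrative; Number is the unmodified class from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The original Number class: no custom equality is defined.
class Number
{
    public int Digital { get; set; }
    public string Textual { get; set; }
}

// Hypothetical demo class, not part of the original example.
static class DefaultEqualityDemo
{
    // Compares two separate instances that hold identical values.
    public static bool SameValuesAreEqual()
    {
        var a = new Number { Digital = 1, Textual = "one" };
        var b = new Number { Digital = 1, Textual = "one" };

        // With no custom equality, this ends up in Object.Equals,
        // which is reference equality for a class.
        return EqualityComparer<Number>.Default.Equals(a, b);
    }

    // Counts the elements Distinct keeps for two equal-valued instances.
    public static int DistinctCount()
    {
        var list = new List<Number>
        {
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 1, Textual = "one" }
        };
        return list.Distinct().Count();
    }
}
```

SameValuesAreEqual() returns false and DistinctCount() returns 2, because the two instances are different references even though their properties match.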
So, the Number class from the previous example should actually look like this:
    class Number : IEquatable<Number>
    {
        public int Digital { get; set; }
        public string Textual { get; set; }

        public bool Equals(Number other)
        {
            // Check whether the compared object is null.
            if (Object.ReferenceEquals(other, null)) return false;

            // Check whether the compared object references the same data.
            if (Object.ReferenceEquals(this, other)) return true;

            // Check whether the objects’ properties are equal.
            // The static String.Equals is null-safe, unlike Textual.Equals(...).
            return Digital == other.Digital &&
                   String.Equals(Textual, other.Textual);
        }

        // Keep the untyped Object.Equals consistent with the typed overload.
        public override bool Equals(object obj)
        {
            return Equals(obj as Number);
        }

        // If Equals returns true for a pair of objects,
        // GetHashCode must return the same value for these objects.
        public override int GetHashCode()
        {
            // Get the hash code for the Textual property if it is not null.
            int hashTextual = Textual == null ? 0 : Textual.GetHashCode();

            // Get the hash code for the Digital property.
            int hashDigital = Digital.GetHashCode();

            // Calculate the hash code for the object.
            return hashDigital ^ hashTextual;
        }
    }
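As a quick check of the approach above, the following sketch implements value equality on a hypothetical ValueNumber class and runs Distinct over the same four values. It is a compact variant, not the class above: it assumes a runtime where System.HashCode is available (.NET Core 2.1 or later, or .NET Standard 2.1) and uses HashCode.Combine in place of the manual XOR. The ValueNumber and ValueEqualityDemo names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of Number that is compared by value.
class ValueNumber : IEquatable<ValueNumber>
{
    public int Digital { get; set; }
    public string Textual { get; set; }

    public bool Equals(ValueNumber other)
    {
        if (other is null) return false;
        // The == operator on strings compares contents and tolerates nulls.
        return Digital == other.Digital && Textual == other.Textual;
    }

    // Keep the untyped overload consistent with the typed one.
    public override bool Equals(object obj) => Equals(obj as ValueNumber);

    // HashCode.Combine accepts null values and mixes both inputs.
    // Assumption: .NET Core 2.1+ or .NET Standard 2.1.
    public override int GetHashCode() => HashCode.Combine(Digital, Textual);
}

// Hypothetical demo class.
static class ValueEqualityDemo
{
    // Returns the Digital values that remain after Distinct.
    public static List<int> DistinctDigitals()
    {
        var numbers = new List<ValueNumber>
        {
            new ValueNumber { Digital = 1, Textual = "one" },
            new ValueNumber { Digital = 1, Textual = "one" },
            new ValueNumber { Digital = 2, Textual = "two" },
            new ValueNumber { Digital = 3, Textual = "three" }
        };
        return numbers.Distinct().Select(n => n.Digital).ToList();
    }
}
```

DistinctDigitals() returns three values (1, 2 and 3), since the two "one" entries now compare as equal and produce the same hash code.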
But what if you cannot modify the type? What if it was provided by a library and you have no way of implementing the IEquatable<T> interface in this type? The answer is to create your own equality comparer and pass it as a parameter to the Distinct method.
The equality comparer must implement the IEqualityComparer<T> interface and, again, provide GetHashCode and Equals methods.
Here is how the equality comparer for the original Number class might look:
    class NumberComparer : IEqualityComparer<Number>
    {
        public bool Equals(Number x, Number y)
        {
            // The same reference (or two nulls) is always equal.
            if (Object.ReferenceEquals(x, y)) return true;

            // Exactly one of the objects is null, so they differ.
            if (Object.ReferenceEquals(x, null) ||
                Object.ReferenceEquals(y, null))
                return false;

            // Compare by property values.
            return x.Digital == y.Digital && x.Textual == y.Textual;
        }

        // Objects that Equals treats as equal must return the same hash code.
        public int GetHashCode(Number number)
        {
            if (Object.ReferenceEquals(number, null)) return 0;

            int hashTextual = number.Textual == null
                ? 0 : number.Textual.GetHashCode();
            int hashDigital = number.Digital.GetHashCode();

            return hashTextual ^ hashDigital;
        }
    }
And don’t forget to pass the comparer to the Distinct method:
    var distinctNumbers = numbers.Distinct(new NumberComparer());
Of course, these rules don't apply only to the Distinct method. The same is true for the Contains, Except, Intersect, and Union methods, among others. In general, if a LINQ method has an overload that accepts an IEqualityComparer<T> parameter, then to use it with your own data types you probably need to either implement IEquatable<T> in your class or create your own equality comparer.
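The sketch below illustrates that point for Contains, Except, Intersect, and Union, passing the same kind of comparer to each overload. Number and NumberComparer repeat the definitions used earlier in compact form; the SetOperationsDemo class, its helper Make, and the two sample lists are assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Number
{
    public int Digital { get; set; }
    public string Textual { get; set; }
}

class NumberComparer : IEqualityComparer<Number>
{
    public bool Equals(Number x, Number y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Digital == y.Digital && x.Textual == y.Textual;
    }

    public int GetHashCode(Number number)
    {
        if (number is null) return 0;
        int hashTextual = number.Textual == null ? 0 : number.Textual.GetHashCode();
        return hashTextual ^ number.Digital.GetHashCode();
    }
}

// Hypothetical demo class.
static class SetOperationsDemo
{
    static Number Make(int digital, string textual) =>
        new Number { Digital = digital, Textual = textual };

    // Left holds 1 and 2; Right holds 2 and 3.
    static readonly List<Number> Left =
        new List<Number> { Make(1, "one"), Make(2, "two") };
    static readonly List<Number> Right =
        new List<Number> { Make(2, "two"), Make(3, "three") };
    static readonly NumberComparer Comparer = new NumberComparer();

    // Looks up a freshly created instance by value.
    public static bool ContainsTwo() => Left.Contains(Make(2, "two"), Comparer);

    // Elements of Left that are not in Right.
    public static int ExceptCount() => Left.Except(Right, Comparer).Count();

    // Elements present in both lists.
    public static int IntersectCount() => Left.Intersect(Right, Comparer).Count();

    // Elements of both lists with duplicates removed.
    public static int UnionCount() => Left.Union(Right, Comparer).Count();
}
```

With the comparer supplied, ContainsTwo() returns true, ExceptCount() and IntersectCount() each return 1, and UnionCount() returns 3. Calling the same methods without the comparer would fall back to reference equality, so none of the separately created instances would match.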