Question
I am wondering whether it is possible to use a HashSet and make the Contains method return true if at least one of the fields of a given object matches an item already in the set.
This is an example of what I would like:
static void Main(string[] args)
{
    HashSet<Product> hash = new HashSet<Product>();
    // Since the Id is the same, both products are considered to be the same even if the URI is not the same.
    // The opposite is also true: if the URI is the same, both products are considered to be the same even if the Id is not the same.
    Product product1 = new Product("123", "www.test.com/123.html");
    Product product2 = new Product("123", "www.test.com/123.html?lang=en");
    hash.Add(product1);
    if (hash.Contains(product2))
    {
        // I want the method "Contains" to return TRUE because one of the fields is in the hash
    }
}
Here is the definition of the class Product:
public class Product
{
    public string WebId { get; set; }
    public string Uri { get; set; }

    public Product(string Id, string uri)
    {
        WebId = Id;
        Uri = uri;
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != typeof(Product)) return false;
        return Equals((Product)obj);
    }

    public bool Equals(Product obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        // Equal if EITHER field matches
        if (String.Equals(WebId, obj.WebId) || String.Equals(Uri, obj.Uri))
            return true;
        else
            return false;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            // Combines BOTH fields into the hash code
            int hash = 17;
            hash = hash * 23 + WebId.GetHashCode();
            hash = hash * 23 + Uri.GetHashCode();
            return hash;
        }
    }
}
When I run my program, the Contains method only calls GetHashCode and never calls Equals. Hence, Contains returns FALSE.
How can I make my HashSet return TRUE for the example above? Should I be using a Dictionary instead and add each field to the dictionary?
Answer 1:
Your GetHashCode() implementation isn't guaranteed to return the same value for two objects that are equal. You only require a match on, say, WebId, but the differing Uri then changes the hash code (or the other way around). HashSet<> only calls Equals() on items whose hash codes match, which is why Equals() never runs. You cannot fix this other than by returning a constant such as 0, and that kills the HashSet<> performance: lookup becomes O(n) instead of O(1).
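As a rough illustration of the constant-hash workaround, here is a minimal sketch. It assumes a trimmed-down Product class (not the asker's exact code) with the same "either field matches" equality. Because every item lands in the same bucket, the set falls back to comparing with Equals() one item at a time.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down stand-in for the question's Product class (an assumption for this sketch).
public class Product : IEquatable<Product>
{
    public string WebId { get; }
    public string Uri { get; }

    public Product(string id, string uri)
    {
        WebId = id;
        Uri = uri;
    }

    // Equal when EITHER field matches, as in the question.
    public bool Equals(Product other)
    {
        return other != null && (WebId == other.WebId || Uri == other.Uri);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Product);
    }

    // Constant hash code: consistent with Equals, but every lookup degrades to a linear scan.
    public override int GetHashCode()
    {
        return 0;
    }
}

public static class Program
{
    public static void Main()
    {
        var hash = new HashSet<Product> { new Product("123", "www.test.com/123.html") };
        var product2 = new Product("123", "www.test.com/123.html?lang=en");
        Console.WriteLine(hash.Contains(product2)); // True: same WebId
    }
}
```

This trades all of the hashing benefit for correctness, so it is only reasonable for small sets.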
Answer 2:
In a recent project we had the same problem: the class's Equals() implementation ORed several properties together to determine equality. To keep Contains() fast, we built several IEqualityComparer<T> implementations, each checking ONE property. You need one for each property that is ORed in your equality check.
// Treats two products as equal when their WebId values match.
class WebIdComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        return string.Equals(x.WebId, y.WebId);
    }

    public int GetHashCode(Product obj)
    {
        return obj.WebId.GetHashCode();
    }
}

// Treats two products as equal when their Uri values match.
class UriComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        return string.Equals(x.Uri, y.Uri);
    }

    public int GetHashCode(Product obj)
    {
        return obj.Uri.GetHashCode();
    }
}
Then create one HashSet per IEqualityComparer, passing the comparer to the constructor. Insert your collection into each HashSet, then for each item you want to test, call Contains() on each HashSet and OR the results. For example:
var uriHashTable = new HashSet<Product>(existingProducts, new UriComparer());
var webIdHashTable = new HashSet<Product>(existingProducts, new WebIdComparer());
foreach (var newProduct in newProducts)
{
    if (uriHashTable.Contains(newProduct) || webIdHashTable.Contains(newProduct))
    {
        // newProduct is equal to an existing product according to your Equals() implementation
    }
}
Obviously, this approach uses quite a bit more memory than a linear IEnumerable.Contains() scan: every property that is ORed in your Equals() implementation needs its own HashSet.
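A variation on the same idea, sketched here under assumptions, is to index the key values themselves rather than whole products. ProductIndex is a hypothetical helper name and is not part of the answer above. It keeps one HashSet<string> per ORed property and exposes a single Contains(), so callers cannot forget to check one of the sets.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the question's Product class (an assumption for this sketch).
public class Product
{
    public string WebId { get; }
    public string Uri { get; }

    public Product(string webId, string uri)
    {
        WebId = webId;
        Uri = uri;
    }
}

// Hypothetical helper: one string set per property that takes part in the OR.
public class ProductIndex
{
    private readonly HashSet<string> _webIds = new HashSet<string>();
    private readonly HashSet<string> _uris = new HashSet<string>();

    // True when EITHER the WebId or the Uri has been seen before.
    public bool Contains(Product p)
    {
        return _webIds.Contains(p.WebId) || _uris.Contains(p.Uri);
    }

    // Adds the product's keys unless an equivalent product is already indexed.
    public bool Add(Product p)
    {
        if (Contains(p)) return false;
        _webIds.Add(p.WebId);
        _uris.Add(p.Uri);
        return true;
    }
}

public static class Demo
{
    public static void Main()
    {
        var index = new ProductIndex();
        index.Add(new Product("123", "www.test.com/123.html"));

        Console.WriteLine(index.Contains(new Product("123", "www.test.com/123.html?lang=en"))); // True: same WebId
        Console.WriteLine(index.Contains(new Product("456", "www.test.com/123.html")));         // True: same Uri
        Console.WriteLine(index.Contains(new Product("456", "www.test.com/456.html")));         // False: nothing shared
    }
}
```

Each lookup stays at two hash probes regardless of how many products are indexed. The cost is that the index stores only keys, so it cannot return the matching product.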
Answer 3:
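Does it fit your program design to use a lambda for the lookup? HashSet<T>.Contains only accepts an item, not a predicate, so the lambda has to go through LINQ's Any (add using System.Linq;). It is the most straightforward way I can think of to achieve what you want, although it scans the set linearly instead of using the hash.
if (hash.Any(p => p.WebId == product2.WebId || p.Uri == product2.Uri))
{
    // "Any" returns TRUE because the WebId matches
}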
Source: https://stackoverflow.com/questions/5176116/using-hashset-and-contains-to-return-true-if-one-or-many-fields-is-in-the-hash