What is the best hash method for an array of bytes?
The arrays are serialized class objects containing JPEG images, passed between applications over TCP/IP.
Jon Skeet has a good answer on how to override GetHashCode. It is based on a general, effective hashing technique: start with a prime number, then for each component multiply the running hash by another prime and add that component's hash code, allowing the arithmetic to overflow.
For your case, you would do:
    static int GetByteArrayHashCode(byte[] array)
    {
        unchecked
        {
            int hash = 17;

            // Cycle through each element in the array.
            foreach (var value in array)
            {
                // Update the hash.
                hash = hash * 23 + value.GetHashCode();
            }

            return hash;
        }
    }
Note that Jon's answer explains why this is better than XORing the hashes of the individual elements (and that anonymous types in C# currently do not XOR the hashes of their individual elements, but use something similar to the above).
While this will generally be faster than the hash algorithms from the System.Security.Cryptography namespace (because you are producing a much smaller hash, a single 32-bit int), the downside is that you are more likely to get collisions.
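For comparison, here is a minimal sketch of the cryptographic alternative, assuming SHA-256 is an acceptable choice for your payloads. The class and method names (PayloadHashing, GetSha256Hex) are made up for illustration; SHA256.Create and ComputeHash are the standard System.Security.Cryptography calls.

```csharp
using System;
using System.Security.Cryptography;

static class PayloadHashing
{
    // Hypothetical helper: returns the 32-byte SHA-256 digest of the
    // array as a 64-character uppercase hex string.
    public static string GetSha256Hex(byte[] array)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));

        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(array);

            // BitConverter.ToString yields "E3-B0-..."; strip the dashes.
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}
```

A digest like this is not an int, so it cannot be returned from GetHashCode directly, but it is a reasonable key when you need to tell payloads apart with a negligible chance of collision.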
You would have to test against your data and determine how often you are getting collisions vs. the work that needs to be done in the case of a collision.
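One rough way to run that test is to hash a representative set of payloads and count how many hash codes repeat. The sketch below assumes the samples are distinct arrays (a duplicate payload would also be counted); CollisionCheck and CountCollisions are hypothetical names, and the hash function is the one from above, repeated so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

static class CollisionCheck
{
    // Same multiply-and-add hash as shown earlier in this answer.
    public static int GetByteArrayHashCode(byte[] array)
    {
        unchecked
        {
            int hash = 17;
            foreach (var value in array)
            {
                hash = hash * 23 + value.GetHashCode();
            }
            return hash;
        }
    }

    // Counts the samples whose hash code was already produced
    // by an earlier sample in the sequence.
    public static int CountCollisions(IEnumerable<byte[]> samples)
    {
        var seen = new HashSet<int>();
        int collisions = 0;

        foreach (byte[] sample in samples)
        {
            // HashSet<int>.Add returns false if the value is already present.
            if (!seen.Add(GetByteArrayHashCode(sample)))
            {
                collisions++;
            }
        }

        return collisions;
    }
}
```

Collisions are easy to construct for short inputs: the arrays { 0, 23 } and { 1, 0 } hash to the same value with this function. What matters is how often it happens with your real image data.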