Question
My instructor dumped this on us and told us we just needed to google how to write a hash function. I am quite directionless on this. We wrote a basic hash table template for class, but I have a project due that requires ~160,000 strings to be sorted into a table with at least 500 buckets (I want to use more for speed).
I just have no idea where to look to get concise, easily digestible information on this.
Any help would be greatly appreciated.
Answer 1:
I suggest a universal hash function. This kind of function guarantees a small number of collisions in expectation, even if the data is chosen by an adversary. There are plenty of universal hash functions.
For strings, you can use the following hash function.
For a character c, let #(c) be its numeric value (i.e. its ASCII code). For a string x = c1c2...cn we define

N(x) = #(c1) + #(c2)*256 + #(c3)*256^2 + ... + #(cn)*256^(n-1)

Let HSize be an integer (the number of buckets) and k a large prime number (you choose it). For integers a and b with 0 < a, b < k*HSize, let the hash function be:

h(x) = ((a*N(x) + b) mod (k*HSize)) div k

This function produces values in the range [0, HSize-1].
N(x) is calculated with Horner's rule, taking the remainder modulo k*HSize after every operation so that the intermediate values stay small.
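The condition 0 < a, b < k*HSize is what makes the scheme "universal": a and b are meant to be drawn at random once, when the table is created, rather than fixed in advance. Below is a minimal C++ sketch of that step, assuming the hash has the form ((a*N(x) + b) mod (k*HSize)) div k. The names UniversalStringHash and makeRandomHash are made up for illustration and do not come from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper type, not part of the original answer.
struct UniversalStringHash {
    std::uint64_t k;      // a large prime
    std::uint64_t hsize;  // number of buckets
    std::uint64_t a;      // random multiplier, 0 < a < k * hsize
    std::uint64_t b;      // random offset,     0 < b < k * hsize

    std::uint64_t operator()(const std::string& text) const {
        const std::uint64_t m = k * hsize;
        std::uint64_t n = 0;
        // Horner's rule, reducing modulo m after every step so n stays below m.
        for (std::size_t i = text.size(); i-- > 0;) {
            n = (n * 256 + static_cast<unsigned char>(text[i])) % m;
        }
        return ((a * n + b) % m) / k;  // always in [0, hsize - 1]
    }
};

// Draws a and b uniformly from 1 .. k*hsize - 1 using the supplied seed.
UniversalStringHash makeRandomHash(std::uint64_t k, std::uint64_t hsize,
                                   std::uint64_t seed) {
    assert(k > 1 && hsize > 0);
    assert(k * hsize < (1ULL << 31));  // keeps a * n below 2^62, so no overflow
    std::mt19937_64 gen(seed);
    std::uniform_int_distribution<std::uint64_t> dist(1, k * hsize - 1);
    const std::uint64_t a = dist(gen);
    const std::uint64_t b = dist(gen);
    return UniversalStringHash{k, hsize, a, b};
}
```

With the fixed values k = 7919, HSize = 1009, a = 321 and b = 43112, the string "ab" has N = 97 + 98*256 = 25185, and (321*25185 + 43112) mod 7990271 = 137226, which divided by 7919 gives bucket 17.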
So, create a function HashFunction and pass the string you want to hash as a parameter. Here is the code:
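#include <string>

// Parameters of the hash function. Constants are used instead of macros
// so that short names like a and b cannot clash with other identifiers.
const long long k = 7919;      // a large prime
const long long Hsize = 1009;  // number of buckets
const long long a = 321;       // 0 < a < k * Hsize
const long long b = 43112;     // 0 < b < k * Hsize

long long HashFunction(const std::string& text)
{
    const long long M = Hsize * k;
    long long res = 0;
    // Horner's rule: text[0] is the least significant base-256 digit.
    // Reducing modulo M after every step keeps res small.
    for (int i = (int)text.size() - 1; i >= 0; i--)
    {
        // unsigned char avoids negative values for non-ASCII bytes
        res = (res * 256 + (unsigned char)text[i]) % M;
    }
    // Apply a and b once, reduce modulo M again, then divide by k,
    // so the result always lies in [0, Hsize-1].
    return ((a * res + b) % M) / k;
}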
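To connect this back to the question, here is a rough sketch of how the strings might be distributed over the buckets once a hash function is available. It assumes a simple table of separate chains stored as vectors. The names Buckets, fillBuckets and longestBucket are hypothetical, and the hash is passed in as a parameter so that any function returning a value in [0, bucketCount - 1] can be used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One vector of strings per bucket (hypothetical layout for illustration).
using Buckets = std::vector<std::vector<std::string>>;

// Places every word into the bucket chosen by `hash`.
// `hash` is any callable mapping a string to a value in [0, bucketCount - 1],
// for example a function like HashFunction with Hsize equal to bucketCount.
template <typename Hash>
Buckets fillBuckets(const std::vector<std::string>& words,
                    std::size_t bucketCount, Hash hash) {
    assert(bucketCount > 0);
    Buckets buckets(bucketCount);
    for (const std::string& word : words) {
        const std::size_t index = static_cast<std::size_t>(hash(word));
        assert(index < bucketCount);  // the hash must stay inside the table
        buckets[index].push_back(word);
    }
    return buckets;
}

// Size of the fullest bucket: a quick check of how evenly the words are spread.
std::size_t longestBucket(const Buckets& buckets) {
    std::size_t longest = 0;
    for (const auto& bucket : buckets) {
        longest = std::max(longest, bucket.size());
    }
    return longest;
}
```

For 160,000 strings and 1009 buckets, a perfectly even spread would put about 159 strings in each bucket, so comparing longestBucket against that figure gives a quick sense of how well a given hash behaves on the actual data.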
Source: https://stackoverflow.com/questions/19877203/i-need-some-direction-on-writing-a-hash-function-to-sort-160-000-strings