How to shorten a 64-bit hash value down to a 48-bit value?

Submitted by 这一生的挚爱 on 2020-01-29 03:15:07

Question


I already have a 64-bit hash function in a library (written in C), but I only need 48 bits. I need to trim the 64-bit hash value down to a 48-bit value, and do it in a way that keeps the collision rate as low as possible.

The hash function is a very good 64-bit hash function. It has been tested with SMHasher (the "DieHarder" of hash testing) and proved better than Murmur2. According to my colleagues, the algorithm implemented in the library for 64-bit hashing is xxHash, which got a Q.Score of 10 when tested with SMHasher. For those who want to see it, the source code for xxHash is available on GitHub: github.com/Cyan4973/xxHash/releases/latest.

The basic idea is to have all bits in the 64-bit hash value (or part of them) have an effect on the resulting 48-bit hash value. Is there any way to do that?

[Late EDIT]:
So I have implemented my own 48-bit (quasi)-UUID generator.
Please check a complete working solution (including source code) here: https://stackoverflow.com/a/47895889/4731718.


Answer 1:


If the 64-bit hash is good, then any 48 bits selected from it will also be a good hash (@Lee Daniel). Of course, information is lost and the reduction is not reversible.

unsigned long long Mask48 = 0xFFFFFFFFFFFFu;
unsigned long long hash48 = hash64 & Mask48;

If the 64-bit hash function is weak, then take it modulo the largest prime just under pow(2,48). Some buckets will be lost: the 59 values from the prime up to pow(2,48) - 1 are never produced. This will not harm a good hash, yet it will certainly make weak hashes better.

unsigned long long LargestPrime48 = 281474976710597u;  // 0xFFFFFFFFFFC5, i.e. pow(2,48) - 59
unsigned long long hash48 = hash64 % LargestPrime48;
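
As a minimal sketch, the two reductions above could be wrapped as small helper functions. The names `hash48_truncate`, `hash48_mod_prime`, `HASH48_MASK` and `HASH48_PRIME` are hypothetical and not part of xxHash or any other library, and `uint64_t` from `<stdint.h>` is assumed in place of `unsigned long long`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, chosen for this illustration only. */
#define HASH48_MASK  UINT64_C(0xFFFFFFFFFFFF)   /* low 48 bits set */
#define HASH48_PRIME UINT64_C(281474976710597)  /* pow(2,48) - 59 */

/* Keep the low 48 bits; reasonable when the 64-bit hash is already well mixed. */
static uint64_t hash48_truncate(uint64_t hash64)
{
    return hash64 & HASH48_MASK;
}

/* Reduce modulo a prime just under pow(2,48), so the high bits also
   influence the result. */
static uint64_t hash48_mod_prime(uint64_t hash64)
{
    uint64_t h = hash64 % HASH48_PRIME;
    assert(h <= HASH48_MASK);  /* invariant: the remainder always fits in 48 bits */
    return h;
}
```

With a hash as well distributed as the one described in the question, `hash48_truncate` should be enough; `hash48_mod_prime` costs a 64-bit division and is mainly of interest when the quality of the source hash is in doubt.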



Answer 2:


hash >>= 16;

But if you would feel better not simply throwing away the other 16 bits, just XOR them back in.

hash = (hash >> 16) ^ (hash & 0xFFFF);
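
The XOR variant above can be sketched as a standalone function. The name `hash48_fold` is hypothetical, and `uint64_t` from `<stdint.h>` is assumed for the hash type, which the answer does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 64-bit hash into 48 bits: keep the top 48 bits and XOR the
   discarded low 16 bits into the bottom of the result.
   hash48_fold is a hypothetical name used for illustration. */
static uint64_t hash48_fold(uint64_t hash64)
{
    uint64_t folded = (hash64 >> 16) ^ (hash64 & 0xFFFF);
    assert((folded >> 48) == 0);  /* the shift clears the top 16 bits */
    return folded;
}
```

Every input bit takes part in the output this way: flipping any single bit of `hash64` flips exactly one bit of the folded value.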



Answer 3:


As far as I know, there are no 48-bit hash algorithms. There are no standard 48-bit variable types either, so this is a very strange design choice anyway.

And of course you can't shorten a 64-bit hash down to 48 bits without loss, and safe hashing is a completely different topic anyway. You could use a common 32-bit hash function such as CRC32 and just leave 16 bits empty, or even combine a 32-bit and a 16-bit hash, but that seems really odd. From a collision-safety standpoint this isn't even a thing, and I wouldn't want to hear what someone experienced in cryptography would say about it.

My recommendation: use established hashing algorithms of a standard size and don't experiment. It's already hard enough to come up with a good hashing algorithm. There's no need to get creative unless you're an expert in your field and can handle the effects the change may have (which is probably the most difficult part).
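
On the point about variable types: C indeed has no built-in 48-bit integer, but a 48-bit value can be held in a `uint64_t` and stored as six bytes when space matters. The following is only a sketch of that idea; `hash48_pack` and `hash48_unpack` are hypothetical helpers, and the big-endian byte order is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Write a 48-bit value into out[0..5], most significant byte first.
   hash48_pack / hash48_unpack are hypothetical names for illustration. */
static void hash48_pack(uint64_t h48, unsigned char out[6])
{
    assert((h48 >> 48) == 0);  /* caller must pass a value already reduced to 48 bits */
    for (int i = 0; i < 6; i++)
        out[i] = (unsigned char)((h48 >> (8 * (5 - i))) & 0xFF);
}

/* Rebuild the 48-bit value from six bytes written by hash48_pack. */
static uint64_t hash48_unpack(const unsigned char in[6])
{
    uint64_t h48 = 0;
    for (int i = 0; i < 6; i++)
        h48 = (h48 << 8) | in[i];
    return h48;
}
```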



Source: https://stackoverflow.com/questions/32912894/how-to-shorten-a-64-bit-hash-value-down-to-a-48-bit-value
