I need to convert strings to some form of hash. Is this possible in JavaScript?
I\'m not utilizing a server-side language so I can\'t do it that way.
About half of the answers here are the same String.hashCode
hash function taken from Java. It dates back to 1981 from Gosling Emacs, is extremely weak, and makes zero sense performance-wise in modern JavaScript. In fact, implementations could be significantly faster by using ES6 Math.imul
, but no one took notice. We can do much better than this, at essentially identical performance.
Here's something I did—cyrb53, a simple but high quality 53-bit hash. It's quite fast, provides very good hash distribution, and has significantly lower collision rates compared to any 32-bit hash.
const cyrb53 = function(str, seed = 0) {
let h1 = 0xdeadbeef ^ seed, h2 = 0x41c6ce57 ^ seed;
for (let i = 0, ch; i < str.length; i++) {
ch = str.charCodeAt(i);
h1 = Math.imul(h1 ^ ch, 2654435761);
h2 = Math.imul(h2 ^ ch, 1597334677);
}
h1 = Math.imul(h1 ^ (h1>>>16), 2246822507) ^ Math.imul(h2 ^ (h2>>>13), 3266489909);
h2 = Math.imul(h2 ^ (h2>>>16), 2246822507) ^ Math.imul(h1 ^ (h1>>>13), 3266489909);
return 4294967296 * (2097151 & h2) + (h1>>>0);
};
It is similar to the well-known MurmurHash/xxHash algorithms, it uses a combination of multiplication and Xorshift to generate the hash, but not as thorough. As a result it's faster than either in JavaScript and significantly simpler to implement.
Like any proper hash, it has an avalanche effect, which basically means small changes in the input have big changes in the output making the resulting hash appear more 'random':
"501c2ba782c97901" = cyrb53("a")
"459eda5bc254d2bf" = cyrb53("b")
"fbce64cc3b748385" = cyrb53("revenge")
"fb1d85148d13f93a" = cyrb53("revenue")
You can also supply a seed for alternate streams of the same input:
"76fee5e6598ccd5c" = cyrb53("revenue", 1)
"1f672e2831253862" = cyrb53("revenue", 2)
"2b10de31708e6ab7" = cyrb53("revenue", 3)
Technically, it is a 64-bit hash, that is, two uncorrelated 32-bit hashes computed in parallel, but JavaScript is limited to 53-bit integers. If convenient, the full 64-bit output can be used by altering the return statement with a hex string or array.
return [h2>>>0, h1>>>0];
// or
return (h2>>>0).toString(16).padStart(8,0)+(h1>>>0).toString(16).padStart(8,0);
Be aware that constructing hex strings drastically slows down batch processing. The array is more efficient, but obviously requires two checks instead of one.
Just for fun, here's the smallest hash I could come up with that's still decent. It's a 32-bit hash in 89 chars with better quality randomness than even FNV or DJB2:
TSH=s=>{for(var i=0,h=9;i>>9}