Uniquely identifying URLs with one 64-bit number
Question: This is basically a math problem, but a very programming-related one: if I have 1 billion strings containing URLs, and I take the first 64 bits of the MD5 hash of each of them, what kind of collision frequency should I expect? How does the answer change if I only have 100 million URLs? It seems to me that collisions will be extremely rare, but these things tend to be confusing. Would I be better off using something other than MD5? Mind you, I'm not looking for security, just a good, fast hash.
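To make the arithmetic concrete, here is a minimal sketch of the usual birthday-problem estimate. It assumes the truncated MD5 values behave like independent, uniformly random 64-bit numbers and that the URLs are distinct strings. The function names (`url_id`, `expected_colliding_pairs`, `prob_any_collision`) are made up for illustration and are not from any library.

```python
import hashlib
import math


def url_id(url: str) -> int:
    """Return the first 64 bits of the MD5 digest of `url` as an unsigned int."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def expected_colliding_pairs(n: int, bits: int = 64) -> float:
    """Expected number of colliding pairs among n uniform `bits`-bit values.

    There are n*(n-1)/2 pairs, and each pair collides with probability 2**-bits.
    """
    return n * (n - 1) / 2 / 2**bits


def prob_any_collision(n: int, bits: int = 64) -> float:
    """Approximate probability of at least one collision: 1 - exp(-expected pairs)."""
    return -math.expm1(-expected_colliding_pairs(n, bits))


if __name__ == "__main__":
    for n in (10**9, 10**8):
        pairs = expected_colliding_pairs(n)
        p = prob_any_collision(n)
        print(f"n={n:>13,}  expected colliding pairs={pairs:.3g}  P(any collision)={p:.3g}")
```

Under those assumptions, the estimate is about 0.027 expected colliding pairs for 1 billion URLs (roughly a 2.7% chance of any collision at all) and about 0.00027 for 100 million. The expected count grows with the square of the number of URLs, which is why a 10x smaller set is about 100x safer.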