What's a good hash function for struct with 3 unsigned chars and an int, for unordered_map?

可紊 提交于 2019-12-11 18:09:56

问题


I just want to use a unordered_map with my struct as key, since I dont need any ordering..but I just cant find myself with all that hash stuff..

As a side relevant question..When ppl compare unordered and ordered map they never talk about the hash function, how can that be? Cant a bad hash function makes unordered map slower than map? (solely due the hash function)

struct exemple{

  unsigned char a,b,c;
  unsigned int n;

  bool operator == ( const exemple & other) const {..}
};

namespace std {
template <>
struct hash<exemple> : public std::unary_function<const exemple &, std::size_t>
{
    inline std::size_t operator()(const exemple & exemple_p ) const
    {
        return 0;// what do I do
    }
};

}

-edit- a,b,c can have only the values 'a', 'b', 'c' or 'd', and n varies ~ 3 to 60.


回答1:


What you do in your hash function depends on the values you got, not necessarily so much on their types. If all four data members contain each value evenly distributed, I would combine the two characters into an unsigned long and return the result of xoring the two values:

typedef unsigned long ulong;
return n ^ (ulong(a << 16) | ulong(b << 8) | ulong(c));

It is certainly a hash function. Whether it is one which works well is a different question. You might also combine the result with std::hash<unsigned long>.




回答2:


Here's a baseline hash function:

unsigned long long h = (n << 24) | (a << 16) | (b << 8) | c;
return std::hash(h);

I.e., just pack the members into an unsigned long long, then offload the work to std::hash. In the common case that int is 32 bits wide and long long is 64 bits, and assuming your chars are not negative, this uses all the information in your objects for the hash.




回答3:


Consider your struct as a whole to be a string of bytes (7, to be precise). You may use any acceptably general string hash function upon those 7 bytes. Here is the FNV (Fowler/Noll/Vo) general bit-string hash function applied to your example (within the given hash functor class):

inline std::size_t operator()(const exemple& obj ) const
{
  const unsigned char* p = reinterpret_cast<const unsigned char*>( &obj );
  std::size_t h = 2166136261;

  for (unsigned int i = 0; i < sizeof(obj); ++i)
    h = (h * 16777619) ^ p[i];

  return h;
}

Note how I converted the reference to the exemple structure (obj) to a pointer to const unsigned char so that I could access the bytes of the structure one-by-one, and I treat it as an opaque binary object. Note that sizeof(obj) may actually be 8 rather than 7 depending upon the compiler's padding (which would mean there's a garbage padding byte somewhere in the structure, probably between c and n. If you wanted, you could rewrite the hash function to iterate over a, b, and c and then the bytes of n in order (or any order), which would eliminate the influence of any padding bytes (which may or may not exist) upon the hash of your struct.

Yes, a bad hash function can make unordered_map slower than ordered_map. This isn't always discussed, because generalized, fast algorithms like the FNV hash given above are assumed to be used by those using unordered_map, and in those cases, generally an unordered_map is faster than an ordered_map at the expense of the ability to iterate over the container's elements in order. However, yes, you must being using a good hash function for your data, and usually it's good enough to use one of these well-known hashes. Ultimately, however, every hash function has its weaknesses depending upon the input data's (here, the contents of the exemple structure) distribution.

A good discussion of generalized hashing and example hashing functions can be found at Eternally Confuzzled, including a C-style FNV hash similar to the one which I've given you.




回答4:


boost::hash_combine is designed for this purpose:

std::size_t hash = 0;
for (const auto& value : {a, b, c}) {
    boost::hash_combine(hash, value);
}
boost::hash_combine(hash, n);
return hash;


来源:https://stackoverflow.com/questions/13389631/whats-a-good-hash-function-for-struct-with-3-unsigned-chars-and-an-int-for-uno

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!