It seems that gcc has some limitation on complex constant folding. Here is an example:
static inline unsigned int DJBHash(const char *str)
{
int i;
uns
Here's a version using constexpr. It's slightly different from the others in one respect -- being recursive, it was easiest to hash the string back to front, so to speak. For example, the value it gives for "abc" will be what you'd normally expect from "cba" instead. I don't think this should make any real difference in use, as long as you use one or the other consistently (but given the vagaries of hashing, I could be wrong about that).
It does evaluate at compile time though -- for example, we can use the results as labels in a switch statement:
#include <iostream>

unsigned constexpr const_hash(char const *input) {
    return *input ?
        static_cast<unsigned>(*input) + 33 * const_hash(input + 1) :
        5381;
}

int main(int argc, char **argv) {
    if (argc < 2) return 1; // argv[1] is null when no argument is given
    switch (const_hash(argv[1])) {
    case const_hash("one"): std::cout << "one"; break;
    case const_hash("two"): std::cout << "two"; break;
    }
}
Obviously, there could be collisions, so you generally wouldn't want to use it as case statement labels -- I mostly did that to force a situation in which it would fail to compile if the result wasn't a compile-time constant.
Edit: if you care about the hash algorithm being "correct", I guess this is more accurate (with thanks to @Abyx):
unsigned constexpr const_hash(char const *input, unsigned hash = 5381) {
    return *input ?
        const_hash(input + 1, hash * 33 + static_cast<unsigned>(*input)) :
        hash;
}
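To make the ordering point concrete, here is a minimal sketch that puts the two variants side by side. The names hash_back_to_front and hash_front_to_back are made up for this comparison; the bodies are the two functions above. It assumes a C++11 compiler and ASCII input.

```cpp
// First variant: the recursion unwinds from the end of the string,
// so the last character is mixed in first.
constexpr unsigned hash_back_to_front(char const *input) {
    return *input
        ? static_cast<unsigned>(*input) + 33 * hash_back_to_front(input + 1)
        : 5381;
}

// Second variant: the accumulator is threaded through the recursion,
// so the first character is mixed in first.
constexpr unsigned hash_front_to_back(char const *input, unsigned hash = 5381) {
    return *input
        ? hash_front_to_back(input + 1, hash * 33 + static_cast<unsigned>(*input))
        : hash;
}

// Both are constant expressions, so static_assert can check the relationship.
static_assert(hash_back_to_front("abc") == hash_front_to_back("cba"),
              "back-to-front hash of \"abc\" matches front-to-back hash of \"cba\"");
static_assert(hash_back_to_front("abc") != hash_front_to_back("abc"),
              "the two variants disagree on the same input");
```

If either assertion failed, or if either call were not a constant expression, the translation unit would not compile, which is the same trick as the switch labels.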
Not an answer, just another data point.
The following implementation is even worse. GCC 4.7.3 properly applies TCO to turn this implementation into a loop, but it only evaluates up to "0" at compile time!
static inline unsigned int DJBHash2(const char *str, unsigned int hash) {
    return *str ? DJBHash2(str + 1, 33 * hash + *str) : hash;
}
On the plus side, the recursive version is 7 bytes shorter.
Someone else mentioned clang, so here are results for clang 3.1 -O3. It generates different code for the two versions of DJBHash, but they are the same number of bytes. Interestingly, it converts the shift and add from the original version into a multiply. It optimizes both versions down to constants for strings up to 100 characters. And finally, the clang code is 5 bytes shorter than the shortest GCC code.
Perhaps C++ TMP might be able to do it. I'm not sure though.
It is possible if you don't mind using variadic character literal lists instead of string literals:
#include <type_traits>
#include <iostream>

template<unsigned acc, char... values>
struct DJBhash_helper
    : std::integral_constant<unsigned, acc> {};

template<unsigned acc, char head, char... tail>
struct DJBhash_helper<acc, head, tail...>
    : DJBhash_helper<(acc << 5) + acc + head, tail...> {};

template<char... str>
struct DJBhash
    : DJBhash_helper<5381, str...> {};

int main()
{
    std::cout << DJBhash<'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
                         '0', '1', '2', '3', '4', '5', '6', '7'>::value << '\n';
}
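As a quick sanity check of the template version, the following sketch compares it against the same recurrence written as a C++11 constexpr function. The templates are repeated so the snippet compiles on its own; djb_reference is a hypothetical helper introduced only for this comparison.

```cpp
#include <type_traits>

// Base case: no characters left, the accumulator is the result.
template<unsigned acc, char... values>
struct DJBhash_helper : std::integral_constant<unsigned, acc> {};

// Recursive case: fold the head character into the accumulator.
template<unsigned acc, char head, char... tail>
struct DJBhash_helper<acc, head, tail...>
    : DJBhash_helper<(acc << 5) + acc + head, tail...> {};

template<char... str>
struct DJBhash : DJBhash_helper<5381, str...> {};

// Hypothetical reference implementation of the same recurrence.
constexpr unsigned djb_reference(char const *s, unsigned hash = 5381) {
    return *s
        ? djb_reference(s + 1, (hash << 5) + hash + static_cast<unsigned>(*s))
        : hash;
}

static_assert(DJBhash<'a', 'b', 'c'>::value == djb_reference("abc"),
              "template and constexpr versions agree");
static_assert(DJBhash<>::value == 5381u,
              "an empty character pack yields the seed");
```

Since ::value is a static member of std::integral_constant, it can also be used as an array bound or a case label.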
The OP is interested in constant-folding in C, but just for its C++ sibling: in C++14, you can simply put constexpr in front of both functions, and modify the loop to compensate for strlen() not being constexpr:
#include <iostream>

static inline constexpr unsigned int DJBHash(const char *str)
{
    unsigned int hash = 5381;

    // Bounded loop: at most the first 512 characters are hashed.
    for (auto i = 0; i < 512; ++i) {
        if (*str == '\0') return hash;
        hash = ((hash << 5) + hash) + static_cast<unsigned int>(*str);
        ++str; // advance to the next character
    }

    return hash;
}

constexpr unsigned int f(void)
{
    return DJBHash("01234567890123456");
}

int main()
{
    constexpr auto h = f();
    std::cout << std::hex << h << "\n"; // 444385d7
}
Live Example using Clang 3.4 SVN with -std=c++1y.
NOTE: the current Clang implementation does not properly handle a while (*str != '\0') loop here. Instead, a finite loop of 512 iterations with a return condition inside does the job.
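For comparison, here is a sketch of the unbounded form, which C++14 permits in a constexpr function. It assumes a compiler with complete C++14 constexpr support (-std=c++14 or later); the name djb_hash_cxx14 is hypothetical.

```cpp
// C++14 relaxed constexpr: local variables and loops are allowed.
constexpr unsigned int djb_hash_cxx14(const char *str)
{
    unsigned int hash = 5381;
    while (*str != '\0') {
        hash = ((hash << 5) + hash) + static_cast<unsigned int>(*str);
        ++str; // advance to the next character
    }
    return hash;
}

// Evaluated by the compiler; a failure here is a compile error.
static_assert(djb_hash_cxx14("") == 5381u, "empty string yields the seed");
static_assert(djb_hash_cxx14("abc") == 193485963u, "known value for \"abc\"");
```

Unlike the bounded loop, this has no 512-character cap, although compilers still impose their own limit on the number of constexpr evaluation steps.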