gcc complex constant folding

风流意气都作罢 submitted on 2019-12-01 03:48:51

Here's a version using constexpr. It's slightly different from the others in one respect -- being recursive, it was easiest to hash the string back to front, so to speak. For example, the value it gives for "abc" will be what you'd normally expect from "cba" instead. I don't think this should make any real difference in use, as long as you use one or the other consistently (but given the vagaries of hashing, I could be wrong about that).

It does evaluate at compile time though -- for example, we can use the results as labels in a switch statement:

#include <iostream>

unsigned constexpr const_hash(char const *input) {
    return *input ?
           static_cast<unsigned>(*input) + 33 * const_hash(input + 1) :
           5381;
}

int main(int argc, char **argv) {
    if (argc < 2) return 1; // argv[1] is null when no argument is given
    switch (const_hash(argv[1])) {
    case const_hash("one"): std::cout << "one"; break;
    case const_hash("two"): std::cout << "two"; break;
    }
}

Obviously, there could be collisions, so you generally wouldn't want to use it as case statement labels -- I mostly did that to force a situation in which it would fail to compile if the result wasn't a compile-time constant.

Edit: if you care about the hash algorithm being "correct", I guess this is more accurate (with thanks to @Abyx):

unsigned constexpr const_hash(char const *input, unsigned hash = 5381) {
    return *input ?
        const_hash(input + 1, hash * 33 + static_cast<unsigned>(*input)): 
        hash;
}

The OP is interested in constant-folding in C, but just for its C++ sibling: in C++14, you can simply put constexpr in front of both functions, and modify the loop to compensate for strlen() not being constexpr:

#include<iostream>

static inline constexpr unsigned int DJBHash(const char *str)
{
   unsigned int hash = 5381;

   for(auto i = 0; i < 512; ++i) {
      if (*str == '\0') return hash;
      hash = ((hash << 5) + hash) + static_cast<unsigned int>(*str);
      ++str; // advance to the next character
   }

   return hash;
}

constexpr unsigned int f(void)
{   
    return DJBHash("01234567890123456");
}

int main()
{
    constexpr auto h = f(); 
    std::cout << std::hex << h << "\n"; // prints the hash in hexadecimal
}

Live Example using Clang 3.4 SVN with -std=c++1y.

NOTE: the current Clang implementation does not properly evaluate a while(*str != '\0') loop inside a constexpr function. Instead, a bounded loop of 512 iterations with a return condition inside does the job.

Perhaps C++ template metaprogramming (TMP) might be able to do it. I'm not sure though.

It is possible if you don't mind using variadic character literal lists instead of string literals:

#include <type_traits>
#include <iostream>

template<unsigned acc, char... values>
struct DJBhash_helper
     : std::integral_constant<unsigned, acc> {};

template<unsigned acc, char head, char... tail>
struct DJBhash_helper<acc, head, tail...>
     : DJBhash_helper<(acc << 5) + acc + head, tail...> {};

template<char... str>
struct DJBhash
     : DJBhash_helper<5381, str...> {};

int main()
{
    std::cout << DJBhash<'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
                         '0', '1', '2', '3', '4', '5', '6', '7'>::value << '\n';
}

ideone live demo

Not an answer, just another data point.

The following implementation is even worse. GCC 4.7.3 properly applies tail-call optimization (TCO) to turn this implementation into a loop, but it only evaluates up to "0" at compile time!

static inline unsigned int DJBHash2(const char *str, unsigned int hash) {
   return *str ? DJBHash2(str + 1, 33 * hash + *str) : hash;
}

On the plus side, the recursive version is 7 bytes shorter.

Someone else mentioned clang, so here are results for clang 3.1 -O3. It generates different code for the two versions of DJBHash, but they are the same number of bytes. Interestingly, it converts the shift and add from the original version into a multiply. It optimizes both versions down to constants for strings up to 100 characters. And finally, the clang code is 5 bytes shorter than the shortest GCC code.
