C++ unordered_map lookup without constructing string

时光取名叫无心 2021-01-18 02:50

I have C++ code that investigates a BIG string and matches lots of substrings. As much as possible, I avoid constructing std::strings, by encoding substrings like this:

1 Answer
  • 2021-01-18 03:44

    What you want is called heterogeneous lookup. Since C++14 it has been supported by std::map::find and std::set::find (see overloads (3) and (4), which are templated on the lookup value type), provided the container uses a transparent comparator such as std::less<>. It is more complicated for unordered containers, because the hash function must produce the same hash value for the same text whichever key type carries it, and the equality predicate must be able to compare across those types. A proposal to add this to the unordered containers was under consideration when this answer was written, and C++20 has since adopted it: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0919r0.html
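
    As a minimal sketch of the ordered-container case, assuming C++17 for std::string_view: the map below is declared with std::less<> so that the templated find overloads take part, and the lookup passes a view into the big string. The names Counts and lookup_count are hypothetical, chosen only for illustration.

```cpp
#include <functional>   // std::less
#include <map>
#include <string>
#include <string_view>

// std::less<> is the transparent comparator; without it, find only
// accepts the key type and a std::string would have to be built.
using Counts = std::map<std::string, int, std::less<>>;

// Hypothetical helper: returns the value stored for `key`, or -1 if absent.
// The string_view is compared directly against the stored std::string keys.
int lookup_count(const Counts& counts, std::string_view key)
{
    auto it = counts.find(key);
    return it == counts.end() ? -1 : it->second;
}
```

    A caller can pass std::string_view(big + offset, length) for any substring of the big buffer, so the substring does not need to be NUL-terminated.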

    Meanwhile, you could use another library that already supports heterogeneous lookup, e.g. boost::unordered_map::find.

    If you want to stick with std::unordered_map, you can avoid creating so many temporary strings by keeping a single std::string member alongside the unordered_map, assigning each lookup text to it, and passing that string to find. Reassigning typically reuses the string's existing capacity, so most lookups copy characters without allocating. You could encapsulate this in a custom container class.
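
    A rough sketch of such a wrapper follows. The class name SubstringTable, its int value type, and the (pointer, length) interface are assumptions made for illustration, not something taken from the question.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper: owns the map plus one scratch string that
// every lookup reuses instead of constructing a fresh temporary.
class SubstringTable
{
public:
    void insert(std::string key, int value) { map_[std::move(key)] = value; }

    // Looks up the `len` characters starting at `p`.
    // Returns a pointer to the stored value, or nullptr if absent.
    const int* find(const char* p, std::size_t len)
    {
        scratch_.assign(p, len);            // copies characters into scratch_
        auto it = map_.find(scratch_);
        return it == map_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, int> map_;
    std::string scratch_;                   // reused across calls
};
```

    The trade-off is that find is no longer const and the shared scratch string makes the class unsafe to use from several threads at once without external locking.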

    Another route is to write a custom class to use as your unordered container key:

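    #include <cstddef>      // std::size_t
    #include <cstring>      // std::strcmp
    #include <functional>   // std::hash
    #include <string>
    #include <string_view>
    #include <utility>      // std::move

    // Key type: either borrows a NUL-terminated C string (p_ != nullptr)
    // or owns a std::string (p_ == nullptr).
    struct CharPtrOrString
    {
        const char* p_;
        std::string s_;

        // Cheap and non-owning: for lookups only; p must outlive the key.
        explicit CharPtrOrString(const char* p) : p_{p} { }
        // Owning: for keys stored in the container.
        CharPtrOrString(std::string s) : p_{nullptr}, s_{std::move(s)} { }

        // Compare whichever member is in use on each side.
        bool operator==(const CharPtrOrString& x) const
        {
            return p_ ? x.p_ ? std::strcmp(p_, x.p_) == 0
                             : p_ == x.s_
                      : x.p_ ? s_ == x.p_
                             : s_ == x.s_;
        }

        struct Hash
        {
            // Hash through string_view so both representations of the
            // same text produce the same value.
            std::size_t operator()(const CharPtrOrString& x) const
            {
                std::string_view sv{x.p_ ? x.p_ : x.s_.c_str()};
                return std::hash<std::string_view>()(sv);
            }
        };
    };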
    

    You can then construct CharPtrOrString from std::strings for the keys stored in the unordered container, and construct one cheaply from your const char* each time you call find. operator== above has to work out which form each operand holds (the convention is that a null pointer means the std::string member is in use) so that it compares the in-use members. The hash function has to make sure a std::string with a particular text produces the same hash as a const char* with that text. That does not happen by default with GCC 7.3 and/or Clang 6 (I work with both and remember that one had an issue, but not which), which is why Hash above routes both forms through std::string_view.
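
    To show how the pieces fit together, here is a self-contained sketch. It repeats the key class in condensed form so that it compiles on its own; the view() member, the Table alias, and the find_value helper are additions made for this illustration and are not part of the class above.

```cpp
#include <cstddef>
#include <functional>   // std::hash
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Condensed variant of the key class; view() is a hypothetical helper
// that exposes whichever member is in use.
struct CharPtrOrString
{
    const char* p_;
    std::string s_;

    explicit CharPtrOrString(const char* p) : p_{p} {}
    CharPtrOrString(std::string s) : p_{nullptr}, s_{std::move(s)} {}

    std::string_view view() const { return p_ ? p_ : s_.c_str(); }

    bool operator==(const CharPtrOrString& x) const { return view() == x.view(); }

    struct Hash
    {
        std::size_t operator()(const CharPtrOrString& x) const
        {
            return std::hash<std::string_view>()(x.view());
        }
    };
};

using Table = std::unordered_map<CharPtrOrString, int, CharPtrOrString::Hash>;

// Hypothetical helper: looks up a NUL-terminated C string without
// copying its characters into a std::string. Returns -1 if absent.
int find_value(const Table& table, const char* text)
{
    auto it = table.find(CharPtrOrString{text});
    return it == table.end() ? -1 : it->second;
}
```

    Stored keys should always be built from std::string (for example table.emplace(std::string("needle"), 7)), since a key built from a const char* only borrows the pointer. The lookup text must be NUL-terminated, so this approach does not directly cover substrings described by a pointer and a length.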
