A string tokenizer in C++ that allows multiple separators

陌清茗 2021-02-15 16:45

Is there a way to tokenize a string in C++ with multiple separators? In C# I would have done:

string[] tokens = "adsl, dkks; dk".Split(new [] { ",", " ",


        
3 Answers
  • 2021-02-15 17:03

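    Something like this will do. Every character in delimiters is treated as a separator, and runs of separators produce no empty tokens: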

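    #include <string>
    #include <vector>

    // Appends to *tokens every maximal run of characters in original_string
    // that contains none of the characters in delimiters.
    void tokenize_string(const std::string &original_string, const std::string &delimiters, std::vector<std::string> *tokens)
    {
            if (nullptr == tokens) return;

            // Start of the first token: the first character that is not a delimiter.
            std::string::size_type pos_start = original_string.find_first_not_of(delimiters);
            // End of that token: the next delimiter, or npos if the token runs to the end.
            std::string::size_type pos_end   = original_string.find_first_of(delimiters, pos_start);

            while (std::string::npos != pos_start)
            {
                    tokens->push_back(original_string.substr(pos_start, pos_end - pos_start));
                    // Skip the run of delimiters, then find where the next token ends.
                    pos_start = original_string.find_first_not_of(delimiters, pos_end);
                    pos_end   = original_string.find_first_of(delimiters, pos_start);
            }
    }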
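
    To see how this approach behaves on the input from the question, here is a minimal self-contained sketch. The name `tokenize` and the return-by-value signature are assumptions made for illustration and are not part of the answer above; the scanning logic is the same `find_first_not_of` / `find_first_of` loop.

```cpp
#include <string>
#include <vector>

// Hypothetical by-value variant of the loop above: every character in
// `delimiters` is a separator, and runs of separators produce no empty tokens.
std::vector<std::string> tokenize(const std::string& text,
                                  const std::string& delimiters)
{
    std::vector<std::string> tokens;

    // Skip leading separators to reach the first token.
    std::string::size_type start = text.find_first_not_of(delimiters);

    while (start != std::string::npos)
    {
        // The token ends at the next separator, or at the end of the string.
        std::string::size_type end = text.find_first_of(delimiters, start);
        tokens.push_back(text.substr(start, end - start));

        // Jump past the separator run; this yields npos once `end` is npos.
        start = text.find_first_not_of(delimiters, end);
    }
    return tokens;
}
```

    Calling `tokenize("adsl, dkks; dk", ",; ")` returns the three tokens `adsl`, `dkks` and `dk`. Because the separators are given as a set of single characters, this does not cover multi-character separators such as `"::"`.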
    
  • 2021-02-15 17:04

    Use boost::tokenizer. It supports multiple separators.

    In fact, you don't really even need boost::tokenizer. If all you want is a split, use boost::split. The documentation has an example: http://www.boost.org/doc/libs/1_42_0/doc/html/string_algo/usage.html#id1718906

  • 2021-02-15 17:17

    Here is my version (not heavily tested (yet)):

    #include <string>
    #include <vector>

    // Splits s on any of the (possibly multi-character) delimiters in delims.
    // Adjacent delimiters yield empty tokens; a trailing delimiter does not.
    std::vector<std::string> split(std::string const& s,
        std::vector<std::string> const& delims)
    {
        std::vector<std::string> parts;

        std::string::size_type beg = 0;

        for(;;)
        {
            // Find the earliest delimiter occurrence at or after beg.
            // On a tie, the delimiter listed first in delims wins.
            std::string::size_type best_pos = std::string::npos;
            std::string::size_type best_len = 0;

            for(auto const& delim: delims)
            {
                if(delim.empty())
                    continue; // an empty delimiter would match everywhere

                auto pos = s.find(delim, beg);
                if(pos < best_pos)
                {
                    best_pos = pos;
                    best_len = delim.size();
                }
            }

            if(best_pos == std::string::npos)
                break;

            // Emit the text before the delimiter, then continue after it.
            parts.emplace_back(s, beg, best_pos - beg);
            beg = best_pos + best_len;
        }

        // Whatever follows the last delimiter is the final token.
        if(beg < s.size())
            parts.emplace_back(s, beg, std::string::npos);

        return parts;
    }
    