Parse (split) a string in C++ using string delimiter (standard C++)

时光说笑 2020-11-21 23:44

I am parsing a string in C++ using the following:

using namespace std;

string parsed,input="text to be parsed";
stringstream input_stringstream(input);

i         


        
20 answers
  • 2020-11-21 23:52

    You can use the std::string::find() function to find the position of your string delimiter, then use std::string::substr() to get a token.

    Example:

    std::string s = "scott>=tiger";
    std::string delimiter = ">=";
    std::string token = s.substr(0, s.find(delimiter)); // token is "scott"
    
    • The find(const string& str, size_t pos = 0) function returns the position of the first occurrence of str in the string, or npos if the string is not found.

    • The substr(size_t pos = 0, size_t n = npos) function returns a substring of the object, starting at position pos and of length n (or running to the end of the string when n is npos or exceeds the remaining length).
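
To make the two points above concrete, here is a minimal sketch. The helper name first_token is hypothetical (it is not part of the answer); it relies only on std::string::find and std::string::substr. When the delimiter is absent, find returns npos, and substr(0, npos) then yields the whole string.

```cpp
#include <string>

// Hypothetical helper, for illustration only: returns the text before the
// first occurrence of delimiter, or all of s if delimiter does not occur.
std::string first_token(const std::string& s, const std::string& delimiter) {
    // find() returns std::string::npos when delimiter is absent;
    // substr(0, npos) then copies everything up to the end of s.
    std::string::size_type pos = s.find(delimiter);
    return s.substr(0, pos);
}
```

With this sketch, first_token("scott>=tiger", ">=") gives "scott", and a string without the delimiter is returned unchanged.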


    If the delimiter occurs more than once, you can remove each token (delimiter included) after extracting it and then proceed with the next extraction. Note that this modifies s (as does the equivalent s = s.substr(pos + delimiter.length());), so work on a copy if you want to preserve the original string:

    s.erase(0, s.find(delimiter) + delimiter.length());
    

    This way you can easily loop to get each token.

    Complete Example

    std::string s = "scott>=tiger>=mushroom";
    std::string delimiter = ">=";
    
    size_t pos = 0;
    std::string token;
    while ((pos = s.find(delimiter)) != std::string::npos) {
        token = s.substr(0, pos);
        std::cout << token << std::endl;
        s.erase(0, pos + delimiter.length());
    }
    std::cout << s << std::endl;
    

    Output:

    scott
    tiger
    mushroom
    
  • 2020-11-21 23:52
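    // Requires <string> and <vector>.
    std::vector<std::string> split(const std::string& s, char c) {
      std::vector<std::string> v;
      std::string::size_type i = 0;           // start of the current token
      std::string::size_type j = s.find(c);   // position of the next delimiter
      while (j < s.length()) {                // npos is larger than any length
        v.push_back(s.substr(i, j - i));
        i = ++j;
        j = s.find(c, j);
        if (j >= s.length()) {
          v.push_back(s.substr(i));           // last token, up to the end of s
          break;
        }
      }
      return v;                               // empty if c never occurs in s
    }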
    
  • 2020-11-21 23:55

    If you do not want to modify the string (as in the answer by Vincenzo Pii) and want to output the last token as well, you may want to use this approach:

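    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    // Splits s on delimiter without modifying s; the last token is included.
    // delimiter must be non-empty, otherwise the loop never advances.
    inline std::vector<std::string> splitString(const std::string &s, const std::string &delimiter) {
        std::vector<std::string> ret;
        std::size_t start = 0;
        std::size_t end = 0;
        do {
            end = s.find(delimiter, start);           // npos once no delimiter remains
            std::size_t len = (end == std::string::npos) ? std::string::npos : end - start;
            std::string token = s.substr(start, len); // len == npos means "to the end"
            std::cout << token << std::endl;
            ret.push_back(token);
            if (end != std::string::npos)
                start = end + delimiter.length();
        } while (end != std::string::npos);
        return ret;
    }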
    
  • 2020-11-21 23:55

    As a bonus, here is a code example of a split function and macro that is easy to use and lets you choose the container type:

    #include <iostream>
    #include <vector>
    #include <string>
    
    #define split(str, delim, type) (split_fn<type<std::string>>(str, delim))
     
    template <typename Container>
    Container split_fn(const std::string& str, char delim = ' ') {
        Container cont{};
        std::size_t current, previous = 0;
        current = str.find(delim);
        while (current != std::string::npos) {
            cont.push_back(str.substr(previous, current - previous));
            previous = current + 1;
            current = str.find(delim, previous);
        }
        cont.push_back(str.substr(previous, current - previous));
        
        return cont;
    }
    
    int main() {
        
        auto test = std::string{"This is a great test"};
        auto res = split(test, ' ', std::vector);
        
        for(auto &i : res) {
            std::cout << i << ", "; // prints: This, is, a, great, test, 
        }
        
        
        return 0;
    }
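
The macro is optional: the function template can also be called directly with an explicit container type. The sketch below repeats the split logic under the hypothetical name split_into so that it compiles on its own, and it assumes only that the container provides push_back. The two wrapper functions are likewise illustrative names, not part of the answer.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical name; same algorithm as split_fn above, repeated so the
// snippet is self-contained. Container must support push_back(std::string).
template <typename Container>
Container split_into(const std::string& str, char delim = ' ') {
    Container cont{};
    std::size_t previous = 0;
    std::size_t current = str.find(delim);
    while (current != std::string::npos) {
        cont.push_back(str.substr(previous, current - previous));
        previous = current + 1;
        current = str.find(delim, previous);
    }
    cont.push_back(str.substr(previous));  // remainder after the last delimiter
    return cont;
}

// Illustrative wrappers showing two container choices.
inline std::list<std::string> split_to_list(const std::string& str, char delim = ' ') {
    return split_into<std::list<std::string>>(str, delim);
}

inline std::vector<std::string> split_to_vector(const std::string& str, char delim = ' ') {
    return split_into<std::vector<std::string>>(str, delim);
}
```

Calling the template directly avoids defining a macro named split, which could otherwise collide with other identifiers of the same name.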
    
  • 2020-11-21 23:58

    This is a complete method that splits the string on any delimiter and returns a vector of the chopped up strings.

    It is adapted from ryanbwork's answer. However, his check if(token != mystring) gives wrong results when the string contains repeated elements. This is my solution to that problem.

    // Assumes <string>, <vector> and "using namespace std;" as in the question.
    vector<string> Split(string mystring, const string& delimiter)
    {
        vector<string> subStringList;
        while (true)
        {
            // find() matches the whole delimiter; find_first_of() would match
            // any single character contained in it.
            size_t findfirst = mystring.find(delimiter);
            if (findfirst == string::npos || delimiter.empty()) // no delimiter left
            {
                subStringList.push_back(mystring); // push back the final piece of mystring
                return subStringList;
            }
            subStringList.push_back(mystring.substr(0, findfirst));
            mystring = mystring.substr(findfirst + delimiter.length());
        }
    }
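
One detail worth checking when adapting this function: std::string::find_first_of treats its argument as a set of characters, whereas std::string::find matches the argument as a whole substring. For a single-character delimiter the two agree; for a multi-character delimiter such as ">=" they can differ. A minimal sketch of the difference (both helper names are hypothetical):

```cpp
#include <string>

// Hypothetical helpers, for illustration only.

// Position of the first occurrence of the whole delimiter string.
std::string::size_type whole_delimiter_pos(const std::string& s,
                                           const std::string& delimiter) {
    return s.find(delimiter);
}

// Position of the first character of s that equals ANY character of delimiter.
std::string::size_type any_char_pos(const std::string& s,
                                    const std::string& delimiter) {
    return s.find_first_of(delimiter);
}
```

For the input "a=b>=c" and the delimiter ">=", the first helper reports position 3 (the start of ">="), while the second reports position 1 (the lone "=").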
    
  • 2020-11-21 23:59

    I would use boost::tokenizer. Here's documentation explaining how to make an appropriate tokenizer function: http://www.boost.org/doc/libs/1_52_0/libs/tokenizer/tokenizerfunction.htm

    Here's one that works for your case.

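    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <boost/tokenizer.hpp>

    struct my_tokenizer_func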
    {
        template<typename It>
        bool operator()(It& next, It end, std::string & tok)
        {
            if (next == end)
                return false;
            char const * del = ">=";
            auto pos = std::search(next, end, del, del + 2);
            tok.assign(next, pos);
            next = pos;
            if (next != end)
                std::advance(next, 2);
            return true;
        }
    
        void reset() {}
    };
    
    int main()
    {
        std::string to_be_parsed = "1) one>=2) two>=3) three>=4) four";
        for (auto i : boost::tokenizer<my_tokenizer_func>(to_be_parsed))
            std::cout << i << '\n';
    }
    