How to use boost split to split a string and ignore empty values?

遇见更好的自我 2021-02-04 02:55

I am using boost::split to parse a data file. The data file contains lines such as the following.

data.txt

1:1~15  ASTKGPSVFPLAPSS SVFPLAPSS   -12.6   98         


        
3 Answers
  • 2021-02-04 03:30

    I would recommend the C++ String Toolkit Library (StrTk). In my experience it is much faster than Boost for this kind of work. I used to use Boost to split (that is, tokenize) a line of text, but found this library to be much closer to what I want.

    One of the great things about strtk::parse is that it converts tokens into their final types and checks the number of elements.

    You could use it like so:

    std::vector<std::string> tokens;
    
    // multiple delimiters should be treated as one
    if( !strtk::parse( dataLine, "\t", tokens ) )
    {
        std::cout << "failed" << std::endl;
    }
    

    --- another version

    std::string token1;
    std::string token2;
    std::string token3;
    float value1;
    float value2;
    
    if( !strtk::parse( dataLine, "\t", token1, token2, token3, value1, value2) )
    {
         std::cout << "failed" << std::endl;
         // fails if the number of elements is not what you want
    }
    

    Online documentation for the library: String Tokenizer Documentation

    Link to the source code: C++ String Toolkit Library

  • 2021-02-04 03:44

    Even though "adjacent separators are merged together", the trailing delimiters still cause the problem: even when they are merged into one, that one delimiter still separates the last field from an empty token after it.

    So your problem cannot be solved with split() alone. Luckily, Boost String Algo has trim() and trim_if(), which strip whitespace or delimiters from the beginning and end of a string. So just call trim() on buf before splitting, like this:

    std::string buf = "1:1~15  ASTKGPSVFPLAPSS SVFPLAPSS   -12.6   98.3    ";
    std::vector<std::string> dataLine;
    boost::trim_if(buf, boost::is_any_of("\t ")); // could also use plain boost::trim
    boost::split(dataLine, buf, boost::is_any_of("\t "), boost::token_compress_on);
    std::cout << dataLine.size() << std::endl; // 5 tokens, no trailing empty one
    

    This question was already asked: boost::split leaves empty tokens at the beginning and end of string - is this desired behaviour?

  • 2021-02-04 03:47

    Leading and trailing whitespace is intentionally left alone by boost::split because it does not know if it is significant or not. The solution is to use boost::trim before calling boost::split.

    #include <boost/algorithm/string/trim.hpp>
    
    ....
    
    boost::trim(buf);
    