I am trying to split a string using spaces as a delimiter. I would like to store each token in an array or vector.
I have tried:
string tempInput
See the duplicate questions for general ways to split a string into words, but your splitting method is actually correct. The real problem lies in how you read the input before trying to split it:
string tempInput;
cin >> tempInput; // !!!
When you use cin >> tempInput, you only get the first word from the input, not the whole text. There are two possible ways around that, the simplest of which is to forget about the stringstream and iterate directly over the input:
std::string tempInput;
std::vector< std::string > tokens;
while ( std::cin >> tempInput ) {
    tokens.push_back( tempInput );
}
// alternatively, including algorithm and iterator headers:
std::vector< std::string > tokens;
std::copy( std::istream_iterator<std::string>( std::cin ),
           std::istream_iterator<std::string>(),
           std::back_inserter(tokens) );
This approach will give you all the tokens in the input in a single vector. If you need to work with each line separately, then you should use getline from the <string> header instead of cin >> tempInput:
std::string tempInput;
while ( getline( std::cin, tempInput ) ) { // read line
    // tokenize the line, possibly with your own code or
    // any answer in the 'duplicate' question
}
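To make the per-line variant concrete, here is a minimal sketch that fills in the loop body with a std::istringstream. The helper names tokenize_line and tokenize_lines are made up for illustration and are not from the question; the functions take a generic std::istream so that std::cin or a string stream can be passed in.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line into whitespace-separated tokens.
std::vector<std::string> tokenize_line(const std::string& line) {
    std::istringstream iss(line);
    std::vector<std::string> tokens;
    std::string word;
    while (iss >> word) {          // operator>> skips runs of whitespace
        tokens.push_back(word);
    }
    return tokens;
}

// Hypothetical helper: read every line from `in` and tokenize each one
// separately, keeping one vector of tokens per input line.
std::vector<std::vector<std::string>> tokenize_lines(std::istream& in) {
    std::vector<std::vector<std::string>> result;
    std::string line;
    while (std::getline(in, line)) {   // read line
        result.push_back(tokenize_line(line));
    }
    return result;
}
```

Calling tokenize_lines(std::cin) would then read until end of input and keep the tokens grouped by line, which the single flat vector from the earlier snippets does not do.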
Notice that it’s much easier just to use copy:
vector<string> tokens;
copy(istream_iterator<string>(cin),
     istream_iterator<string>(),
     back_inserter(tokens));
As for why your code doesn’t work: you’re reusing tempInput. Don’t do that. Furthermore, you first read only a single word from cin, not the whole line. That’s why only a single word is put into the stringstream.
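The difference between the two ways of reading can be seen in a small sketch. The function names read_first_word and read_whole_line are invented for this illustration, and a std::istringstream stands in for cin so the behaviour is easy to check.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: formatted extraction. operator>> skips leading
// whitespace and then stops at the next whitespace character, so only
// one word is read.
std::string read_first_word(std::istream& in) {
    std::string word;
    in >> word;
    return word;
}

// Hypothetical helper: std::getline reads up to the next newline
// (which it discards), so embedded spaces are kept.
std::string read_whole_line(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}
```

Given the input "split this string", read_first_word yields just "split", while read_whole_line yields the full text, which is what a stringstream needs if it is going to be split into several tokens.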
The easiest way: Boost.Tokenizer
std::vector<std::string> tokens;
std::string s = "This is, a test";
boost::tokenizer<> tok(s);
for (boost::tokenizer<>::iterator it = tok.begin(); it != tok.end(); ++it)
{
    tokens.push_back(*it);
}
// tokens is ["This", "is", "a", "test"]
You can parameterize the delimiters and escape sequences to split only on spaces if you wish; by default it tokenizes on both spaces and punctuation.