Question
I have data in the following format:
4:How do you do?
10:Happy birthday
1:Purple monkey dishwasher
200:The Ancestral Territorial Imperatives of the Trumpeter Swan
The number can be anywhere from 1 to 999, and the string is at most 255 characters long. I'm new to C++ and it seems a few sources recommend extracting formatted data with a stream's >> operator, but when I want to extract a string it stops at the first whitespace character. Is there a way to configure a stream to stop parsing a string only at a newline or end-of-file? I saw that there was a getline method to extract an entire line, but then I still have to split it up manually [with find_first_of], don't I?
Is there an easy way to parse data in this format using only STL?
Answer 1:
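You can read the number with >> first and then use std::getline, which reads from a stream and stores the rest of the line into a std::string object. The ':' that follows the number has to be skipped in between, otherwise it ends up at the front of the string. Something like this:
int num;
string str;
while (cin >> num) {
    cin.ignore();       // skip the ':' after the number
    getline(cin, str);  // read the rest of the line, spaces included
}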
Answer 2:
The C++ String Toolkit Library (StrTk) has the following solution to your problem:
#include <string>
#include <deque>
#include "strtk.hpp"

int main()
{
    struct line_type
    {
        unsigned int id;
        std::string str;
    };

    std::deque<line_type> line_list;
    const std::string file_name = "data.txt";

    strtk::for_each_line(file_name,
        [&line_list](const std::string& line)
        {
            line_type temp_line;
            const bool result = strtk::parse(line,
                                             ":",
                                             temp_line.id,
                                             temp_line.str);
            if (!result) return;
            line_list.push_back(temp_line);
        });

    return 0;
}
More examples can be found on the StrTk project page.
Answer 3:
You've already been told about std::getline, but they didn't mention one detail that you'll probably find useful: when you call getline, you can also pass a parameter telling it what character to treat as the end of input. To read your number, you can use:
std::string number;
std::string name;
std::getline(infile, number, ':');
std::getline(infile, name);
This will put the data up to the ':' into number, discard the ':', and read the rest of the line into name.
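As a rough sketch of how those two calls might be wrapped in a loop, assuming one record per line and that you want the number as an int rather than a string. The names Entry and read_entries are made up for this illustration, and std::stoi needs C++11:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names, used only for this illustration.
using Entry = std::pair<int, std::string>;

std::vector<Entry> read_entries(std::istream& infile) {
    std::vector<Entry> entries;
    std::string number;
    std::string name;
    // The first getline stops at ':' and discards it; the second reads
    // the remainder of the line. The loop ends when either call fails.
    while (std::getline(infile, number, ':') && std::getline(infile, name)) {
        // std::stoi throws std::invalid_argument if `number` is not numeric.
        entries.emplace_back(std::stoi(number), name);
    }
    return entries;
}
```

You could pass it an std::ifstream opened on your data file, or an std::istringstream when testing. Malformed lines are not handled here beyond the exception std::stoi throws.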
If you want to use >> to read the data, you can do that too, but it's a bit more difficult, and delves into an area of the standard library that most people never touch. A stream has an associated locale that's used for things like formatting numbers and (importantly) determining what constitutes "white space". You can define your own locale to define the ":" as white space, and the space (" ") as not white space. Tell the stream to use that locale, and it'll let you read your data directly.
#include <locale>
#include <vector>

// A ctype facet that treats only ':' and '\n' as white space.
struct colonsep: std::ctype<char> {
    colonsep(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table() {
        // Start with an empty classification for every character...
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::mask());
        // ...then mark the two separators as white space.
        rc[':'] = std::ctype_base::space;
        rc['\n'] = std::ctype_base::space;
        return &rc[0];
    }
};
Now to use it, we "imbue" the stream with a locale:
#include <fstream>
#include <iterator>
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>

typedef std::pair<int, std::string> data;

// These overloads live in namespace std so that std::istream_iterator and
// std::ostream_iterator can find them by argument-dependent lookup. Strictly
// speaking, the standard does not allow adding overloads like this to std.
namespace std {
    std::istream &operator>>(std::istream &is, data &d) {
        return is >> d.first >> d.second;
    }
    std::ostream &operator<<(std::ostream &os, data const &d) {
        return os << d.first << ":" << d.second;
    }
}

int main() {
    std::ifstream infile("testfile.txt");
    // The locale takes ownership of the facet, so it is not deleted here.
    infile.imbue(std::locale(std::locale(), new colonsep));

    std::vector<data> d;
    std::copy(std::istream_iterator<data>(infile),
              std::istream_iterator<data>(),
              std::back_inserter(d));

    // just for fun, sort the data to show we can manipulate it:
    std::sort(d.begin(), d.end());
    std::copy(d.begin(), d.end(), std::ostream_iterator<data>(std::cout, "\n"));
    return 0;
}
Now you know why that part of the library is so neglected. In theory, getting the standard library to do your work for you is great -- but in fact, most of the time it's easier to do this kind of job on your own instead.
Answer 4:
Just read the data one whole line at a time using getline, then parse each line.
To parse, use find_first_of().
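For what it's worth, the splitting step could look roughly like this. It's only a sketch: split_line is a made-up helper name, it assumes the first ':' on the line is the separator, and it uses std::stoi (C++11) for the conversion:

```cpp
#include <exception>
#include <string>

// Hypothetical helper for this illustration: splits "N:text" at the first ':'.
// Returns false if the line has no ':' or the part before it does not
// start with a number that fits in an int.
bool split_line(const std::string& line, int& id, std::string& text) {
    const std::string::size_type pos = line.find_first_of(':');
    if (pos == std::string::npos) return false;
    try {
        id = std::stoi(line.substr(0, pos));
    } catch (const std::exception&) {
        return false;  // std::invalid_argument or std::out_of_range
    }
    text = line.substr(pos + 1);  // everything after the ':'
    return true;
}
```

Call it on each string that getline gives you. Any further ':' characters stay in the text part, since only the first one is treated as the separator.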
Answer 5:
#include <stdio.h>
#include <stdlib.h>
int i;
char *string = (char*)malloc(256*sizeof(char)); // max is 255 chars, plus 1 for '\0'
scanf("%d:%255[^\n]", &i, string); // %255[^\n] accepts 255 chars max irrespective of input size; no trailing 's' is needed
printf("%s\n", string);
free(string);
It's C, and it will work in C++ too. scanf provides more control, but no error management, so use it with caution :).
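One thing that helps with the caution part: the scanf family does return the number of fields it managed to assign, so you can at least detect a line that didn't match. A minimal sketch, assuming you already have the line in memory and use std::sscanf on it (parse_with_sscanf is a made-up name for this illustration):

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper for this illustration.
// Returns true only if both the number and the text were converted.
bool parse_with_sscanf(const char* line, int& id, std::string& text) {
    char buffer[256];  // 255 characters plus the terminating '\0'
    // %255[^\n] stores at most 255 characters and stops at a newline.
    // sscanf returns the number of fields it assigned, so 2 means success.
    if (std::sscanf(line, "%d:%255[^\n]", &id, buffer) != 2) return false;
    text = buffer;
    return true;
}
```

Note that a line with nothing after the ':' is reported as a failure here, because %[ has to match at least one character.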
Source: https://stackoverflow.com/questions/2338827/reading-formatted-data-with-cs-stream-operator-when-data-has-spaces