Why does reading a record struct's fields from std::istream fail, and how can I fix it?

野性不改 2020-11-21 11:40

Suppose we have the following situation:

  • A record struct is declared as follows

struct Person {
    unsigned int id;
    std::st         


        
9 answers
  • 2020-11-21 12:21

    This input file looks less like a delimited file (the modern approach) and more like a good old file of fixed-size fields, the kind Fortran and COBOL programmers used to deal with. So I would parse it by column position (note that I split the name into forename and lastname):

    #include <cstdint>
    #include <iostream>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>
    
    struct Person {
        unsigned int id;
        std::string forename;
        std::string lastname;
        uint8_t age;
        // ...
    };
    
    int main() {
        std::ifstream ifs("file.txt");
        std::vector<Person> persons;
        std::string line;
        // Column widths of the id, forename, lastname and age fields.
        const std::size_t fieldsize[] = {8, 9, 9, 4};
    
        while (std::getline(ifs, line)) {
            // Skip lines that end before the age column; substr() would throw.
            if (line.size() < fieldsize[0] + fieldsize[1] + fieldsize[2]) continue;
            Person person;
            std::size_t start = 0;
    
            // id: let a string stream do the numeric conversion.
            std::istringstream idtxt(line.substr(start, fieldsize[0]));
            idtxt >> person.id;
            start += fieldsize[0];
    
            // forename and lastname: cut the field, then strip the trailing padding.
            person.forename = line.substr(start, fieldsize[1]);
            person.forename.erase(person.forename.find_last_not_of(' ') + 1);
            start += fieldsize[1];
            person.lastname = line.substr(start, fieldsize[2]);
            person.lastname.erase(person.lastname.find_last_not_of(' ') + 1);
            start += fieldsize[2];
    
            // age: convert through an unsigned int, because operator>> would
            // read a uint8_t as a single character rather than a number.
            std::istringstream agetxt(line.substr(start, fieldsize[3]));
            unsigned int age = 0;
            agetxt >> age;
            person.age = static_cast<uint8_t>(age);
            persons.push_back(person);
        }
        return 0;
    }
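
    The cut-and-trim step that repeats for each text field above could be factored into a small helper. This is only a minimal sketch and not part of the original answer: `fixed_field` is a hypothetical name, and it assumes fields are left-aligned and padded with spaces.

    ```cpp
    #include <cstddef>
    #include <string>

    // Hypothetical helper: returns the fixed-width field that starts at column
    // `start` and is at most `width` characters wide, with trailing spaces removed.
    // Returns an empty string if the line ends before `start`.
    std::string fixed_field(const std::string& line, std::size_t start, std::size_t width) {
        if (start >= line.size()) return std::string();
        std::string field = line.substr(start, width);
        // For an all-blank field find_last_not_of returns npos; npos + 1 wraps
        // around to 0, so erase(0) clears the whole field.
        field.erase(field.find_last_not_of(' ') + 1);
        return field;
    }
    ```

    With such a helper, the forename would be `fixed_field(line, 8, 9)` and the lastname `fixed_field(line, 17, 9)`, given the column widths used in the answer.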
    
  • 2020-11-21 12:22

    Another attempt at solving the parsing problem.

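    #include <cctype>
    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    
    // Person as used below; the fields are inferred from how this answer accesses them.
    struct Person {
        unsigned int id;
        std::string name;
        uint8_t age;
    };
    
    int main()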
    {
       std::ifstream ifs("test-115.in");
       std::vector<Person> persons;
    
       while (true)
       {
          Person actRecord;
          // Read the ID and the first part of the name.
          if ( !(ifs >> actRecord.id >> actRecord.name ) )
          {
             break;
          }
    
          // Read the rest of the line.
          std::string line;
          std::getline(ifs,line);
    
          // Pickup the rest of the name from the rest of the line.
          // The last token in the rest of the line is the age.
          // All other tokens are part of the name.
          // The tokens can be separated by ' ' or '\t'.
          std::size_t pos = 0;
          std::size_t iter = 0;
          while ( (iter = line.find_first_of(" \t", pos)) != std::string::npos )
          {
             actRecord.name += line.substr(pos, (iter - pos + 1));
             pos = iter + 1;
    
             // Skip multiple whitespace characters.
             // The cast avoids undefined behavior for negative char values.
             while ( pos < line.size() &&
                     std::isspace(static_cast<unsigned char>(line[pos])) )
             {
                ++pos;
             }
          }
    
          // Trim the last whitespace from the name.
          actRecord.name.erase(actRecord.name.size()-1);
    
          // Extract the age.
          // std::stoi returns an integer. We are assuming that
          // it will be small enough to fit into an uint8_t.
          actRecord.age = static_cast<uint8_t>(std::stoi(line.substr(pos)));
    
          // Debugging aid: make sure we have extracted the data correctly.
          std::cout << "ID: " << actRecord.id
             << ", name: " << actRecord.name
             << ", age: " << (int)actRecord.age << std::endl;
          persons.push_back(actRecord);
       }
    
       // If we got here before EOF was reached, there was an
       // error in the input file.
       if ( !(ifs.eof()) ) {
           std::cerr << "Input format error!" << std::endl;
       } 
    }
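
    The rule described in the comments above (first token is the ID, last token is the age, everything in between is the name) can also be sketched by splitting the whole line into tokens first. This is an alternative illustration, not the answer's code: `Record` and `parse_record` are hypothetical names, and the age is kept as an unsigned int to keep the sketch short.

    ```cpp
    #include <cstddef>
    #include <sstream>
    #include <string>
    #include <vector>

    // Hypothetical record type for this sketch.
    struct Record {
        unsigned int id = 0;
        std::string name;
        unsigned int age = 0;
    };

    // Parses one line of the form "<id> <name word>... <age>", where tokens are
    // separated by any run of spaces or tabs. Returns false if the line has fewer
    // than three tokens or the first/last token does not start with a number.
    bool parse_record(const std::string& line, Record& out) {
        std::istringstream iss(line);
        std::vector<std::string> tokens;
        for (std::string tok; iss >> tok; ) tokens.push_back(tok);
        if (tokens.size() < 3) return false;

        std::istringstream first(tokens.front()), last(tokens.back());
        if (!(first >> out.id) || !(last >> out.age)) return false;

        // Join the middle tokens with single spaces to form the name.
        out.name.clear();
        for (std::size_t i = 1; i + 1 < tokens.size(); ++i) {
            if (!out.name.empty()) out.name += ' ';
            out.name += tokens[i];
        }
        return true;
    }
    ```

    Because `operator>>` skips any amount of whitespace, this variant needs no manual position bookkeeping; the trade-off is that the original spacing inside the name is normalized to single blanks.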
    
  • 2020-11-21 12:31

    What can I do to read in the separate words forming the name into the one actRecord.name variable?

    The general answer is: you can't, at least not without an additional delimiter specification and special-case parsing for the parts that form the intended actRecord.name contents.
    This is because a std::string field is parsed only up to the next occurrence of a whitespace character.

    It's noteworthy that some standard formats (e.g. .csv) may require distinguishing blanks (' ') from tabs ('\t') or other characters when delimiting record fields, a difference that may not be visible at first glance.
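
    To illustrate what an explicit delimiter buys you, here is a minimal sketch that assumes a hypothetical `id;name;age` line format (the semicolon, `DelimitedPerson` and `parse_delimited` are assumptions for this example, not something from the question). The three-argument overload of std::getline reads up to the given delimiter, so blanks inside the name survive.

    ```cpp
    #include <sstream>
    #include <string>

    // Hypothetical record type for this sketch.
    struct DelimitedPerson {
        unsigned int id = 0;
        std::string name;
        unsigned int age = 0;
    };

    // Parses a line of the assumed form "id;name;age".
    // Returns false if a field is missing or a numeric field does not parse.
    bool parse_delimited(const std::string& line, DelimitedPerson& out) {
        std::istringstream iss(line);
        std::string idText, ageText;
        if (!std::getline(iss, idText, ';')) return false;    // up to the first ';'
        if (!std::getline(iss, out.name, ';')) return false;  // may contain blanks
        if (!std::getline(iss, ageText)) return false;        // rest of the line
        std::istringstream idStream(idText), ageStream(ageText);
        return static_cast<bool>(idStream >> out.id) &&
               static_cast<bool>(ageStream >> out.age);
    }
    ```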

    Also note:
    To read a uint8_t value as numeric input, you'll have to go through a temporary unsigned int value. Reading directly into an unsigned char (aka uint8_t) extracts a single character instead of a number, which will screw up the stream parsing state.
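
    The detour through a temporary could look like the following sketch. `read_uint8` is a hypothetical helper name, and the range check is an addition of this example, assuming values above 255 should count as input errors.

    ```cpp
    #include <cstdint>
    #include <istream>
    #include <sstream>  // std::istringstream, handy for trying the helper out

    // Hypothetical helper: reads a decimal number into a std::uint8_t.
    // Extraction goes through an unsigned int, because operator>> treats
    // std::uint8_t (an alias for unsigned char on common platforms) as a character.
    bool read_uint8(std::istream& is, std::uint8_t& value) {
        unsigned int tmp = 0;
        if (!(is >> tmp)) return false;
        if (tmp > 255u) {
            is.setstate(std::ios_base::failbit);  // out of range for uint8_t
            return false;
        }
        value = static_cast<std::uint8_t>(tmp);
        return true;
    }
    ```

    On failure the target is left untouched and the stream's failbit is set, so the helper can be used in the same kind of `if (!...)` checks as a plain extraction.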
