Reading Binary File into a Structure (C++)

遇见更好的自我 2021-02-04 15:42

So I'm having trouble reading a binary file into my structure correctly. The structure is this:

struct Student
{
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

5 Answers
  • 2021-02-04 16:13

    Without seeing the code that writes the data, I'm guessing that you write the data the way you read it in the first example, each element one by one. Then each record in the file will indeed be 37 bytes.

    However, since the compiler pads structures to put members on nice boundaries for optimization reasons, your structure is 40 bytes. So when you read the complete structure in a single call, then you actually read 40 bytes at a time, which means that your reading will go out of phase with the actual records in the file.

    You either have to re-implement the writing to write the complete structure in one go, or use the first method of reading where you're reading one member field at a time.
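
    To see the mismatch on your own compiler, a small sketch along these lines may help. The struct is the one from the question; the helper names packed_record_size, padding_after_name and total_padding are made up for this illustration.

    ```cpp
    #include <cstddef>   // offsetof, std::size_t

    // Same members as the struct in the question.
    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };

    // Bytes per record when every member is written separately (no padding).
    // Hypothetical helper, not part of the original code.
    constexpr std::size_t packed_record_size()
    {
        return sizeof(Student::name) + 3 * sizeof(int);
    }

    // Padding bytes the compiler inserted between name and quiz1.
    constexpr std::size_t padding_after_name()
    {
        return offsetof(Student, quiz1) - sizeof(Student::name);
    }

    // Total padding in the struct: in-memory size minus the sum of the members.
    constexpr std::size_t total_padding()
    {
        return sizeof(Student) - packed_record_size();
    }
    ```

    On a typical platform with a 4-byte, 4-byte-aligned int, packed_record_size() is 37, padding_after_name() is 3 and sizeof(Student) is 40. Other platforms may give different numbers, which is exactly why dumping the raw struct is fragile.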

  • 2021-02-04 16:21

    A simple workaround is to pack your structure to 1-byte alignment.

    Using GCC:

    struct __attribute__((packed)) Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    

    Using MSVC:

    #pragma pack(push, 1) // set packing alignment to 1 byte, saving the previous value
    struct  Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    #pragma pack(pop) //restore previous pack value
    

    EDIT: As user ahans states, #pragma pack has been supported by GCC since version 2.7.2.3 (released in 1997), so it seems safe to use #pragma pack as the only packing notation if you are targeting MSVC and GCC.
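
    With the struct packed, a whole record can be read in one call. The following is only a sketch: read_packed is a hypothetical helper, and it assumes the file was written on a machine with the same int size and byte order.

    ```cpp
    #include <cstddef>   // offsetof
    #include <istream>

    #pragma pack(push, 1)
    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    #pragma pack(pop)

    // With 1-byte packing the in-memory layout matches the on-disk record.
    static_assert(sizeof(Student) == 25 + 3 * sizeof(int), "struct is not packed");
    static_assert(offsetof(Student, quiz1) == 25, "unexpected padding after name");

    // Hypothetical helper: reads one whole record, returns false if the read
    // failed or the stream ran out of data.
    bool read_packed(std::istream& in, Student& s)
    {
        in.read(reinterpret_cast<char*>(&s), sizeof s);
        return static_cast<bool>(in);
    }
    ```

    The static_assert lines make the build fail if a compiler ignores the pragma, rather than letting the reads silently drift out of phase.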

  • 2021-02-04 16:22

    There is more than one way to solve the problem of this thread. Here is a solution based on using union of a struct and a char buf:

    #include <cstdlib>   // exit
    #include <fstream>
    #include <sstream>
    #include <iomanip>
    #include <string>
    
    /*
    This is the main idea of the technique: put the struct
    inside a union, alongside a char array with exactly as
    many chars as the struct occupies.
    
    The union causes field and buf to be at the exact same
    place in memory. They overlap each other!
    */
    union uStudent
    {
        struct sStudent
        {
            char name[25];
            int quiz1;
            int quiz2;
            int quiz3;
        } field;
    
        char buf[ sizeof(sStudent) ];    // sizeof calcs the number of chars needed
    };
    
    void create_data_file(std::fstream& file, uStudent* oStudent, int idx)
    {
        if (idx < 0)
        {
            // index has gone past the beginning of the oStudent array: start unwinding
            return;
        }
    
        // have not yet reached idx = -1. Recurse first (head recursion),
        // so records are written in order 0..idx as the calls return
        create_data_file(file, oStudent, idx - 1);
    
        // write a record
        file.write(oStudent[idx].buf, sizeof(uStudent));
    }
    
    
    std::string read_in_data_file(std::fstream& file, std::stringstream& strm_buf)
    {
        // allocate a buffer of the correct size
        uStudent temp_student;
    
        // read in to buffer
        file.read( temp_student.buf, sizeof(uStudent) );
    
        // at end of file, or did the read fail for another reason?
        if (!file)
        {
            // finished
            return strm_buf.str();
        }
    
        // not at end of file. Stuff buf for display
        strm_buf << std::setw(25) << std::left << temp_student.field.name;
        strm_buf << std::setw(5) << std::right << temp_student.field.quiz1;
        strm_buf << std::setw(5) << std::right << temp_student.field.quiz2;
        strm_buf << std::setw(5) << std::right << temp_student.field.quiz3;
        strm_buf << std::endl;
    
        // tail recurse to read the next record, if there is one
        return read_in_data_file(file, strm_buf);
    }
    
    
    
    std::string quiz(void)
    {
    
        /*
        declare and initialize array of uStudent to facilitate
        writing out the data file and then demonstrating
        reading it back in.
        */
        uStudent oStudent[] =
        {
            {"Bart Simpson",          75,   65,   70},
            {"Ralph Wiggum",          35,   60,   44},
            {"Lisa Simpson",         100,   98,   91},
            {"Martin Prince",         99,   98,   99},
            {"Milhouse Van Houten",   80,   87,   79}
    
        };
    
    
    
    
        std::fstream file;
    
        // std::ios::trunc causes the file to be created if it does not already exist.
        // std::ios::trunc also causes the file to be emptied if it does already exist.
        file.open("quizzes.dat", std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc);
    
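        if ( ! file.is_open() )
        {
            // ShowMessage is not standard C++ (it looks like the C++Builder VCL
            // dialog helper); substitute your own error reporting elsewhere.
            ShowMessage( "File did not open" );
            exit(1);
        }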
    
    
        // create the data file
        int num_elements = sizeof(oStudent) / sizeof(uStudent);
        create_data_file(file, oStudent, num_elements - 1);
    
        // Don't forget
        file.flush();
    
        /*
        We wrote the integers in their raw binary form, so you cannot easily
        check the file with a common text editor such as Windows Notepad.
    
        You would need an editor that shows hex values or something similar.
        An integrated development environment (IDE) is likely to have such
        an editor, though not always.
        */
    
    
        /*
        Now, read the file back in for display. Reading into a string buffer
        for display all at once. Can modify code to display the string buffer
        wherever you want.
        */
    
        // make sure at beginning of file
        file.seekg(0, std::ios::beg);
    
        std::stringstream strm_buf;
        strm_buf.str( read_in_data_file(file, strm_buf) );
    
        file.close();
    
        return strm_buf.str();
    }
    

    Call quiz() and receive a string formatted for display to std::cout, writing to a file, or whatever.

    The main idea is that all the members of a union start at the same address in memory. So you can have a char or wchar_t buf that is the same size as the struct you want to write to or read from a file. Notice that no casts are needed: there is not a single cast in the code.

    I also did not have to worry about padding, because the file is written and read through the same union: whatever padding the compiler adds to the struct is present in every record on disk.

    For those who do not like recursion, sorry. Working it out with recursion is easier and less error prone for me. Maybe not easier for others? The recursions can be converted to loops. And they would need to be converted to loops for very large files.

    For those who like recursions, this is yet another instance of using recursion.

    I don't claim that using a union is the best solution, but it does seem to be a solution. Maybe you like it?
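
    For anyone who wants the loop form mentioned above, here is a rough sketch of how the two recursions might be rewritten. The names write_records and read_records are made up for this example. Like the recursive version, it reads a union member other than the one last written, which the C++ standard does not define; it relies on common compilers supporting this (GCC documents it as an extension).

    ```cpp
    #include <cstddef>   // std::size_t
    #include <istream>
    #include <ostream>
    #include <vector>

    // Same union as above: the struct and a char buffer of equal size overlap.
    union uStudent
    {
        struct sStudent
        {
            char name[25];
            int quiz1;
            int quiz2;
            int quiz3;
        } field;

        char buf[ sizeof(sStudent) ];
    };

    // Loop replacement for create_data_file: writes records 0..count-1 in order.
    void write_records(std::ostream& out, const uStudent* students, std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i)
        {
            out.write(students[i].buf, sizeof(uStudent));
        }
    }

    // Loop replacement for read_in_data_file: collects complete records until
    // a read fails (normally because the end of the file was reached).
    std::vector<uStudent> read_records(std::istream& in)
    {
        std::vector<uStudent> records;
        uStudent temp;
        while (in.read(temp.buf, sizeof(uStudent)))
        {
            records.push_back(temp);
        }
        return records;
    }
    ```

    The loops use constant stack space, so the file size no longer limits how many records can be processed. Formatting for display can then be done over the returned vector with the same std::setw calls as in quiz().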

  • 2021-02-04 16:23

    As you've already found out, the padding is the issue here. Also, as others have suggested, the proper way of solving this is to read each member individually, as you've done in your example. I don't expect this to cost much more, performance-wise, than reading the whole thing at once. However, if you still want to go ahead and read it all at once, you can tell the compiler to do the padding differently:

    #pragma pack(push, 1)
    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    #pragma pack(pop)
    

    With #pragma pack(push, 1) you tell the compiler to save the current pack value on an internal stack and use a pack value of 1 thereafter. This means you get an alignment of 1 byte, which means no padding at all in this case. With #pragma pack(pop) you tell the compiler to get the last value from the stack and use this thereafter, thereby restoring the behavior the compiler used before the definition of your struct.

    While #pragma usually indicates non-portable, compiler-dependent features, this one works at least with GCC and Microsoft VC++.
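
    A quick way to convince yourself that the push/pop pair only affects the struct in between is to declare the same members twice, once inside the pragma and once after it. PackedStudent and DefaultStudent are names invented for this sketch.

    ```cpp
    #pragma pack(push, 1)
    // Declared while the pack value is 1: no padding, alignment of 1.
    struct PackedStudent
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    #pragma pack(pop)

    // Declared after the pop: the compiler's previous packing is back in effect.
    struct DefaultStudent
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };

    static_assert(sizeof(PackedStudent) == 25 + 3 * sizeof(int),
                  "packed struct should contain no padding");
    static_assert(alignof(PackedStudent) == 1,
                  "packed struct should have 1-byte alignment");
    static_assert(alignof(DefaultStudent) == alignof(int),
                  "default struct should be aligned like its int members");
    ```

    If all three assertions compile, the pragma took effect for the first struct and was undone for the second, so other structs later in the same file keep their normal layout.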

  • 2021-02-04 16:27

    Your struct has almost certainly been padded to preserve the alignment of its content. This means that it will not be 37 bytes, and that mismatch causes the reading to go out of sync. Looking at the way each string is losing 3 characters, it seems that it has been padded to 40 bytes.

    As the padding is likely to be between the string and the integers, not even the first record reads correctly.

    In this case I would recommend not attempting to read your data as a binary blob, and sticking to reading individual fields. It's far more robust, especially if you ever want to alter your structure.
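
    Since the original reading code is not shown here, the following is only a sketch of what field-by-field I/O could look like. read_student and write_student are hypothetical names, and the code assumes the file uses the same int size and byte order as the machine reading it.

    ```cpp
    #include <istream>
    #include <ostream>

    // Ordinary, unpacked struct: its in-memory padding never touches the file.
    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };

    // Writes one record as 25 name bytes followed by three raw ints.
    bool write_student(std::ostream& out, const Student& s)
    {
        out.write(s.name, sizeof s.name);
        out.write(reinterpret_cast<const char*>(&s.quiz1), sizeof s.quiz1);
        out.write(reinterpret_cast<const char*>(&s.quiz2), sizeof s.quiz2);
        out.write(reinterpret_cast<const char*>(&s.quiz3), sizeof s.quiz3);
        return static_cast<bool>(out);
    }

    // Reads one record in the same order; returns false on end of file or error.
    bool read_student(std::istream& in, Student& s)
    {
        in.read(s.name, sizeof s.name);
        in.read(reinterpret_cast<char*>(&s.quiz1), sizeof s.quiz1);
        in.read(reinterpret_cast<char*>(&s.quiz2), sizeof s.quiz2);
        in.read(reinterpret_cast<char*>(&s.quiz3), sizeof s.quiz3);
        return static_cast<bool>(in);
    }
    ```

    Because each member is transferred on its own, adding or reordering members in the struct only requires matching changes in these two functions, not a compiler-specific packing directive.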
