Question
I want to read double values from a binary file and store them in a vector. My values have the following form: 73.6634, 73.3295, 72.6764 and so on. I have this code that reads and stores data in memory. It works perfectly with char types, since the read function takes a char pointer as input (istream& read (char* s, streamsize n)). When I try to convert the char type to double, I obviously get integer values such as 74, 73, 73 and so on. Is there any function that lets me read double values directly, or any other way of doing that?
If I change char * memblock to double * memblock and memblock = new char[] to memblock = new double[], I get errors when compiling, because again the read function only accepts a char pointer as its input variable...
Thanks, I will appreciate your help :)
// reading an entire binary file
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  streampos size;
  char * memblock;
  int i=0;
  // open at the end of the file (ios::ate) so tellg() reports its size
  ifstream file ("example.bin", ios::in|ios::binary|ios::ate);
  if (file.is_open())
  {
    size = file.tellg();
    cout << "size=" << size << "\n";
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();
    cout << "the entire file content is in memory \n";
    for(i=0; i<=10; i++)
    {
      // this converts a single byte to double, which is the problem described above
      double value = memblock [i];
      cout << "value ("<<i<<")=" << value << "\n";
    }
    delete[] memblock;
  }
  else cout << "Unable to open file";
  return 0;
}
Answer 1:
(sorry about the "Like I'm 5" tone, I have no idea how much you know or don't)
Intro Binary Data
As you probably know, your computer doesn't think about numbers the way you do.
To start, the computer thinks about all numbers in a "base 2" system. But it doesn't stop there. Your computer also associates a fixed size with all the numbers. It creates a fixed "width" of the numbers. This size is (almost always) in bytes, or groups of 8 binary digits (bits). This is (pretty close to) the equivalent of looking at the numbers [1,15,30002], when you do math on them, as
[
00000001
00000015
00030002
]
(doubles are a little weirder, but I'll get to that in a second).
Let's pretend for demonstration purposes that each 2 characters above represent a single byte of data. This means that the computer thinks about the numbers like this:
[
00,00,00,01
00,00,00,15
00,03,00,02
]
File IO is all done at "byte" (char) granularity: it typically has no idea what it is reading. It is up to YOU to figure that out. When writing binary data to a file (from an array at least) we just dump it all. So in the example above, we write it all to the file like this:
[00,00,00,01,00,00,00,15,00,03,00,02]
But you'll have to reinterpret it, back into the type of 4 bytes.
Luckily, this is stupidly easy to do in C++:
size = file.tellg();
cout << "size=" << size << "\n";
memblock = new char [size];
file.seekg (0, ios::beg);
file.read (memblock, size);
file.close();
cout << "the entire file content is in memory \n";
// reinterpret the raw bytes as doubles (assumes the file holds doubles in this machine's format)
double* double_values = (double*)memblock;
// print up to 11 values, but never read past the end of the buffer
for(i=0; i<=10 && (i+1)*sizeof(double) <= (size_t)size; i++)
{
  double value = double_values[i];
  cout << "value ("<<i<<")=" << value << "\n";
}
What this basically does is say: interpret those bytes (char) as doubles.
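Since the question asks for the values in a vector, here is a minimal sketch of the same idea that lets read() fill a std::vector<double> directly, so no cast of the char buffer is needed afterwards. The function name read_doubles is made up for illustration, and the sketch assumes the file contains nothing but doubles in the same format and byte order as the machine reading it.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): reads a whole file of raw
// doubles into a vector. Assumes the file uses this machine's double format
// and byte order. Returns an empty vector if the file cannot be read.
std::vector<double> read_doubles(const std::string& path) {
    // ios::ate opens the file positioned at the end, so tellg() gives its size.
    std::ifstream file(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file) {
        return {};
    }
    const std::streamoff bytes = file.tellg();
    if (bytes <= 0) {
        return {};
    }
    // Any trailing bytes that do not form a full double are ignored.
    const std::size_t count = static_cast<std::size_t>(bytes) / sizeof(double);
    std::vector<double> values(count);

    file.seekg(0, std::ios::beg);
    // read() wants a char*, so hand it the vector's storage viewed as bytes.
    file.read(reinterpret_cast<char*>(values.data()),
              static_cast<std::streamsize>(count * sizeof(double)));
    if (!file) {
        return {};  // the read came up short
    }
    return values;
}
```

Calling read_doubles("example.bin") would then give you the values to loop over, with values.size() telling you how many there are.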
edit: Endianness
Endianness is (again, LI5) the order in which the computer writes the bytes of a number. You are used to twenty-five being written left to right (25, twenty-five), but it would be just as valid to write the number from right to left (52, five-twenty). We have big-endian (Most Significant Byte at the lowest address) and little-endian (MSB at the highest address).
This was never standardized between architectures or virtual machines...but if they disagree you can get weird results.
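To make the "weird results" concrete, here is a rough sketch of how you could check the byte order of your own machine and decode a double that was stored in the opposite order. Both function names are hypothetical, and the sketch assumes both sides use the same 8-byte double format and differ only in byte order.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// True when this machine stores the least significant byte first.
bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);  // look at the byte at the lowest address
    return first_byte == 1;
}

// Hypothetical helper: decodes sizeof(double) bytes that were written in the
// byte order opposite to this machine's, by reversing them before use.
double double_from_swapped_bytes(const unsigned char* raw) {
    unsigned char bytes[sizeof(double)];
    std::reverse_copy(raw, raw + sizeof(double), bytes);
    double value;
    std::memcpy(&value, bytes, sizeof(double));
    return value;
}
```

The bytes are reversed before they are turned into a double, so no half-decoded value is ever treated as a number.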
A special case: doubles
Not really in line with your question, but I have to point out that doubles are a special case: while reading and writing looks the same, the underlying data isn't just a simple number. I like to think of doubles as the "scientific notation" of computers. The double standard uses a base and a power to get your number: in the same amount of space as a 64-bit integer it stores a sign, a mantissa a and an exponent x, giving (sign)(a * 2^x). This gives a much larger dynamic range of representable values, BUT you lose a certain sense of "human readability" of the bytes, and you get the SAME number of distinct values, so you can lose precision (though it's relative precision, just like scientific notation, so you may not be able to distinguish a billion and 1 from a billion and 2, but that 1 and 2 are TINY compared to the number).
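If you want to see the "scientific notation" for yourself, something like the following sketch pulls the three fields out of a double. It assumes the usual IEEE 754 64-bit layout (1 sign bit, 11 exponent bits, 52 fraction bits) and only makes sense for normal values, not zero, subnormals, infinity or NaN. The struct and function names are invented for this example.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative names, not part of any library.
struct DoubleParts {
    bool negative;           // the sign bit
    int exponent;            // power of two, with the bias of 1023 removed
    std::uint64_t fraction;  // the 52 stored fraction bits
};

// Assumes IEEE 754 binary64. For a normal value the number equals
// (-1)^negative * (1 + fraction / 2^52) * 2^exponent.
DoubleParts split_double(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the raw bit pattern

    DoubleParts parts;
    parts.negative = (bits >> 63) != 0;
    parts.exponent = static_cast<int>((bits >> 52) & 0x7FF) - 1023;
    parts.fraction = bits & ((std::uint64_t{1} << 52) - 1);
    return parts;
}
```

For example, 73.6634 lies between 2^6 and 2^7, so split_double(73.6634).exponent is 6 under these assumptions.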
writing data in C++
We might as well point out one quirk of C++: you have to make sure that when you write the data, the stream doesn't try to reformat it as ASCII text (open the file in binary mode and write the raw bytes). http://www.cplusplus.com/forum/general/21018/
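As a rough illustration of the writing side, the sketch below opens the file with ios::binary and uses write() on the raw bytes instead of the << operator, which would print the numbers as text. The name write_doubles is hypothetical, and the resulting file is only readable by a program that expects this machine's double format and byte order.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): dumps the raw bytes of every
// double in `values` to `path`. Returns false if the file could not be written.
bool write_doubles(const std::string& path, const std::vector<double>& values) {
    // ios::binary stops the stream from translating any bytes on the way out.
    std::ofstream file(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!file) {
        return false;
    }
    // write() copies bytes as they are; `file << value` would emit text instead.
    file.write(reinterpret_cast<const char*>(values.data()),
               static_cast<std::streamsize>(values.size() * sizeof(double)));
    file.flush();
    return file.good();
}
```

A file written this way takes exactly values.size() * sizeof(double) bytes, which is a quick sanity check on the result.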
Answer 2:
The issue is this -- there is no guarantee that binary data written by one program (you said Matlab) can be read back by another program merely by casting, unless you know that the data written by that other program has the same format as data written by your program.
It may not be sufficient to just cast -- you need to know the exact form of the data that is written. You need to know the binary format (for example IEEE), the number of bytes each value occupies, endianness, etc. so that you can interpret the data correctly.
What you should do is this -- write a small program that writes out the numbers you claim this file has to another file. Then look at the file you just wrote in a hex editor. Then take the file you're attempting to read that was created by Matlab and compare the contents side-by-side with the one you just wrote. Do you see a pattern? If not, then either you have to find one, or forget about it and get the two files to be the same.
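For the comparison step, a small helper like the one sketched here can show how your own program lays out a given double, so you know which bytes to look for in the hex editor. The name hex_bytes is made up for this example. It prints the bytes in the order they sit in memory on the machine running it, which is the order a raw binary dump would have.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: formats the bytes of `value`, in memory order, as
// two-digit lowercase hex pairs separated by spaces.
std::string hex_bytes(double value) {
    unsigned char bytes[sizeof(double)];
    std::memcpy(bytes, &value, sizeof(double));

    std::string text;
    char pair[4];  // two hex digits, a space, and the terminating null
    for (unsigned char b : bytes) {
        std::snprintf(pair, sizeof pair, "%02x ", static_cast<unsigned>(b));
        text += pair;
    }
    if (!text.empty()) {
        text.pop_back();  // drop the trailing space
    }
    return text;
}
```

On a little-endian machine with IEEE 754 doubles, hex_bytes(1.0) gives 00 00 00 00 00 00 f0 3f. If the other file holds the same bytes in reverse order, the difference is endianness.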
Source: https://stackoverflow.com/questions/23066176/read-double-type-data-from-binary-file