Simple reading of Fortran binary data not so simple in Python

Posted by 喜你入骨 on 2019-12-05 23:35:39

The size you get is due to alignment. Try struct.calcsize('idi') to verify that the size is actually 20 after alignment. To use the native byte order without alignment, prefix the format string with =, as in struct.calcsize('=idi'), and adapt your example accordingly.
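
As a quick check of the point above, the following minimal sketch compares the two format strings. The variable names are only illustrative, and the native size depends on the platform's alignment rules, so treat the 20 as typical rather than guaranteed.

```python
import struct

# Native mode (the default, same as '@') aligns members the way a C compiler would.
native_size = struct.calcsize('idi')

# '=' keeps the native byte order but uses standard sizes and no padding.
packed_size = struct.calcsize('=idi')

print(native_size)  # typically 20 where double needs 8-byte alignment
print(packed_size)  # 4 + 8 + 4 bytes
```

If the two numbers differ on your machine, the difference is the padding that the native format inserts before the double.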

For more info on the struct module, check http://docs.python.org/2/library/struct.html

The struct module is mainly intended to interoperate with C structures and because of this it aligns the data members. idi corresponds to the following C structure:

struct
{
   int int1;
   double double1;
   int int2;
};

double entries require 8-byte alignment in order to work efficiently (or even correctly) with most CPU load operations. That is why 4 bytes of padding are added between int1 and double1, which brings the size of the data to 20 bytes. (A C compiler would typically also pad the end of the structure, giving a sizeof of 24; the struct module adds no padding at the end.) The struct module performs the same padding between members unless you suppress it by adding < (on little-endian machines) or > (on big-endian machines), or simply =, at the beginning of the format string:

>>> struct.unpack('idi', d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 20
>>> struct.unpack('<idi', d)
(-1345385859, 2038.0682530887993, 428226400)
>>> struct.unpack('=idi', d)
(-1345385859, 2038.0682530887993, 428226400)

(d is a string of 16 random chars.)
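
The session above is from Python 2. As a rough Python 3 equivalent, the sketch below builds its own 16-byte value instead of random characters; the numbers packed into d are made up for illustration, and the failure of the native format assumes a platform where 'idi' needs 20 bytes.

```python
import struct

# Hypothetical stand-in for d: 16 bytes holding an int, a double and an int.
d = struct.pack('=idi', 7, 2038.5, 9)

try:
    # Native alignment: usually expects 20 bytes, so 16 bytes is rejected.
    native = struct.unpack('idi', d)
except struct.error as exc:
    native = None
    print('native format failed:', exc)

# No padding: expects exactly 16 bytes.
packed = struct.unpack('=idi', d)
print(packed)
```

In Python 3 the buffer is a bytes object, and the wording of the error message differs from the Python 2 message shown above.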

Arjaan Buijk

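I recommend using the array module to read a file that was written by FORTRAN as UNFORMATTED, SEQUENTIAL. Such files typically store each record between two length markers, which is why the code below reads one integer before and one after the data.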

Your specific example, using arrays, would be as follows:

import array
pad = array.array('i')
ver = array.array('d')
with open('myfile', 'rb') as binfile:
    pad.fromfile(binfile, 1)   # leading marker: length of the record in bytes
    ver.fromfile(binfile, 1)   # the actual data written by FORTRAN
    pad.fromfile(binfile, 1)   # trailing marker: the same length again
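
To try this without a FORTRAN program at hand, the following sketch writes a stand-in record to a temporary file and reads it back the same way. It assumes 4-byte record markers and native byte order, which is common but compiler dependent; read_double_record and the demo file are hypothetical names, not part of the original example.

```python
import array
import os
import struct
import tempfile

def read_double_record(path):
    """Read one record that holds a single double and return its value."""
    head = array.array('i')
    value = array.array('d')
    tail = array.array('i')
    with open(path, 'rb') as binfile:
        head.fromfile(binfile, 1)    # leading record marker
        value.fromfile(binfile, 1)   # the payload
        tail.fromfile(binfile, 1)    # trailing record marker
    # Both markers should equal the payload size in bytes.
    if head[0] != tail[0] or head[0] != value.itemsize:
        raise ValueError('record markers do not match the payload size')
    return value[0]

# Stand-in for the FORTRAN output: marker, one double, marker (assumed layout).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(struct.pack('=idi', 8, 2038.0682530887993, 8))

result = read_double_record(demo_path)
os.remove(demo_path)
print(result)
```

Comparing the two markers is a cheap way to notice a wrong marker size or byte order early.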

If your FORTRAN records contain arrays of integers and doubles, which is very common, your Python would look something like this:

import array
number_of_integers = 1000  # replace with how many you need to read
number_of_floats = 10000   # replace with how many you need to read
pad = array.array('i')
my_integers = array.array('i')
my_floats = array.array('d')
with open('myfile', 'rb') as binfile:
    pad.fromfile(binfile, 1)                           # leading record-length marker
    my_integers.fromfile(binfile, number_of_integers)  # read the integer data
    my_floats.fromfile(binfile, number_of_floats)      # read the double data
    pad.fromfile(binfile, 1)                           # trailing record-length marker
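
When the counts are not known in advance, the leading marker can drive the read instead. The sketch below is one possible way to do that, assuming 4-byte markers in native byte order and 4-byte integers; read_record and the in-memory file are hypothetical, used only so the example is self-contained.

```python
import array
import io
import struct

MARKER = struct.Struct('=i')  # assumed: 4-byte record marker, native byte order

def read_record(binfile):
    """Return the payload bytes of the next record, or None at end of file."""
    head = binfile.read(MARKER.size)
    if not head:
        return None
    (nbytes,) = MARKER.unpack(head)
    payload = binfile.read(nbytes)
    (tail,) = MARKER.unpack(binfile.read(MARKER.size))
    if len(payload) != nbytes or tail != nbytes:
        raise ValueError('corrupt or truncated record')
    return payload

# Stand-in for a file: one record with 3 integers followed by 2 doubles.
body = struct.pack('=3i2d', 1, 2, 3, 0.5, 1.5)
binfile = io.BytesIO(MARKER.pack(len(body)) + body + MARKER.pack(len(body)))

payload = read_record(binfile)
my_integers = array.array('i')
my_integers.frombytes(payload[:12])   # first 3 * 4 bytes are the integers
my_floats = array.array('d')
my_floats.frombytes(payload[12:])     # remaining 2 * 8 bytes are the doubles
print(my_integers.tolist(), my_floats.tolist())
```

You still need to know how the record is laid out (here, three integers and then doubles), but the total length no longer has to be hard-coded.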

A final comment: if the file contains characters, you can read those into an array as well and then decode the array into a string. Something like this:

import array
number_of_characters = 63  # replace with number of characters to read
pad = array.array('i')
my_characters = array.array('B')
with open('myfile', 'rb') as binfile:
    pad.fromfile(binfile, 1)                               # leading record-length marker
    my_characters.fromfile(binfile, number_of_characters)  # read the raw bytes
    pad.fromfile(binfile, 1)                               # trailing record-length marker
my_string = my_characters.tobytes().decode(encoding='utf_8')
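
One thing to watch for: FORTRAN CHARACTER variables have a fixed length and are padded with trailing blanks, so the decoded string usually needs stripping. A small sketch, assuming plain ASCII text; decode_fortran_text is a hypothetical helper and the sample bytes are made up:

```python
import array

def decode_fortran_text(characters):
    """Decode an array of bytes and drop the trailing blank padding."""
    return characters.tobytes().decode('ascii').rstrip(' ')

# Stand-in for a CHARACTER(LEN=10) value holding 'hello'.
my_characters = array.array('B', b'hello     ')
my_string = decode_fortran_text(my_characters)
print(repr(my_string))
```

Only trailing spaces are removed, so leading blanks that are part of the data survive.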