Reading and writing Fortran direct-access unformatted files with different compilers

臣服心动 2020-12-01 15:22

I have a section in a program that writes a direct-access binary file as follows:

open (53, file=filename, form='unformatted', status='unknown',
& ac         


        
1 answer
  • 2020-12-01 15:43

    ifort and gfortran do not use the same unit for record length by default. In ifort, the value of recl in your open statement is counted in 4-byte blocks, so your record length isn't 985,600 bytes; it is 3,942,400 bytes. That means the records are written at intervals of about 3.9 million bytes.

    gfortran uses a recl block size of 1 byte, so there your record length is 985,600 bytes. When you read the first record, everything works, but when you read the second record you look 985,600 bytes into the file while the data actually starts 3,942,400 bytes into the file. This also means you are wasting a lot of space in the file, as the data occupies only 1/4 of its size.

    There are a few ways to fix this:

    • In ifort specify recl in 4-byte blocks, e.g. 320*385*2 instead of *8
    • In ifort, use the compile flag -assume byterecl to have recl values interpreted as bytes.
    • In gfortran compensate for the size and use recl=320*385*32 so that your reads are correctly positioned.
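
    The arithmetic behind these numbers can be checked with a small sketch. It assumes the 320x385 array of 8-byte values from the question and the two unit sizes described above (4 bytes for default ifort, 1 byte for gfortran or ifort with -assume byterecl). The program and variable names are mine, for illustration only:

    ```fortran
    program recl_arithmetic
      implicit none
      ! Array shape and element size from the question: 320 x 385 values, 8 bytes each.
      integer, parameter :: nx = 320, ny = 385, elem_bytes = 8
      integer, parameter :: ifort_unit = 4   ! assumed bytes per recl unit, ifort default
      integer, parameter :: byte_unit  = 1   ! bytes per recl unit, gfortran or -assume byterecl
      integer :: recl_value

      recl_value = nx*ny*elem_bytes          ! the hardcoded recl, 985600

      ! Physical record size = recl value * bytes per recl unit.
      print *, 'gfortran, recl=320*385*8  :', recl_value*byte_unit      ! 985600 bytes
      print *, 'ifort,    recl=320*385*8  :', recl_value*ifort_unit     ! 3942400 bytes
      print *, 'ifort,    recl=320*385*2  :', nx*ny*2*ifort_unit        ! 985600 bytes
      print *, 'gfortran, recl=320*385*32 :', nx*ny*32*byte_unit        ! 3942400 bytes
    end program recl_arithmetic
    ```

    The first and third lines agree (985,600 bytes) and the second and fourth agree (3,942,400 bytes), which is why either hardcoded fix lines the two compilers up again.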

    A better way, however, is to make the code agnostic to the recl unit size. You can use inquire to find the recl needed for an array. For example:

    real(kind=wp), allocatable, dimension(:,:) :: recltest
    integer :: reclen
    allocate(recltest(320,385))
    inquire(iolength=reclen) recltest
    deallocate(recltest)
    ...
    open (53, file=filename, form='unformatted', status='unknown',
    & access='direct',action='write',recl=reclen)
    ...
    OPEN(53, FILE=fname, form="unformatted", status="unknown", &
    access="direct", action="read", recl=reclen)
    

    This will set reclen to the value needed to store a 320x385 array, based on that compiler's base unit for record length. If you use this when both writing and reading, your code will work with both compilers without having to use compile-time flags in ifort or compensate with hardcoded recl differences between compilers.


    An illustrative example

    Testcase 1

    program test
      use iso_fortran_env
      implicit none
    
      integer(kind=int64), dimension(5) :: array
      integer :: io_output, reclen, i
      reclen = 5*8 ! 5 elements of 8 byte integers.
    
      open(newunit=io_output, file='output', form='unformatted', status='new', &
           access='direct', action='write', recl=reclen)
    
      array = [(i,i=1,5)]  
      write (io_output, rec=1) array
      array = [(i,i=101,105)]
      write (io_output, rec=2) array
      array = [(i,i=1001,1005)]
      write (io_output, rec=3) array
    
      close(io_output)
    end program test
    

    This program writes an array of five 8-byte integers 3 times to the file, in records 1, 2 and 3. The array is 5*8 bytes and I have hardcoded that number as the recl value.

    Testcase 1 with gfortran 5.2

    I compiled this testcase with the command line:

    gfortran -o write-gfortran write.f90
    

    This produces the output file (interpreted with od -A d -t d8):

    0000000                    1                    2
    0000016                    3                    4
    0000032                    5                  101
    0000048                  102                  103
    0000064                  104                  105
    0000080                 1001                 1002
    0000096                 1003                 1004
    0000112                 1005
    0000120
    

    The arrays of five 8-byte elements are packed contiguously into the file, and record number 2 (101 ... 105) starts where we would expect it to, at offset 40, which is the recl value of 5*8.

    Testcase 1 with ifort 16

    This is compiled similarly:

    ifort -o write-ifort write.f90
    

    And this, for the exact same code, produces the output file (interpreted with od -A d -t d8):

    0000000                    1                    2
    0000016                    3                    4
    0000032                    5                    0
    0000048                    0                    0
    *
    0000160                  101                  102
    0000176                  103                  104
    0000192                  105                    0
    0000208                    0                    0
    *
    0000320                 1001                 1002
    0000336                 1003                 1004
    0000352                 1005                    0
    0000368                    0                    0
    *
    0000480
    

    The data is all there, but the file is full of zero-valued elements. The lines starting with * indicate that every line between the offsets is 0. Record number 2 starts at offset 160 instead of 40. Notice that 160 is 40*4, where 40 is our specified recl of 5*8. By default ifort uses 4-byte blocks, so a recl of 40 means a physical record size of 160 bytes.

    If code compiled with gfortran were to read this with recl=40, records 2, 3 and 4 would contain all 0 elements, and a read of record 5 would correctly read the array written as record 2 by ifort. An alternative, to have gfortran read record 2 where it lies in the file, would be to use recl=160 (4*5*8) so that the physical record size matches what was written by ifort.
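
    To make that mapping concrete, here is a minimal sketch that computes where each ifort-written record starts and which gfortran record number (with the same recl=40) lands on that offset. It does no file I/O; the 4-byte unit is the assumed ifort default discussed above, and the names are hypothetical:

    ```fortran
    program record_offsets
      implicit none
      integer, parameter :: recl_value = 40   ! 5*8, as hardcoded in Testcase 1
      integer, parameter :: ifort_unit = 4    ! assumed bytes per recl unit (ifort default)
      integer :: k, offset, gf_rec

      do k = 1, 3
        ! Byte offset at which default ifort placed record k.
        offset = (k - 1)*recl_value*ifort_unit
        ! gfortran treats recl=40 as 40 bytes, so record n starts at (n-1)*40.
        gf_rec = offset/recl_value + 1
        print *, 'ifort record', k, 'starts at byte', offset, &
                 '-> gfortran record', gf_rec
      end do
    end program record_offsets
    ```

    The offsets come out as 0, 160 and 320, matching the od dump above, and the corresponding gfortran record numbers are 1, 5 and 9.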

    Another consequence of this is wasted space. Over-specifying the recl means you are using 4 times the necessary disk space to store your records.

    Testcase 1 with ifort 16 and -assume byterecl

    This was compiled as:

    ifort -assume byterecl -o write-ifort write.f90
    

    And produces the output file:

    0000000                    1                    2
    0000016                    3                    4
    0000032                    5                  101
    0000048                  102                  103
    0000064                  104                  105
    0000080                 1001                 1002
    0000096                 1003                 1004
    0000112                 1005
    0000120
    

    This produces the file as expected. The command line argument -assume byterecl tells ifort to interpret any recl values as bytes rather than double words (4-byte blocks). This will produce writes and reads that match code compiled with gfortran.

    Testcase 2

    program test
      use iso_fortran_env
      implicit none
    
      integer(kind=int64), dimension(5) :: array
      integer :: io_output, reclen, i
      inquire(iolength=reclen) array
      print *,'Using recl=',reclen
    
      open(newunit=io_output, file='output', form='unformatted', status='new', &
           access='direct', action='write', recl=reclen)
    
      array = [(i,i=1,5)]  
      write (io_output, rec=1) array
      array = [(i,i=101,105)]
      write (io_output, rec=2) array
      array = [(i,i=1001,1005)]
      write (io_output, rec=3) array
    
      close(io_output)
    end program test
    

    The only difference in this testcase is that I am inquiring the proper recl to represent my 40-byte array (5 8-byte integers).

    The output

    gfortran 5.2:

     Using recl=          40
    

    ifort 16, no options:

     Using recl=          10
    

    ifort 16, -assume byterecl:

     Using recl=          40
    

    We see that with the 1-byte blocks used by gfortran, and by ifort with the byterecl assumption, recl is 40, which equals the size of our 40-byte array. We also see that by default ifort uses a recl of 10, which means 10 4-byte blocks (10 double words), which is again 40 bytes. All three of these cases produce identical file output, and reads and writes from either compiler will function properly.
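
    The testcases above only write. As a rough sketch of the reading side, the program below writes two records and reads the second one back, using the same inquire(iolength=...) value for the open. The file name roundtrip.dat and the readback array are hypothetical, and this only demonstrates a round trip within one compiler; the cross-compiler case relies on the reader running the same inquire:

    ```fortran
    program roundtrip
      use iso_fortran_env, only: int64
      implicit none

      integer(kind=int64), dimension(5) :: array, readback
      integer :: io_unit, reclen, i

      ! Ask the compiler for the record length in its own recl units.
      inquire(iolength=reclen) array

      ! 'roundtrip.dat' is a hypothetical scratch file name.
      open(newunit=io_unit, file='roundtrip.dat', form='unformatted', &
           status='replace', access='direct', action='readwrite', recl=reclen)

      array = [(int(i, int64), i=1,5)]
      write (io_unit, rec=1) array
      array = [(int(i, int64), i=101,105)]
      write (io_unit, rec=2) array

      ! Read record 2 back and check it matches what was just written.
      read (io_unit, rec=2) readback
      if (any(readback /= array)) error stop 'record 2 did not round-trip'
      print *, 'record 2:', readback

      ! Remove the scratch file again.
      close(io_unit, status='delete')
    end program roundtrip
    ```

    Because reclen is never written as a literal, the same source should behave the same way whether recl is counted in bytes or in 4-byte blocks.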

    Summary

    To have record-based, unformatted, direct-access data be portable between ifort and gfortran, the easiest option is to just add -assume byterecl to the flags used by ifort. You really should have been doing this already, since you are specifying record lengths in bytes, so this would be a straightforward change that probably has no consequences for you.

    The other alternative is to not worry about the option and use the inquire statement to query the iolength for your array.
