Question
I have an ASCII file that looks like:
____________________________________________
Header1 ...
Header2 ...
Header3 ...
block(1)data1 block(1)data2 block(1)data3
block(1)data4 block(1)data5 block(1)data6
block(2)data1 block(2)data2 block(2)data3
block(2)data4 block(2)data5 block(2)data6
...
block(n)data1 block(n)data2 block(n)data3
block(n)data4 block(n)data5 block(n)data6
____________________________________________
I would like to convert it into an ASCII file that looks like:
____________________________________________
HeaderA ...
HeaderB ...
block(n)data1 block(n)data2 block(n)data3
block(n)data4 block(n)data5 block(n)data6
block(n-1)data1 block(n-1)data2 block(n-1)data3
block(n-1)data4 block(n-1)data5 block(n-1)data6
....
block(1)data1 block(1)data2 block(1)data3
block(1)data4 block(1)data5 block(1)data6
____________________________________________
The data are mainly real numbers, and the data set is far too big to hold in allocatable arrays, so I somehow have to read and write on the fly.
I could not find a way to read or write backward in a file.
Answer 1:
I would not use Fortran directly, but rather a sequence of Linux commands (or Cygwin / GNU utilities on Windows). Fortran is possible as well (see the second possibility below).
An outline (based on OS commands):
- get the total number of lines (e.g. with wc)
- copy the first 3 lines of the file (e.g. with head) to the result file
- process the main part
  - take all lines except the first 3 (e.g. with tail)
  - run the result through e.g. an awk script joining the relevant lines
  - run tac on the result
  - run another awk script splitting the lines
  - append the result to the result file
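The outline above can be sketched as a single pipeline. This is only an illustration under stated assumptions, not the answerer's exact commands: the function name reverse_blocks is hypothetical, the file is assumed to have 3 header lines that are copied unchanged and 2 lines per block, and no data line may contain the byte \001, which is used here as a temporary joiner. `tail -n +4` starts at line 4, so the line count from wc is not needed in this variant. `tac` is a GNU coreutils tool and may be missing on other systems.

```shell
#!/bin/sh
# Sketch: reverse the order of 2-line blocks, keeping the 3 header lines first.
# Assumptions (not from the original post): 3 header lines, 2 lines per block,
# an even number of data lines, and no \001 byte anywhere in the data.
reverse_blocks() {
    src=$1
    dst=$2
    # Copy the header lines to the result file.
    head -n 3 "$src" > "$dst"
    # Join each pair of data lines into one record, reverse the records,
    # then split every record back into its two lines.
    tail -n +4 "$src" |
        awk '{ printf "%s%s", $0, (NR % 2 ? "\001" : "\n") }' |
        tac |
        tr '\001' '\n' >> "$dst"
}
```

It would be called as `reverse_blocks data.txt reversed_data.txt`. Only tac needs to see the whole data set, and it works from a file or its own temporary file rather than from an in-memory array.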
Another idea would be (in a programming language):
- create an array with the starting file position of each block (i.e. the result of ftell)
- move the header to the new file
- run through the array created above, from the end to the beginning
  - do an fseek to the indicated position
  - read the relevant number of lines and write them out again
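The same index-then-seek idea can be tried without writing a program, using an offset file in place of the array and `tail -c` in place of fseek. The following is a rough sketch under assumptions that are not in the original answer: the function name reverse_by_offsets is hypothetical, the file has 3 header lines, 2 lines per block, and Unix line endings (one newline byte per line). `LC_ALL=C` is set so that awk's length() counts bytes rather than characters.

```shell
#!/bin/sh
# Sketch: record the byte offset of each block, then copy blocks last-to-first.
# Assumptions: 3 header lines, 2 lines per block, "\n" line endings.
reverse_by_offsets() {
    src=$1
    dst=$2
    idx=$(mktemp)
    # Pass 1: write the 0-based byte offset of the first line of every block.
    # The offset is printed before the current line's length is added to it.
    LC_ALL=C awk 'NR > 3 && (NR - 4) % 2 == 0 { printf "%.0f\n", off }
                  { off += length($0) + 1 }' "$src" > "$idx"
    # Move the header to the new file.
    head -n 3 "$src" > "$dst"
    # Pass 2: walk the offsets from the end to the beginning; for each one,
    # jump to that byte (tail -c counts from 1) and copy the block's 2 lines.
    tac "$idx" | while read -r off; do
        tail -c +"$((off + 1))" "$src" | head -n 2
    done >> "$dst"
    rm -f "$idx"
}
```

This starts two processes per block, so it only illustrates the technique; for millions of blocks a compiled program that keeps the file open and seeks within it would be far faster.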
Answer 2:
"way too big to use allocatable arrays."
If the data fits in memory, you can do it. I tested it out: a file like
header(1)
header(2)
header(3)
block(1).data1 block(1).data2 block(1).data3
block(1).data4 block(1).data5 block(1).data6
block(2).data1 block(2).data2 block(2).data3
block(2).data4 block(2).data5 block(2).data6
...
block(9999998).data1 block(9999998).data2 block(9999998).data3
block(9999998).data4 block(9999998).data5 block(9999998).data6
block(9999999).data1 block(9999999).data2 block(9999999).data3
block(9999999).data4 block(9999999).data5 block(9999999).data6
with a file size of 1.2GB could be reversed by this little awk script:
#!/usr/bin/awk -f
# If the line contains the word "header", print it immediately and move on to the next line.
/header/ {print; next}
# Store every other line in memory.
{
    line[n++] = $0
}
# When finished, print the blocks in reverse order, keeping the two lines
# of each block in their original order.
END {
    for (i=n-2; i>=0; i-=2) {
        print(line[i])
        print(line[i+1])
    }
}
in under 2 minutes.
If this is really not possible, you need to do what @high-performance-mark said: read the file in manageable chunks, reverse each chunk in memory, write it to a temporary file, and concatenate the temporary files in reverse order at the end. Here's my version:
program reverse_order
    use iso_fortran_env, only: IOSTAT_END
    implicit none
    integer, parameter :: max_blocks_in_memory = 10000
    integer, parameter :: max_line_length = 100
    character(len=max_line_length) :: line
    character(len=max_line_length) :: data(2, max_blocks_in_memory)
    character(len=*), parameter :: INFILE = 'data.txt'
    character(len=*), parameter :: OUTFILE = 'reversed_data.txt'
    character(len=*), parameter :: TMP_FILE_FORMAT = '("/tmp/", I10.10, ".txt")'
    character(len=len("/tmp/XXXXXXXXXX.txt")) :: tmp_file_name
    integer :: in_unit, out_unit, tmp_unit
    integer :: num_headers, i, j, tmp_file_number
    integer :: ios
    ! Open the input and output files.
    open(newunit=in_unit, file=INFILE, action="READ", status='OLD')
    open(newunit=out_unit, file=OUTFILE, action='WRITE', status='REPLACE')
    ! Transfer the headers to the output file immediately.
    num_headers = 0
    do
        read(in_unit, '(A)', iostat=ios) line
        if (ios /= 0) exit   ! File ended (or failed) before any data line.
        if (index(line, 'header') == 0) exit
        num_headers = num_headers + 1
        write(out_unit, '(A)') trim(line)
    end do
    ! We've already read the first data line, so let's rewind and start anew.
    rewind(in_unit)
    ! Move past the headers.
    do i = 1, num_headers
        read(in_unit, *)
    end do
    tmp_file_number = 0
    ! Read the data from the input file, max_blocks_in_memory blocks at a time.
    read_loop : do
        do i = 1, max_blocks_in_memory
            read(in_unit, '(A)', iostat=ios) data(1, i)
            if (ios == IOSTAT_END) then ! Reached the end of the input file.
                if (i > 1) then ! Still have final values in memory, write them
                                ! to output immediately.
                    do j = i-1, 1, -1
                        write(out_unit, '(A)') trim(data(1, j))
                        write(out_unit, '(A)') trim(data(2, j))
                    end do
                end if
                exit read_loop
            end if
            read(in_unit, '(A)') data(2, i)
        end do
        ! Read a full chunk of blocks; write it in reverse order into a temporary file.
        tmp_file_number = tmp_file_number + 1
        write(tmp_file_name, TMP_FILE_FORMAT) tmp_file_number
        open(newunit=tmp_unit, file=tmp_file_name, action="WRITE", status="NEW")
        do j = max_blocks_in_memory, 1, -1
            write(tmp_unit, '(A)') data(1, j)
            write(tmp_unit, '(A)') data(2, j)
        end do
        close(tmp_unit)
    end do read_loop
    ! Finished with input file, don't need it any more.
    close(unit=in_unit)
    ! Concatenate all the temporary files in reverse order to the output file.
    do j = tmp_file_number, 1, -1
        write(tmp_file_name, TMP_FILE_FORMAT) j
        open(newunit=tmp_unit, file=tmp_file_name, action="READ", status="OLD")
        do
            read(tmp_unit, '(A)', iostat=ios) line
            if (ios == IOSTAT_END) exit
            write(out_unit, '(A)') trim(line)
        end do
        close(tmp_unit, status="DELETE") ! Done with this file, delete it after closing.
    end do
    close(unit=out_unit)
end program reverse_order
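For comparison, the same chunked strategy can be tried with standard tools before committing to a compiled program. This is a sketch under assumptions, not a replacement for the Fortran version: the function name reverse_in_chunks and its third argument (blocks per chunk) are hypothetical, and the file is assumed to have 3 header lines and 2 lines per block. Each chunk is reversed with the same in-memory awk loop shown earlier, so only one chunk is held in memory at a time.

```shell
#!/bin/sh
# Sketch: split the data into chunks, reverse each chunk in memory,
# and emit the chunks from last to first.
# Assumptions: 3 header lines, 2 lines per block, even number of data lines.
reverse_in_chunks() {
    src=$1
    dst=$2
    chunk_blocks=${3:-10000}          # hypothetical tuning parameter
    tmp=$(mktemp -d)
    head -n 3 "$src" > "$dst"
    # Each chunk file holds a whole number of blocks (2 lines per block).
    tail -n +4 "$src" | split -l "$((chunk_blocks * 2))" -a 8 - "$tmp/chunk."
    # split names the chunks in ascending order, so reverse-sorted names
    # give the last chunk first.
    for f in $(ls "$tmp" | LC_ALL=C sort -r); do
        awk '{ line[n++] = $0 }
             END { for (i = n - 2; i >= 0; i -= 2) { print line[i]; print line[i + 1] } }' "$tmp/$f"
    done >> "$dst"
    rm -rf "$tmp"
}
```

A call such as `reverse_in_chunks data.txt reversed_data.txt 10000` mirrors max_blocks_in_memory = 10000 in the program above; the temporary directory needs roughly as much free space as the data itself.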
Answer 3:
Well, I sort of have an answer, but it didn't work, perhaps due to compiler bugs or my rudimentary understanding of file positioning in Fortran. My attempt was to open the input file with access = 'stream' and form = 'formatted'. This way I could push the line positions onto a stack and pop them off so they come out in reverse order. Then, walking through the lines in reverse order, I could write them into the output file.
program readblk
   implicit none
   integer iunit, junit
   integer i, size
   character(20) line
   type LLnode
      integer POS
      type(LLnode), pointer :: next => NULL()
   end type LLnode
   type(LLNODE), pointer :: list => NULL(), current => NULL()
   integer POS, temp(2)
   open(newunit=iunit,file='readblk.txt',status='old',access='stream',form='formatted')
   open(newunit=junit,file='writeblk.txt',status='replace')
   ! Copy the three header lines, 20 characters at a time.
   do i = 1, 3
      do
         read(iunit,'(a)',advance='no',EOR=10,size=size) line
         write(junit,'(a)',advance='no') line
      end do
      10 continue
      write(junit,'(a)') line(1:size)
   end do
   ! Push the start position of every remaining line onto a linked-list stack.
   do
      inquire(iunit,POS=POS)
      allocate(current)
      current%POS = POS
      current%next => list
      list => current
      read(iunit,'()',end=20)
   end do
   20 continue
   ! Discard the top entry, which is the position at end of file.
   current => list
   list => current%next
   deallocate(current)
   ! Pop two positions at a time (one block) and copy those two lines.
   do while(associated(list))
      temp(2) = list%POS
      current => list%next
      deallocate(list)
      temp(1) = current%POS
      list => current%next
      deallocate(current)
      do i = 1, 2
         write(*,*) temp(i)
         read(iunit,'(a)',advance='no',EOR=30,size=size,POS=temp(i)) line
         write(junit,'(a)',advance='no') line
         do
            read(iunit,'(a)',advance='no',EOR=30,size=size) line
            write(junit,'(a)',advance='no') line
         end do
         30 continue
         write(junit,'(a)') line(1:size)
      end do
   end do
end program readblk
Here is my input file:
Header line 1
Header line 2
Header line 3
1a34567890123456789012345678901234567890
1b34567890123456789012345678901234567890
2a34567890123456789012345678901234567890
2b34567890123456789012345678901234567890
3a34567890123456789012345678901234567890
3b34567890123456789012345678901234567890
Now with ifort my file positions were printed out as
214
256
130
172
44
88
Note that the first line is at the end of record 3 instead of the beginning of record 4. The output file was
Header line 1
Header line 2
Header line 3
3a34567890123456789012345678901234567890
3b34567890123456789012345678901234567890
2a34567890123456789012345678901234567890
2b34567890123456789012345678901234567890
1a34567890123456789012345678901234567890
With gfortran, the file positions printed out as
214
256
130
172
46
88
This time the first line is at the beginning of record 4, as I would expect. However, the output file had unfortunate contents:
Header line 1
Header line 2
Header line 3
3a34567890123456789012345678901234567890
3b34567890123456789012345678901234567890
2a34567890123456789012345678901234567890
2b34567890123456789012345678901234567890
3a34567890123456789012345678901234567890
3b345678901234567890123456789012341a34567890123456789012345678901234567890
I had hoped for a more positive result. I can't tell whether my results are due to poor programming or compiler bugs, but I posted in case someone else could possibly get my pure Fortran solution to work.
Source: https://stackoverflow.com/questions/51328728/in-fortran-how-to-write-backward-from-a-file-to-another-file-by-blocks-of-line