问题
I am using Fortran 90 to read a file that contains data in the following format
number# 125 var1= 2 var2= 1 var3: 4
.
.
.
.
number# 234 var1= 3 var2= 5 var3: 1
I tried the following command and works fine
read (2,*) tempstr , my_param(1), tempstr , my_param(2), tempstr , my_param(3)
Problem is when the numbers become larger and there is no space between string and number, i.e. the data looks as following:
number# 125 var1= 2 var2=124 var3: 4
I tried
read (2,512) my_param(1), my_param(2), my_param(3)
512 format('number#', i, 'var1=', i, 'var2=', i, 'var3:', i)
It reads all number as zero
I can't switch to some other language. The data set is huge, so I can't pre-process it. Also, the delimiters are not the same every time. Can someone please help with the problem?
Thanks in advance
回答1:
While I still stand with my original answer, particularly because the input data is already so close to what a namelist file would look like, let's assume that you really can't make any preprocessing of the data beforehand.
The next best thing is to read in the whole line into a character(len=<enough>)
variable, then extract the values out of that with String Manipulation. Something like this:
program mixed2
implicit none
integer :: num, val1, val2, val3
character(len=50) :: line
integer :: io_stat
open(unit=100, file='data.dat', action='READ', status='OLD')
do
read(100, '(A)', iostat=io_stat) line
if (io_stat /= 0) exit
call get_values(line, num, val1, val2, val3)
print *, num, val1, val2, val3
end do
close(100)
contains
subroutine get_values(line, n, v1, v2, v3)
implicit none
character(len=*), intent(in) :: line
integer, intent(out) :: n, v1, v2, v3
integer :: idx
! Search for "number#"
idx = index(line, 'number#') + len('number#')
! Get the integer after that word
read(line(idx:idx+3), '(I4)') n
idx = index(line, 'var1') + len('var1=')
read(line(idx:idx+3), '(I4)') v1
idx = index(line, 'var2') + len('var3=')
read(line(idx:idx+3), '(I4)') v2
idx = index(line, 'var3') + len('var3:')
read(line(idx:idx+3), '(I4)') v3
end subroutine get_values
end program mixed2
Please note that I have not included any error/sanity checking. I'll leave that up to you.
回答2:
First up, 720 thousand lines is not too much for pre-processing. Tools like sed
and awk
work mostly on a line-by-line basis, so they scale really well.
What I have actually done was to convert the data in such a way that I could use namelists:
$ cat preprocess.sed
# Add commas between values
# Space followed by letter -> insert comma
s/ \([[:alpha:]]\)/ , \1/g
# "number" is a key word in Fortran, so replace it with num
s/number/num/g
# Replace all possible data delimitors with the equals character
s/[#:]/=/g
# add the '&mydata' namelist descriptor to the beginning
s/^/\&mydata /1
# add the namelist closing "/" character to the end of the line:
s,$,/,1
$ sed -f preprocess.sed < data.dat > data.nml
Check that the data was correctly preprocessed:
$ tail -3 data.dat
number#1997 var1=114 var2=130 var3:127
number#1998 var1=164 var2=192 var3: 86
number#1999 var1=101 var2= 48 var3:120
$ tail -3 data.nml
&mydata num=1997 , var1=114 , var2=130 , var3=127/
&mydata num=1998 , var1=164 , var2=192 , var3= 86/
&mydata num=1999 , var1=101 , var2= 48 , var3=120/
Then you can read it with this fortran program:
program read_mixed
implicit none
integer :: num, var1, var2, var3
integer :: io_stat
namelist /mydata/ num, var1, var2, var3
open(unit=100, file='data.nml', status='old', action='read')
do
read(100, nml=mydata, iostat=io_stat)
if (io_stat /= 0) exit
print *, num, var1, var2, var3
end do
close(100)
end program read_mixed
来源:https://stackoverflow.com/questions/38133083/fortran-read-mixed-text-and-numbers