Fortran read mixed text and numbers

Question

I am using Fortran 90 to read a file that contains data in the following format:

number# 125 var1= 2 var2= 1 var3: 4
        .
        .
        .
        .
number# 234 var1= 3 var2= 5 var3: 1

I tried the following statement, and it works fine:

read (2,*)  tempstr , my_param(1), tempstr , my_param(2), tempstr , my_param(3)

The problem arises when the numbers become larger and there is no space between the string and the number, i.e. the data looks like this:

number# 125 var1= 2 var2=124 var3: 4

I tried

     read (2,512)  my_param(1), my_param(2), my_param(3)

512 format('number#', i, 'var1=', i, 'var2=', i, 'var3:', i)

It reads all the numbers as zero.

I can't switch to some other language. The data set is huge, so I can't pre-process it. Also, the delimiters are not the same every time. Can someone please help with the problem?

Thanks in advance

Answer 1:

While I still stand by my original answer, particularly because the input data is already so close to what a namelist file would look like, let's assume that you really can't do any preprocessing of the data beforehand.

The next best thing is to read the whole line into a character(len=<enough>) variable, then extract the values from it with string manipulation. Something like this:

program mixed2
    implicit none
    integer :: num, val1, val2, val3
    character(len=50) :: line
    integer :: io_stat

    open(unit=100, file='data.dat', action='READ', status='OLD')
    do
        read(100, '(A)', iostat=io_stat) line
        if (io_stat /= 0) exit
        call get_values(line, num, val1, val2, val3)
        print *, num, val1, val2, val3
    end do
    close(100)

    contains

        subroutine get_values(line, n, v1, v2, v3)
            implicit none
            character(len=*), intent(in) :: line
            integer, intent(out) :: n, v1, v2, v3
            integer :: idx

            ! Search for "number#"
            idx = index(line, 'number#') + len('number#')

            ! Get the integer after that word
            read(line(idx:idx+3), '(I4)') n

            idx = index(line, 'var1') + len('var1=')
            read(line(idx:idx+3), '(I4)') v1

            idx = index(line, 'var2') + len('var2=')
            read(line(idx:idx+3), '(I4)') v2

            idx = index(line, 'var3') + len('var3:')
            read(line(idx:idx+3), '(I4)') v3
        end subroutine get_values
end program mixed2

Please note that I have not included any error/sanity checking. I'll leave that up to you.
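
As one possible way to add that checking, here is a minimal sketch of a variant. It is not the code above, and the helper name get_value and the program name mixed_checked are made up for illustration. It assumes each key is followed by exactly one delimiter character (any of '#', '=' or ':') and that every number is followed by a blank or the end of the line. Instead of a fixed '(I4)' field it uses a list-directed internal read, so the width of the number does not need to be known in advance.

```fortran
program mixed_checked
    implicit none
    ! Sample line taken from the question; in real use it would come from read(unit, '(A)')
    character(len=*), parameter :: sample = 'number# 125 var1= 2 var2=124 var3: 4'
    integer :: num, v1, v2, v3
    logical :: ok

    call get_value(sample, 'number', num, ok)
    if (.not. ok) stop 'could not read the value after "number"'
    call get_value(sample, 'var1', v1, ok)
    if (.not. ok) stop 'could not read the value after "var1"'
    call get_value(sample, 'var2', v2, ok)
    if (.not. ok) stop 'could not read the value after "var2"'
    call get_value(sample, 'var3', v3, ok)
    if (.not. ok) stop 'could not read the value after "var3"'

    print *, num, v1, v2, v3

contains

    ! Hypothetical helper: find key in line, skip the single delimiter
    ! character that follows it, and read the integer after that.
    subroutine get_value(line, key, val, ok)
        character(len=*), intent(in) :: line, key
        integer, intent(out) :: val
        logical, intent(out) :: ok
        integer :: idx, ios

        val = 0
        ok = .false.

        idx = index(line, key)
        if (idx == 0) return              ! key not present in this line

        idx = idx + len(key) + 1          ! assumption: one delimiter character
        if (idx > len(line)) return       ! nothing left after the delimiter

        ! List-directed read: skips leading blanks and stops at the next blank
        read(line(idx:), *, iostat=ios) val
        ok = (ios == 0)
    end subroutine get_value
end program mixed_checked
```

Because the delimiter is skipped by position rather than matched, the same call handles 'var3:' and 'var3=' alike. If a number can run directly into the next key with no blank in between, this sketch would need a different way of finding where the number ends.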

Answer 2:

First up, 720 thousand lines is not too much for pre-processing. Tools like sed and awk work mostly on a line-by-line basis, so they scale really well.

What I actually did was convert the data so that I could read it as namelists:

$ cat preprocess.sed

# Add commas between values
# Space followed by letter -> insert comma
s/ \([[:alpha:]]\)/ , \1/g

# Shorten "number" to "num" (optional: Fortran has no reserved words, so "number" would also be a legal variable name)
s/number/num/g

# Replace all possible data delimiters with the equals character
s/[#:]/=/g

# add the '&mydata' namelist descriptor to the beginning
s/^/\&mydata /1

# add the namelist closing "/" character to the end of the line:
s,$,/,1

$ sed -f preprocess.sed < data.dat > data.nml

Check that the data was correctly preprocessed:

$ tail -3 data.dat
number#1997 var1=114 var2=130 var3:127
number#1998 var1=164 var2=192 var3: 86
number#1999 var1=101 var2= 48 var3:120

$ tail -3 data.nml
&mydata num=1997 , var1=114 , var2=130 , var3=127/
&mydata num=1998 , var1=164 , var2=192 , var3= 86/
&mydata num=1999 , var1=101 , var2= 48 , var3=120/

Then you can read it with this Fortran program:

program read_mixed
    implicit none
    integer :: num, var1, var2, var3
    integer :: io_stat
    namelist /mydata/ num, var1, var2, var3

    open(unit=100, file='data.nml', status='old', action='read')
    do
        read(100, nml=mydata, iostat=io_stat)
        if (io_stat /= 0) exit
        print *, num, var1, var2, var3
    end do
    close(100)
end program read_mixed
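
If running sed beforehand really is not an option, the same rewrite can be sketched inside Fortran itself. This is only an illustration under stated assumptions, not part of the tested solution above: it relies on namelist input from an internal file, which was added in Fortran 2003 (so a strict Fortran 90 compiler may reject it), it assumes every line starts with the word "number", and the program name nml_internal and the buffer name buf are made up. No commas are inserted because blanks are also valid value separators in namelist input.

```fortran
program nml_internal
    implicit none
    ! Sample line taken from the question; in real use it would come from read(unit, '(A)')
    character(len=*), parameter :: line = 'number# 125 var1= 2 var2=124 var3: 4'
    character(len=100) :: buf
    integer :: num, var1, var2, var3
    integer :: i, io_stat
    namelist /mydata/ num, var1, var2, var3

    ! Prefix the group name and swap the leading "number" for "num"
    ! (assumption: the line starts with "number")
    buf = '&mydata num' // line(len('number')+1:)

    ! Turn the '#' and ':' delimiters into '=' so every item reads name=value
    do i = 1, len_trim(buf)
        if (buf(i:i) == '#' .or. buf(i:i) == ':') buf(i:i) = '='
    end do

    ! Close the namelist group
    buf = trim(buf) // ' /'

    ! Namelist read from the character buffer instead of a file unit
    read(buf, nml=mydata, iostat=io_stat)
    if (io_stat /= 0) stop 'namelist read failed'

    print *, num, var1, var2, var3
end program nml_internal
```

For a real file, the body of this program would go inside the same read loop used in the other answer, with the buffer length chosen to fit the longest line plus the added prefix and suffix.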

Source: https://stackoverflow.com/questions/38133083/fortran-read-mixed-text-and-numbers

Tags

fortran

readfile

fortran90