Using awk to interpolate a data column in a data file with date and time

时光取名叫无心 2021-01-17 06:52

The following file has multiple columns: a combined date and time stamp, followed by data columns with missing values, as shown in this simple example file:

# Matrix.txt
13.09.2016:23:44:10;;4.0
13.09.2016         


        
1 answer
  •  轻奢々 (original poster)
    2021-01-17 07:34

    Here is one solution in GNU awk. It reads the data file twice: on the first pass it remembers, for each column, the first and last data points (y1, y2) and their timestamps (x1, x2); before the second pass it computes each column's slope, k=(y2-y1)/(x2-x1); and on the second pass it inter- and extrapolates the empty elements with y=k*(x-x1)+y1.

    It's not foolproof: it doesn't guard against division by zero, doesn't verify that a column has two distinct points to compute a slope from, and performs no other checks.
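
As a minimal sketch of the missing division-by-zero check, the slope formula can be wrapped as follows. The helper name `interp`, its argument order, and the fallback of returning the single known value are assumptions made for illustration; they are not part of the script below. The sample numbers are small stand-ins for epoch timestamps. It uses only plain awk features, so it does not need gawk.

```shell
# interp X1 Y1 X2 Y2 X: hypothetical helper, not part of inexpolator.awk.
# Linear inter-/extrapolation with a guard for the degenerate case x1 == x2.
interp() {
    awk -v x1="$1" -v y1="$2" -v x2="$3" -v y2="$4" -v x="$5" 'BEGIN {
        if (x2 == x1) {                     # only one known point: no slope exists
            printf "%.1f\n", y1             # assumed fallback: the single known value
            exit
        }
        k = (y2 - y1) / (x2 - x1)           # slope between first and last known points
        printf "%.1f\n", k * (x - x1) + y1  # y = k*(x-x1) + y1
    }'
}

interp 10 4.0 40 7.0 20    # prints 5.0
```

The same `x2[i] == x1[i]` test could be placed in front of the slope computation in the script itself; what to print for such a column is a design choice the answer leaves open.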

    $ cat inexpolator.awk
    BEGIN {
        FS=OFS=";"
        ARGC=3; ARGV[2]=ARGV[1]        # run it twice for first file
    }
    BEGINFILE {                        # gawk-only; runs before each file, p is empty the first time
        for(i in p)                    # before the second round: compute the slopes
            k[i]=(y2[i]-y1[i])/(x2[i]-x1[i])
    }
    {
        split($1,a,"[:.]")             # reformat the timestamp
        ts=mktime(a[3] " " a[2] " " a[1] " " a[4] " " a[5] " " a[6])
    }
    NR==FNR {                          # remember first and last points for slopes
        for(i=2;i<=NF;i++) {
            p[i]
            if(y1[i]=="") { y1[i]=$i; x1[i]=ts }
            if($i!="") { y2[i]=$i; x2[i]=ts }
        }
        next                           # only on the first round
    }
    {                                  # reformat ts again for output
        printf "%s", strftime("%d.%m.%Y:%H:%M:%S",ts) OFS  # print ts
        for(i=2;i<=NF;i++) {
            if($i=="") $i=k[i]*(ts-x1[i])+y1[i]            # compute missing points
            printf "%.1f%s", $i, (i<NF?OFS:ORS)            # field, then separator or newline
        }
    }

    Run it:

    $ awk -f inexpolator.awk Matrix.txt
    13.09.2016:23:44:10;0.0;4.0
    13.09.2016:23:44:20;10.0;5.0
    13.09.2016:23:44:30;20.0;6.0
    13.09.2016:23:44:40;30.0;7.0
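
The `ARGC=3; ARGV[2]=ARGV[1]` line in the `BEGIN` block is what makes awk read the same file twice, and `NR==FNR` is true only during the first read. A minimal sketch of that idiom in isolation, where the function name `two_pass` and the sample lines are made up for illustration: the first pass only counts lines, the second prints each line together with the total.

```shell
# two_pass FILE: hypothetical demo of reading one file twice in a single awk run
two_pass() {
    awk '
    BEGIN   { ARGC = 3; ARGV[2] = ARGV[1] }    # queue the first file a second time
    NR==FNR { n++; next }                      # first pass: only count the lines
            { print FNR "/" n ": " $0 }        # second pass: line number / total
    ' "$1"
}

tmp=$(mktemp)
printf 'a\nb\n' > "$tmp"
two_pass "$tmp"
rm -f "$tmp"
```

For the two-line sample file this prints `1/2: a` and `2/2: b`. Passing the file name twice on the command line (`awk -f inexpolator.awk Matrix.txt Matrix.txt`) would have the same effect without touching `ARGV`.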
    
