Plotting an average curve for points in gnuplot

夕颜 2020-12-19 10:03


I am importing a text file in which the first column holds the simulation time (0 to 150) and the second column holds the delay (0.01 to 0.02).

1.000000 0.01         


        
5 answers
  • 2020-12-19 10:03

    For gnuplot >= 5.2, probably the most efficient solution is to use an array, as in @Franky_GT's solution. However, that solution relies on pseudocolumn 0 (see help pseudocolumns). If your data contains empty lines, $0 will be reset to 0, which can corrupt the average.

    This solution instead uses an index t to count the data lines, plus a second array X[] in case a centered moving average is desired. Data points do not have to be equidistant in x. At the beginning there are not yet enough data points for a centered average over N points, so the x-value is taken from every second point and is NaN for the others. That is why set datafile missing NaN is needed to plot a connected line at the beginning.

    Code:

    ### moving average over N points
    reset session
    
    # create some test data
    set print $Data
        y = 0
        do for [i=1:5000] {
            print sprintf("%g %g", i, y=y+rand(0)*2-1) 
        }
    set print
    
    # average over N values
    N = 250
    array Avg[N]
    array X[N]
    
    MovAvg(col) = (Avg[(t-1)%N+1]=column(col), n = t<N ? t : N, t=t+1, (sum [i=1:n] Avg[i])/n)
    MovAvgCenterX(col) = (X[(t-1)%N+1]=column(col), n = t<N ? t%2 ? NaN : (t+1)/2 : ((t+1)-N/2)%N+1, n==n ? X[n] : NaN)   # be aware: gnuplot does integer division here
    
    set datafile missing NaN
    
    plot $Data u 1:2 w l ti "Data", \
         t=1 '' u 1:(MovAvg(2)) w l lc rgb "red" ti sprintf("Moving average over %d",N), \
         t=1 '' u (MovAvgCenterX(1)):(MovAvg(2)) w l lw 2 lc rgb "green" ti sprintf("Moving average centered over %d",N)
    
    ### end of code
    

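    Result: [plot of the data, the moving average (red), and the centered moving average (green); image not included]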

  • 2020-12-19 10:11

    Here is some replacement code for the top answer that also works for 1000+ points and is much faster. It probably only works in gnuplot 5.2 and later, since it uses arrays.

    # number of points in moving average
    n = 5000
    
    array A[n]
    
    samples(x) = $0 > (n-1) ? n : int($0+1)
    mod(x) = int(x) % n
    avg_n(x) = (A[mod($0)+1]=x, (sum [i=1:samples($0)] A[i]) / samples($0))
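
    To make the bookkeeping in avg_n easier to follow, here is a minimal Python sketch of the same ring-buffer idea. It is an illustration only and not part of the gnuplot script: the name ring_buffer_averages is made up, and row stands in for gnuplot's $0.

```python
def ring_buffer_averages(values, n):
    """Trailing average over at most n points, using a fixed-size buffer."""
    buf = [0.0] * n                      # plays the role of the gnuplot array A[n]
    out = []
    for row, x in enumerate(values):     # row stands in for gnuplot's $0
        buf[row % n] = x                 # overwrite the oldest slot, like A[mod($0)+1]=x
        count = min(row + 1, n)          # like samples($0): fewer points at the start
        out.append(sum(buf[:count]) / count)
    return out
```

    Until the buffer has been filled once, only the first count slots hold data, which is why the sum runs over count entries rather than all n.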
    
  • 2020-12-19 10:22

    Edit

    The updated question is about a moving average.

    You can do this in a limited way with gnuplot alone, according to this demo.

    But in my opinion, it would be more flexible to pre-process your data with a programming language such as Python or Ruby and add an extra column for whatever kind of moving average you need.
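
    A minimal sketch of such a pre-processing step in Python, assuming a whitespace-separated file with time in column 1 and delay in column 2. The function names, the window parameter, and the file paths passed to convert are all illustrative choices, not anything from the question.

```python
from collections import deque


def add_trailing_average(rows, window):
    """Append a trailing moving average of the second value to each (t, y) row.

    Early rows are averaged over however many points exist so far.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` values
    out = []
    for t, y in rows:
        recent.append(y)
        out.append((t, y, sum(recent) / len(recent)))
    return out


def convert(in_path, out_path, window):
    """Read a two-column text file and write it back with a third column."""
    rows = []
    with open(in_path) as src:
        for line in src:
            fields = line.split()
            # Skip blank lines, short lines, and comment lines starting with '#'.
            if len(fields) < 2 or fields[0].startswith("#"):
                continue
            rows.append((float(fields[0]), float(fields[1])))
    with open(out_path, "w") as dst:
        for t, y, avg in add_trailing_average(rows, window):
            dst.write(f"{t:g} {y:g} {avg:g}\n")
```

    The output file can then be plotted with using 1:2 for the raw points and using 1:3 for the averaged curve.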

    The original answer is preserved below:


    You can use fit. It seems you want to fit to a constant function. Like this:

    f(x) = c
    
    fit f(x) 'S1_delay_120_LT100_LU15_MU5.txt' using 1:2 every 5 via c
    

    Then you can plot them both.

    plot 'S1_delay_120_LT100_LU15_MU5.txt' using 1:2 every 5, \
    f(x) with lines
    

    Note that this technique can be used with arbitrary functions, not just constant or linear functions.
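
    As a sanity check on what the constant fit returns: assuming an unweighted least-squares fit (no error column given), the best constant c should simply be the arithmetic mean of the fitted points. The Python sketch below illustrates this; fit_constant and sum_sq_residuals are hypothetical helper names, not gnuplot functions.

```python
def sum_sq_residuals(ys, c):
    """Sum of squared residuals between the data and the constant c."""
    return sum((y - c) ** 2 for y in ys)


def fit_constant(ys):
    """Least-squares fit of f(x) = c.

    Setting the derivative of sum((y - c)**2) with respect to c to zero
    gives c = sum(ys) / len(ys), i.e. the arithmetic mean.
    """
    return sum(ys) / len(ys)
```

    So for a constant model the fit yields one overall average rather than a curve that follows the data, which is why the moving-average approaches are the better match for the updated question.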

  • 2020-12-19 10:22

    I wanted to comment on Franky_GT's answer, but somehow Stack Overflow didn't let me.

    However, Franky_GT, your answer works great!

    A note for people plotting .xvg files (e.g. after analyzing MD simulations): if you don't add the following line,

    set datafile commentschars "#@&"
    

    Franky_GT's moving average code will result in this error:

    unknown type in imag()
    

    I hope this is of use to anyone.

  • 2020-12-19 10:25

    Here is a gnuplot only solution with sample data:

    set table "test.data"
    set samples 1000
    plot rand(0)+sin(x)
    unset table
    

    You should check the gnuplot demo page for a running average. I'm going to generalize this demo by building the functions dynamically, which makes it much easier to change the number of points included in the average.

    This is the script:

    # number of points in moving average
    n = 50
    
    # initialize the variables
    do for [i=1:n] {
        eval(sprintf("back%d=0", i))
    }
    
    # build shift function (back_n = back_n-1, ..., back1=x)
    shift = "("
    do for [i=n:2:-1] {
        shift = sprintf("%sback%d = back%d, ", shift, i, i-1)
    } 
    shift = shift."back1 = x)"
    # uncomment the next line for a check
    # print shift
    
    # build sum function (back1 + ... + backn)
    sum = "(back1"
    do for [i=2:n] {
        sum = sprintf("%s+back%d", sum, i)
    }
    sum = sum.")"
    # uncomment the next line for a check
    # print sum
    
    # define the functions like in the gnuplot demo
    # use macro expansion for turning the strings into real functions
    samples(x) = $0 > (n-1) ? n : ($0+1)
    avg_n(x) = (shift_n(x), @sum/samples($0))
    shift_n(x) = @shift
    
    # the final plot command looks quite simple
    set terminal pngcairo
    set output "moving_average.png"
    plot "test.data" using 1:2 w l notitle, \
         "test.data" using 1:(avg_n($2)) w l lc rgb "red" lw 3 title "avg\\_".n
    

    This is the result: [plot of test.data with the moving average in red; image not included]

    The average lags quite a bit behind the data points, as expected from the algorithm. Maybe 50 points are too many. Alternatively, one could implement a centered moving average, but that is beyond the scope of this question. I also think that you are more flexible with an external program :)
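
    For reference, a centered moving average is short to write in an external program. The following Python sketch is one possible version, assuming an odd window size and a window that simply shrinks near both ends of the data; the function name centered_moving_average is made up for this example.

```python
def centered_moving_average(ys, window):
    """Average each point with its neighbours on both sides.

    `window` is assumed to be odd. Near the ends, only the neighbours
    that actually exist are used, so the window shrinks there.
    """
    half = window // 2
    out = []
    for i in range(len(ys)):
        lo = max(0, i - half)                 # first index inside the window
        hi = min(len(ys), i + half + 1)       # one past the last index
        out.append(sum(ys[lo:hi]) / (hi - lo))
    return out
```

    Because each average is centered on its own point, the smoothed curve no longer lags behind the data the way the trailing average does.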
