Plotting Average curve for points in gnuplot

青春壹個敷衍的年華 submitted on 2020-06-21 20:42:50

Question


[Current]

I am importing a text file in which the first column holds the simulation time (0 to 150) and the second column holds the delay (0.01 to 0.02).

1.000000 0.010007
1.000000 0.010010
2.000000 0.010013
2.000000 0.010016
.
.
.
149.000000 0.010045
149.000000 0.010048
150.000000 0.010052
150.000000 0.010055

which gives me the plot:


[Desired]

I need to plot an average line on it, like the red line in the following image:


Answer 1:


Here is a gnuplot-only solution. First, generate some sample data:

set table "test.data"
set samples 1000
plot rand(0)+sin(x)
unset table

You should check the gnuplot demo page for a running average. I'm going to generalize this demo by building the functions dynamically, which makes it much easier to change the number of points included in the average.

This is the script:

# number of points in moving average
n = 50

# initialize the variables
do for [i=1:n] {
    eval(sprintf("back%d=0", i))
}

# build shift function (back_n = back_n-1, ..., back1=x)
shift = "("
do for [i=n:2:-1] {
    shift = sprintf("%sback%d = back%d, ", shift, i, i-1)
} 
shift = shift."back1 = x)"
# uncomment the next line for a check
# print shift

# build sum function (back1 + ... + backn)
sum = "(back1"
do for [i=2:n] {
    sum = sprintf("%s+back%d", sum, i)
}
sum = sum.")"
# uncomment the next line for a check
# print sum

# define the functions like in the gnuplot demo
# use macro expansion for turning the strings into real functions
samples(x) = $0 > (n-1) ? n : ($0+1)
avg_n(x) = (shift_n(x), @sum/samples($0))
shift_n(x) = @shift

# the final plot command looks quite simple
set terminal pngcairo
set output "moving_average.png"
plot "test.data" using 1:2 w l notitle, \
     "test.data" using 1:(avg_n($2)) w l lc rgb "red" lw 3 title "avg\\_".n

This is the result:

The average lags quite a bit behind the data points, as expected from the algorithm. Maybe 50 points are too many. Alternatively, one could implement a centered moving average, but that is beyond the scope of this question. I also think an external program gives you more flexibility :)
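
To make the lag concrete, here is a minimal Python sketch (not part of the original answer) that applies a trailing average to a straight line. The function name `trailing_average` and the ramp data are assumptions chosen for illustration; the growing window at the start mirrors what `samples(x)` does in the script above.

```python
def trailing_average(values, n):
    """Mean of the last up-to-n values at each position (shorter window at the start)."""
    out = []
    window_sum = 0.0
    for i, v in enumerate(values):
        window_sum += v
        if i >= n:
            window_sum -= values[i - n]  # drop the value that left the window
        out.append(window_sum / min(i + 1, n))
    return out


n = 50
ramp = [float(i) for i in range(200)]  # hypothetical data: y = i, a straight line
avg = trailing_average(ramp, n)

# Once the window is full, the average of a ramp trails the data by (n - 1) / 2 samples.
lag = ramp[150] - avg[150]
print(lag)  # 24.5
```

For n = 50 the curve sits about 24.5 samples behind a linear trend, so a smaller n or a centered window reduces the visible offset.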




Answer 2:


Edit

The updated question is about a moving average.

You can do this in a limited way with gnuplot alone, according to this demo.

But in my opinion, it would be more flexible to pre-process your data using a programming language like Python or Ruby and add an extra column for whatever kind of moving average you require.
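
As a rough sketch of that pre-processing step, the following Python reads two-column lines and emits a third column with a trailing moving average. The function name `add_moving_average`, the window size, and the inline sample are assumptions for illustration; a real script would read the data file instead.

```python
from collections import deque


def add_moving_average(lines, n):
    """Yield 'x y avg' lines, where avg is the mean of the last up-to-n y values."""
    window = deque(maxlen=n)  # old values fall out automatically
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines, comments, and stray markers such as "."
        x, y = float(fields[0]), float(fields[1])
        window.append(y)
        yield f"{x:.6f} {y:.6f} {sum(window) / len(window):.6f}"


# A few rows in the format shown in the question.
sample = """\
1.000000 0.010007
1.000000 0.010010
2.000000 0.010013
2.000000 0.010016
"""
for row in add_moving_average(sample.splitlines(), n=2):
    print(row)
```

With the output saved to a file, gnuplot could plot the raw data with `using 1:2` and the average with `using 1:3 with lines`.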

The original answer is preserved below:


You can use fit. It seems you want to fit to a constant function. Like this:

f(x) = c

fit f(x) 'S1_delay_120_LT100_LU15_MU5.txt' using 1:2 every 5 via c

Then you can plot them both.

plot 'S1_delay_120_LT100_LU15_MU5.txt' using 1:2 every 5, \
f(x) with lines

Note that this technique can be used with arbitrary functions, not just constant or linear functions.
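
For the constant case specifically, an unweighted least-squares fit of f(x) = c should land on the arithmetic mean of the y values. The Python sketch below checks that on four delay values from the question. It assumes plain, unweighted least squares; `fit_constant` and `squared_error` are hypothetical helper names, not gnuplot functions.

```python
def fit_constant(ys):
    """Least-squares fit of f(x) = c: the minimiser of sum((y - c)**2) is the mean."""
    return sum(ys) / len(ys)


def squared_error(ys, c):
    """Sum of squared residuals for the constant model f(x) = c."""
    return sum((y - c) ** 2 for y in ys)


delays = [0.010007, 0.010010, 0.010013, 0.010016]  # values from the question
c = fit_constant(delays)

# Nudging c in either direction can only increase the squared error.
assert squared_error(delays, c) <= squared_error(delays, c + 1e-6)
assert squared_error(delays, c) <= squared_error(delays, c - 1e-6)
print(c)
```

So the fitted constant gives one horizontal line for the whole data set, which is a global average rather than a moving one.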




Answer 3:


Here is replacement code for the top answer that also works for 1000+ points and is much faster. It requires gnuplot 5.2 or later, because it uses arrays.

# number of points in moving average
n = 5000

array A[n]

samples(x) = $0 > (n-1) ? n : int($0+1)
mod(x) = int(x) % n
avg_n(x) = (A[mod($0)+1]=x, (sum [i=1:samples($0)] A[i]) / samples($0))
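
The array trick above is a circular (ring) buffer: each new value overwrites the oldest slot, and the average is taken over the slots filled so far. A minimal Python sketch of the same logic follows; the class name `RingAverage` is an assumption for illustration, and `count` plays the role of `$0`.

```python
class RingAverage:
    """Moving average over the last n values using a fixed-size circular buffer."""

    def __init__(self, n):
        self.n = n
        self.buf = [0.0] * n
        self.count = 0  # number of values seen so far (plays the role of $0)

    def push(self, value):
        self.buf[self.count % self.n] = value  # overwrite the oldest slot
        self.count += 1
        filled = min(self.count, self.n)  # fewer than n values at the start
        return sum(self.buf[:filled]) / filled


ra = RingAverage(3)
print([ra.push(v) for v in [3.0, 6.0, 9.0, 12.0]])  # [3.0, 4.5, 6.0, 9.0]
```

Only one slot is written per data point, so nothing has to be shifted, which is where the speed-up over the variable-shifting version comes from.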



Answer 4:


I wanted to comment on Franky_GT's answer, but somehow Stack Overflow didn't let me.

However, Franky_GT, your answer works great!

A note for people plotting .xvg files (e.g. after analyzing MD simulations): if you don't add the following line,

set datafile commentschars "#@&"

Franky_GT's moving average code will result in this error:

unknown type in imag()

I hope this is of use to someone.




Answer 5:


For gnuplot >= 5.2, probably the most efficient solution is to use an array, as in @Franky_GT's solution. However, that solution relies on the pseudocolumn 0 (see help pseudocolumns). If your data contains empty lines, $0 is reset to 0, which can end up corrupting the average.

This solution uses an index t to count the data lines and a second array X[] in case a centered moving average is desired. Data points don't have to be equidistant in x. At the beginning there are not yet enough data points for a centered average over N points, so the x-value is taken from every second point and the others are NaN. That is why set datafile missing NaN is needed to plot a connected line at the beginning.
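
As a rough illustration of the centering idea outside gnuplot, the Python sketch below averages each full window of n y-values and attaches the result to the x-value from the middle of that window. It is a simplification: it only emits points once the window is full and skips the warm-up handling described above. The function name `centered_moving_average` and the sample data are assumptions.

```python
def centered_moving_average(xs, ys, n):
    """Average each full window of n points and attach it to the window's middle x."""
    points = []
    for end in range(n, len(ys) + 1):
        window = ys[end - n:end]
        mid = end - n + (n - 1) // 2  # index of the (lower) middle point
        points.append((xs[mid], sum(window) / n))
    return points


xs = [1.0, 2.0, 4.0, 7.0, 8.0]  # x spacing need not be uniform
ys = [10.0, 20.0, 30.0, 40.0, 50.0]
print(centered_moving_average(xs, ys, 3))
# [(2.0, 20.0), (4.0, 30.0), (7.0, 40.0)]
```

Taking x from the stored values rather than computing it is what lets this work for unevenly spaced data; the gnuplot script below keeps those x-values in the X[] array.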

Code:

### moving average over N points
reset session

# create some test data
set print $Data
    y = 0
    do for [i=1:5000] {
        print sprintf("%g %g", i, y=y+rand(0)*2-1) 
    }
set print

# average over N values
N = 250
array Avg[N]
array X[N]

MovAvg(col) = (Avg[(t-1)%N+1]=column(col), n = t<N ? t : N, t=t+1, (sum [i=1:n] Avg[i])/n)
MovAvgCenterX(col) = (X[(t-1)%N+1]=column(col), n = t<N ? t%2 ? NaN : (t+1)/2 : ((t+1)-N/2)%N+1, n==n ? X[n] : NaN)   # be aware: gnuplot does integer division here

set datafile missing NaN

plot $Data u 1:2 w l ti "Data", \
     t=1 '' u 1:(MovAvg(2)) w l lc rgb "red" ti sprintf("Moving average over %d",N), \
     t=1 '' u (MovAvgCenterX(1)):(MovAvg(2)) w l lw 2 lc rgb "green" ti sprintf("Moving average centered over %d",N)

### end of code

Result:



Source: https://stackoverflow.com/questions/42855285/plotting-average-curve-for-points-in-gnuplot