The smooth.spline function in R allows a tradeoff between roughness (as defined by the integrated square of the second derivative) and fitting the points (as defined by summing the squares of the residuals). This tradeoff is accomplished by the spar or df parameter. At one extreme you get the least squares line, and the other you get a very wiggly curve which intersects all of the data points (or the mean if you have duplicated x values with different y values)
I have looked at scipy.interpolate.UnivariateSpline and other spline variants in Python, however, they seem to only tradeoff by increasing the number of knots, and setting a threshold (called s) for the allowed SS residuals. By contrast, the smooth.spline in R allows having knots at all the x values, without necessarily having a wiggly curve that hits all the points -- the penalty comes from the second derivative.
Does Python have a spline fitting mechanism that behaves in this way? Allowing all knots but penalizing the second derivative?
I've been looking for exactly the same thing, but would rather not have to translate the code to Python. The Splinter package seems like an option, however: https://github.com/bgrimstad/splinter
From research on google, I concluded that
By contrast, the smooth.spline in R allows having knots at all the x values, without necessarily having a wiggly curve that hits all the points -- the penalty comes from the second derivative.
You can use R functions in Python with rpy2
:
import rpy2.robjects as robjects
r_y = robjects.FloatVector(y_train)
r_x = robjects.FloatVector(x_train)
r_smooth_spline = robjects.r['smooth.spline'] #extract R function# run smoothing function
spline1 = r_smooth_spline(x=r_x, y=r_y, spar=0.7)
ySpline=np.array(robjects.r['predict'](spline1,robjects.FloatVector(x_smooth)).rx2('y'))
plt.plot(x_smooth,ySpline)
If you want to directly set lambda
: spline1 = r_smooth_spline(x=r_x, y=r_y, lambda=42)
doesn't work, because lambda
has already another meaning in Python, but there is a solution: How to use the lambda argument of smooth.spline in RPy WITHOUT Python interprating it as lambda.
To get the code running you first need to define the data x_train
and y_train
and you can define x_smooth=np.array(np.linspace(-3,5,1920)).
if you want to plot it between -3 and 5 in Full-HD-resolution.
来源:https://stackoverflow.com/questions/29312005/is-there-a-python-equivalent-to-the-smooth-spline-function-in-r