I'm trying to fit a Boltzmann sigmoid 1/(1+exp((x-p1)/p2))
to this small experimental dataset:
xdata <- c(-60,-50,-40,-30,-20,-10,-0,10)
ydata <- c(0.04, 0.09, 0.38, 0.63, 0.79, 1, 0.83, 0.56)
I know that it is pretty simple to do. For example, using nls:
fit <-nls(ydata ~ 1/(1+exp((xdata-p1)/p2)),start=list(p1=mean(xdata),p2=-5))
I get the following results:
Formula: ydata ~ 1/(1 + exp((xdata - p1)/p2))
Parameters:
   Estimate Std. Error t value Pr(>|t|)
p1  -33.671      4.755  -7.081 0.000398 ***
p2  -10.336      4.312  -2.397 0.053490 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1904 on 6 degrees of freedom
Number of iterations to convergence: 13
Achieved convergence tolerance: 7.079e-06
However, I need (for theoretical reasons) the fitted curve to pass precisely through the point (-70, 0). Although the fitted expression shown above comes close to zero at x = -70, it is not exactly zero, which is not what I want.
So, the question is: Is there a way to tell nls (or some other function) to fit the same expression but force it to pass through a specified point?
Update:
As mentioned in the comments, it is mathematically impossible to force the fit through the point (-70, 0) with the function I provided (the Boltzmann sigmoid), because its value lies strictly between 0 and 1 for any finite x. On the other hand, @Cleb and @BenBolker have explained how to force the fit through another point, for instance (-50, 0.09).
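To see why (-70, 0) is out of reach, it helps to evaluate the fitted expression at that x. The sketch below plugs the estimates reported in the summary above (p1 = -33.671, p2 = -10.336) into the Boltzmann sigmoid; the helper name `boltzmann` is only illustrative and not part of the original code.

```r
# Boltzmann sigmoid as written in the question (helper name is illustrative)
boltzmann <- function(x, p1, p2) 1 / (1 + exp((x - p1) / p2))

# Estimates copied from the nls summary above
y_at_minus70 <- boltzmann(-70, p1 = -33.671, p2 = -10.336)
print(y_at_minus70)      # small (roughly 0.03), but strictly positive

# exp() of a finite argument is finite and positive, so the sigmoid
# can approach 0 but never reach it (up to floating-point underflow)
print(y_at_minus70 > 0)
```

Changing p1 or p2 moves this value closer to zero, but no finite parameter pair makes it exactly zero.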
Building on @Cleb's answer, here's a way to pick a specified point the function must pass through and solve the resulting equation for one of the parameters:
dd <- data.frame(x=c(-60,-50,-40,-30,-20,-10,-0,10),
                 y=c(0.04, 0.09, 0.38, 0.63, 0.79, 1, 0.83, 0.56))
Initial fit (using plogis() rather than 1/(1+exp(-...)) for convenience):
fit <- nls(y ~ plogis(-(x-p1)/p2),
           data=dd,
           start=list(p1=mean(dd$x),p2=-5))
Now plug in the third data point (x3,y3) and solve for p2:
y3 = 1/(1+exp((x3-p1)/p2))
logit(y) = qlogis(y) = log(y/(1-y))
e.g. plogis(2)=0.88 -> qlogis(0.88)=2
qlogis(y3) = -(x3-p1)/p2
p2 = -(x3-p1)/qlogis(y3)
Set up a function and plug it in for p2:
p2 <- function(p1,x,y) {
  -(x-p1)/qlogis(y)
}
fit2 <- nls(y ~ plogis(-(x-p1)/p2(p1,dd$x[3],dd$y[3])),
            data=dd,
            start=list(p1=mean(dd$x)))
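As a quick sanity check of the substitution, the sketch below evaluates the one-parameter model at the constraint point for several values of p1. The helper `constrained` and the vector `p1_values` are hypothetical names introduced only for this check. Because p2 is recomputed from p1 each time, the curve should return y3 at x3 whatever p1 is, which is why nls only has to estimate p1.

```r
x3 <- -40; y3 <- 0.38          # third data point from dd

# p2 implied by the constraint, as derived above
p2 <- function(p1, x, y) -(x - p1) / qlogis(y)

# One-parameter model: p2 is eliminated via the constraint (hypothetical helper)
constrained <- function(x, p1) plogis(-(x - p1) / p2(p1, x3, y3))

# Evaluate the model at x3 for a range of arbitrary p1 values
p1_values <- c(-60, -33, -25, -10, 5)
at_point <- sapply(p1_values, function(p1) constrained(x3, p1))
print(at_point)
print(all.equal(at_point, rep(y3, length(p1_values))))
```

The one value to avoid is p1 = x3 itself, since the implied p2 is then zero and the model divides by zero.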
Plot the results:
plot(y~x,data=dd,ylim=c(0,1.1))
xr <- data.frame(x = seq(min(dd$x),max(dd$x),len=200))
lines(xr$x,predict(fit,newdata=xr))
lines(xr$x,predict(fit2,newdata=xr),col=2)
It is not possible to force the fit to go through 0 using the function you provide (without an offset), as we discussed in the comments below your question.
However, you can force the curve to go through other data points by setting weights for individual data points. For example, if you give data point A a weight of 1 and data point B a weight of 1000, B contributes far more to the weighted sum of squared residuals that is minimized, so the fit is pulled to pass (practically) through B.
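To make the effect of the weights concrete: when a weights vector is supplied, nls minimizes the weighted sum of squared residuals, sum(w * (y - f(x))^2). The sketch below evaluates each point's share of that objective at the unweighted estimates from the question (p1 = -33.671, p2 = -10.336). The helper names `boltzmann` and `objective_share` are illustrative, not part of the original code.

```r
xdata <- c(-60, -50, -40, -30, -20, -10, 0, 10)
ydata <- c(0.04, 0.09, 0.38, 0.63, 0.79, 1, 0.83, 0.56)

boltzmann <- function(x, p1, p2) 1 / (1 + exp((x - p1) / p2))

# Squared residuals at the unweighted estimates reported in the question
sq_res <- (ydata - boltzmann(xdata, p1 = -33.671, p2 = -10.336))^2

# Share of the weighted objective contributed by each point
objective_share <- function(w) w * sq_res / sum(w * sq_res)

w_equal <- rep(1, length(xdata))
w_heavy <- w_equal
w_heavy[3] <- 1000             # emphasize the third point, (-40, 0.38)

share_equal <- objective_share(w_equal)[3]
share_heavy <- objective_share(w_heavy)[3]
print(round(share_equal, 3))   # third point barely matters
print(round(share_heavy, 3))   # third point dominates the objective
```

With the large weight, reducing the residual at that one point is by far the cheapest way for the optimizer to lower the objective, which is what drags the curve toward it.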
Here is the entire code and I explain it in more detail below:
# your data
xdata <- c(-60, -50, -40, -30, -20, -10, -0, 10)
ydata <- c(0.04, 0.09, 0.38, 0.63, 0.79, 1, 0.83, 0.56)
plot(xdata, ydata, ylim=c(0, 1.1))
fit <-nls(ydata ~ 1 / (1 + exp((xdata - p1) / p2)), start=list(p1=mean(xdata), p2=-5))
# plot the fit
xr = data.frame(xdata = seq(min(xdata), max(xdata), len=200))
lines(xr$xdata, predict(fit, newdata=xr))
# set all weights to 1, do the fit again; the plot looks identical to the previous one
we = rep(1, length(xdata))
fit2 = nls(ydata ~ 1 / (1 + exp((xdata - p1) / p2)), weights=we, start=list(p1=mean(xdata) ,p2=-5))
lines(xr$xdata, predict(fit2, newdata=xr), col='blue')
# set a large weight for the third data point (-40, 0.38), and fit again
we[3] = 1000
fit3 = nls(ydata ~ 1 / (1 + exp((xdata - p1) / p2)), weights=we, start=list(p1=mean(xdata), p2=-5))
lines(xr$xdata, predict(fit3, newdata=xr), col='red')
legend('topleft', c('fit without weights', 'fit with weights 1', 'weighted fit for -40,0.38'),
       lty=c(1, 1, 1),
       lwd=c(2.5, 2.5, 2.5),
       col=c('black', 'blue', 'red'))
The output looks as follows; as you can see the fit now goes through the desired data point (red line):
So what is going on: I first fit as you did, then I fit again with all weights set to 1; the plot therefore looks identical to the previous one and the blue line hides the black line. Then, for fit3, I change the weight of the third data point to 1000, which makes it much more "important" for the least-squares fit than the other points, so the new fit goes through this data point (the red line).
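A large weight pulls the curve toward the point rather than imposing an exact constraint, so it can be worth checking how close the weighted fit actually gets. The sketch below repeats the unweighted and the heavily weighted fit and compares the residual at the third data point. The names `boltzmann`, `fit_plain`, `fit_heavy` and `res_at_3` are illustrative, and the exact numbers depend on the weight chosen.

```r
xdata <- c(-60, -50, -40, -30, -20, -10, 0, 10)
ydata <- c(0.04, 0.09, 0.38, 0.63, 0.79, 1, 0.83, 0.56)
boltzmann <- function(x, p1, p2) 1 / (1 + exp((x - p1) / p2))

start <- list(p1 = mean(xdata), p2 = -5)

# Unweighted fit, as in the question
fit_plain <- nls(ydata ~ boltzmann(xdata, p1, p2), start = start)

# Weighted fit with a large weight on the third point
we <- rep(1, length(xdata))
we[3] <- 1000
fit_heavy <- nls(ydata ~ boltzmann(xdata, p1, p2), weights = we, start = start)

# Unweighted residual at (-40, 0.38), computed directly from the coefficients
res_at_3 <- function(fit) {
  cf <- coef(fit)
  ydata[3] - boltzmann(xdata[3], cf[["p1"]], cf[["p2"]])
}
print(res_at_3(fit_plain))
print(res_at_3(fit_heavy))   # closer to zero, but not exactly zero in general
```

If an exact pass-through is required, eliminating a parameter via the constraint is the safer route; increasing the weight only shrinks the residual further.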
Here is also a second example where I changed the line
we[3] = 1000
to
we[2] = 1000
which forces the fit to go through the second data point:
If you want more information about the weights argument, see the documentation for nls (?nls in R).
Source: https://stackoverflow.com/questions/31837610/forcing-nls-to-fit-a-curve-passing-through-a-specified-point