I am working with the output from a model in which there are parameter estimates that may not follow a-priori expectations. I would like to write a function that forces thes
1.
Since the objective is quadratic and the constraints linear,
you can use solve.QP
.
It finds the b
that minimizes
(1/2) * t(b) %*% Dmat %*% b - t(dvec) %*% b
under the constraints
t(Amat) %*% b >= bvec.
Here, we want b
that minimizes
sum( (b-betas)^2 ) = sum(b^2) - 2 * sum(b*betas) + sum(beta^2)
= t(b) %*% t(b) - 2 * t(b) %*% betas + sum(beta^2).
Since the last term, sum(beta^2)
, is constant, we can drop it,
and we can set
Dmat = diag(n)
dvec = betas.
The constraints are
b[1] <= b[2]
b[2] <= b[3]
...
b[n-1] <= b[n]
i.e.,
-b[1] + b[2] >= 0
- b[2] + b[3] >= 0
...
- b[n-1] + b[n] >= 0
so that t(Amat)
is
[ -1 1 ]
[ -1 1 ]
[ -1 1 ]
[ ... ]
[ -1 1 ]
and bvec
is zero.
This leads to the following code.
# Sample data
betas <- c(1.2, 1.3, 1.6, 1.4, 2.2)
# Optimization
n <- length(betas)
Dmat <- diag(n)
dvec <- betas
Amat <- matrix(0,nr=n,nc=n-1)
Amat[cbind(1:(n-1), 1:(n-1))] <- -1
Amat[cbind(2:n, 1:(n-1))] <- 1
t(Amat) # Check that it looks as it should
bvec <- rep(0,n-1)
library(quadprog)
r <- solve.QP(Dmat, dvec, Amat, bvec)
# Check the result, graphically
plot(betas)
points(r$solution, pch=16)
2.
You can use constrOptim
in the same way (the objective function can be arbitrary, but the constraints have to be linear).
3.
More generally, you can use optim
if you reparametrize the problem
into a non-constrained optimization problem,
for instance
b[1] = exp(x[1])
b[2] = b[1] + exp(x[2])
...
b[n] = b[n-1] + exp(x[n-1]).
There are a few examples here or there.
Alright, this is starting to take form, but still has some bugs. Based on the conversation in chat with @Joran, it seems I can include a conditional that will set the loss function to an arbitrarily large value if the values are not in order. This seems to work IF the discrepancy occurs between the first two coefficients, but not thereafter. I'm having a hard time parsing out why that would be the case.
Function to minimize:
f <- function(x, x0) {
x1 <- x[1]
x2 <- x[2]
x3 <- x[3]
x4 <- x[4]
x5 <- x[5]
loss <- (x1 - x0[1]) ^ 2 +
(x2 - x0[2]) ^ 2 +
(x3 - x0[3]) ^ 2 +
(x4 - x0[4]) ^ 2 +
(x5 - x0[5]) ^ 2
#Make sure the coefficients are in order
if any(diff(c(x1,x2,x3,x4,x5)) > 0) loss = 10000000
return(loss)
}
Working example (sort of, it seems the loss would be minimized if b0 = 1.24
?):
> betas <- c(1.22, 1.24, 1.18, 1.12, 1.10)
> optim(betas, f, x0 = betas)$par
[1] 1.282 1.240 1.180 1.120 1.100
Non-working example (note that the third element is still larger than the second:
> betas <- c(1.20, 1.15, 1.18, 1.12, 1.10)
> optim(betas, f, x0 = betas)$par
[1] 1.20 1.15 1.18 1.12 1.10