How to adjust an odd behaving Hessian to calculate standard errors with optim

Question

I am using a Kalman filter to estimate various dynamic and arbitrage-free Nelson-Siegel models for yield curves. I give optim some starting values and the algorithm converges just fine. However, when I try to calculate standard errors from the Hessian returned by optim, I get NaNs because of nonpositive values on the diagonal of the variance-covariance matrix. I think this happens because the function is highly nonlinear with many local optima, but it occurs for every set of starting values I try.

I use optim with the default Nelder-Mead algorithm. The command is

opt_para<-optim(par=par0, fn=Kalman_filter, y=y, maturities=maturities,control=list(maxit=20000),hessian=TRUE)

The starting values are given in par0, which is

> par0
 [1] 9.736930e-01 1.046646e+00 5.936238e-01 4.444669e-02 2.889251e-07 6.646960e+00 7.715964e-01 9.945551e-01 9.663361e-01
[10] 6.000000e-01 6.000000e-01 6.000000e-01 6.000000e-02 5.000000e-01 5.000000e-01 5.000000e-01 5.000000e-01

The optim output that I get is

$par
 [1] 0.833208307 1.373442068 0.749313983 0.646577154 0.237102069 6.882644818 0.788775982 0.918378263 0.991982038
[10] 0.748509055 0.005115171 0.392213941 0.717186499 0.121525623 0.386227284 0.001970431 0.845279611

$value
[1] 575.7886

$counts
function gradient 
 5225       NA 

$convergence
[1] 0

$message
NULL

I then use the following commands to produce the standard errors of the estimates.

hessian<-opt_para$hessian
fish_info<-solve(hessian,tol=1e-100)
st_errors<- diag(sqrt(fish_info))
st_errors

I get the following output:

> st_errors
 [1] NaN NaN 2.9170315888 NaN NaN NaN 0.0294300357 0.0373614751 NaN
[10] 0.0785349634 0.0005656580 NaN 0.0470600219 0.0053255251 0.0408666177 0.0001561243 0.4540428740

The NaNs are caused by negative values on the diagonal, which should be impossible in a variance-covariance matrix. I suspect that the optimization procedure itself is not working correctly.
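A quick way to confirm this diagnosis is to look at the eigenvalues of the Hessian before inverting it. The following is only a minimal sketch: check_hessian is a hypothetical helper that is not part of the original post, and H_toy is a made-up indefinite matrix standing in for opt_para$hessian. If any eigenvalue is nonpositive, the point returned by optim is not a well-determined local minimum of the negative log-likelihood (or the numerical Hessian is inaccurate), and negative variances can appear.

```r
# Hypothetical helper (not in the original post): inspect a Hessian
# before using its inverse as a variance-covariance matrix.
check_hessian <- function(H) {
  H <- (H + t(H)) / 2                        # symmetrise away numerical noise
  ev <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
  V <- solve(H)                              # candidate variance-covariance matrix
  se <- suppressWarnings(sqrt(diag(V)))      # NaN wherever a "variance" is negative
  list(eigenvalues = ev,
       positive_definite = all(ev > 0),
       std_errors = se)
}

# Toy indefinite matrix, standing in for opt_para$hessian (an assumption)
H_toy <- matrix(c(2, 0,
                  0, -1), nrow = 2, byrow = TRUE)
res <- check_hessian(H_toy)
res$eigenvalues        # one eigenvalue is negative
res$positive_definite  # FALSE
res$std_errors         # the second entry is NaN
```

Applied to the real problem, the call would be check_hessian(opt_para$hessian); a FALSE in positive_definite points to the Hessian rather than to the square-root step.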

For completeness, here is the function I want to optimize. It is a Kalman filter with updating equations and some restrictions built in.

Kalman_filter<-function(par, y, maturities){

 b0<-c(par[1],par[2],par[3])
 P0<-diag(c(par[4],par[5],par[6]))
 Phi<-diag(c(par[7],par[8],par[9]))
 mu<-c(par[10],par[11],par[12])
 lambda<-par[13]
 sigma11<-par[14]
 sigma21<-par[15]
  sigma22<-par[16]
  sigma33<-par[17]

 m=length(b0)
 n=length(y[,1])
 d<-length(y[1,])





 sigma_eps<-sigma11*diag(d)

 sigma_nu<-diag(c(sigma21^2,sigma22^2,sigma33^2))*(1/12)
 colnames(sigma_nu)<-c("level","slope","Curvat")

X<-matrix(cbind(rep(1,length(maturities)), slope_factor(lambda,maturities), curv_factor(lambda,maturities)),ncol=3)
colnames(X)<-c("level","slope","Curvature")

bt<-matrix(NA, nrow=m, ncol=n+1)

Pt<-array(NA, dim=c(m,m,n+1))

btt<-matrix(NA, nrow=m,ncol=n+1)

Ptt<-array(NA, dim=c(m,m,n+1))

vt<-matrix(NA, nrow=d, ncol=n)

eigen_values<-eigen(Phi,only.values=TRUE)$values
if(eigen_values[1]>=1||eigen_values[2]>=1||eigen_values[3]>=1){
loglike=-70000000
}else{

c<- (diag(3) - Phi)%*% mu

loglike<-0
i<-1
btt[,1]<-b0
Ptt[,,1]<-P0
while(i< n+1){

bt[,i]<- c+ Phi%*% btt[,i]
Pt[,,i] <- Phi%*% tcrossprod(Ptt[,,i],Phi) + sigma_nu

vt[,i]<- y[i,] - X%*% bt[,i]

ft<-X%*% tcrossprod(Pt[,,i], X) + sigma_eps


det_f<-det(ft)

if( is.nan(det_f) || is.na(det_f)|| is.infinite(det_f)){

  loglike<- - 700000000
} else
{
  if(det_f<0){
   loglike <- - 700000000
  } else
  { 
     if (abs(det_f)>1e-20){
      logdet_f<- log(det_f)
      f_inv<- solve(ft, tol=1e-200)
      Kt<- tcrossprod(Pt[,,i],X)%*% f_inv 
      btt[,i+1] <- bt[,i] + Kt%*% vt[,i]
      Ptt[,,i+1] <- (diag(3) - Kt%*% X)%*% Pt[,,i]
      loglike_contr<- -0.5*d*log(2*pi) - 0.5 * logdet_f - 0.5* 
      crossprod(vt[,i],f_inv)%*% vt[,i]

      loglike<-loglike+loglike_contr
    } else
    { loglike<- -700000}
     }

     }

     i<-i+1
     }
     }
     return(-loglike)
     }

Any help would be appreciated.

Answer 1:

I have just solved the problem. I programmed the likelihood function once more so that its only input is the parameter vector, and supplied the estimates from optim. I then applied the hessian function from the numDeriv package to it. This produces viable estimates for the standard errors.
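The approach described in this answer can be sketched roughly as follows. This is not the original poster's code: negloglik and obs are a made-up toy negative log-likelihood and data set standing in for Kalman_filter and the yield data, and the fallback to optimHess from base R when numDeriv is not installed is an addition of this sketch.

```r
# Toy negative log-likelihood (an assumption, standing in for Kalman_filter):
# i.i.d. normal data with unknown mean and log standard deviation.
set.seed(1)
obs <- rnorm(200, mean = 3, sd = 2)
negloglik <- function(par, obs) {
  -sum(dnorm(obs, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Estimate with the default Nelder-Mead method, as in the question
fit <- optim(par = c(0, 0), fn = negloglik, obs = obs)

# Recompute the Hessian at the optimum instead of relying on fit$hessian.
# numDeriv::hessian uses Richardson extrapolation by default; optimHess is
# only a fallback so that the sketch runs without numDeriv installed.
H <- if (requireNamespace("numDeriv", quietly = TRUE)) {
  numDeriv::hessian(negloglik, fit$par, obs = obs)
} else {
  optimHess(fit$par, negloglik, obs = obs)
}

# Standard errors: square roots of the diagonal of the inverse Hessian
st_errors <- sqrt(diag(solve(H)))
st_errors
```

For the model in the question, the equivalent call would presumably be numDeriv::hessian(Kalman_filter, opt_para$par, y = y, maturities = maturities), since extra arguments are passed through to the function.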

Source: https://stackoverflow.com/questions/52538573/how-to-adjust-an-odd-behaving-hessian-to-calculate-standard-errors-with-optim

Tags

optimization

kalman-filter

hessian