Stan version of a JAGS model which includes a sum of discrete values - Is it possible?

问题

I was trying to run this model in Stan. I have a running JAGS version of it (that returns highly autocorrelated parameters) and I know how to formulate it as CDF of a double exponential (with two rates), which would probably run without problems. However, I would like to use this version as a starting point for similar but more complex models.

By now I have the suspicion that a model like this is not possible in Stan. Maybe because of the discreteness introduces by taking the sum of a Boolean value, Stan may not be able to calculate gradients.

Does anyone know whether this is the case, or if I do something else in a wrong way in this model? I paste the errors I get below the model code.

Many thanks in advance Jan

Model:

data {
    int y[11]; 
    int reps[11];
    real soas[11]; 

}
parameters {
    real<lower=0.001,upper=0.200> v1;
    real<lower=0.001,upper=0.200> v2;

}


model {
    real dif[11,96];
    real cf[11];

    real p[11];

    real t1[11,96];
    real t2[11,96];

    for (i in 1:11){
        for (r in 1:reps[i]){     
            t1[i,r]  ~ exponential(v1);
            t2[i,r]  ~ exponential(v2);
            dif[i,r] <-  (t1[i,r]+soas[i]<=(t2[i,r]));

            }
        cf[i] <- sum(dif[i]);
        p[i]  <-cf[i]/reps[i];
        y[i] ~ binomial(reps[i],p[i]); 
    }

}

Here is some dummy data:

psy_dat = { 
         'soas' :  numpy.array(range(-100,101,20)),
            'y' :  [47, 46, 62, 50, 59, 47, 36, 13, 7, 2, 1],
         'reps' :  [48, 48, 64, 64, 92, 92, 92, 64, 64, 48, 48]
          }

And here are the errors:

DIAGNOSTIC(S) FROM PARSER:
Warning (non-fatal): Left-hand side of sampling statement (~) contains a non-linear transform of a parameter or local variable.
You must call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Sampling Statement left-hand-side expression:
get_base1(get_base1(t1,i,"t1",1),r,"t1",2) ~ exponential_log(...)
Warning (non-fatal): Left-hand side of sampling statement (~) contains a non-linear transform of a parameter or local variable.
You must call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Sampling Statement left-hand-side expression:
get_base1(get_base1(t2,i,"t2",1),r,"t2",2) ~ exponential_log(...)

And at runtime:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
 stan::prob::exponential_log(N4stan5agrad3varE): Random variable is nan:0, but must not be nan!
 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
 Rejecting proposed initial value with zero density.


Initialization between (-2, 2) failed after 100 attempts. 
 Try specifying initial values, reducing ranges of constrained values, or   reparameterizing the model

Here is a working JAGS version of this model:

   model {
   for ( n in 1 : N  ) { 
     for (r in 1 : reps[n]){
       t1[r,n] ~ dexp(v1)
       t2[r,n] ~ dexp(v2)
       c[r,n] <- (1.0*((t1[r,n]+durs[n])<=t2[r,n]))
     } 
     p[n] <- max((min(sum(c[,n]) /  (reps[n]),0.99999999999999)),   1-0.99999999999999)) 
     y[n] ~ dbin(p[n],reps[n])
   }

   v1 ~ dunif(0.0001,0.2)
   v2 ~ dunif(0.0001,0.2)
   }

With regard to the min() and max(): See this post https://stats.stackexchange.com/questions/130978/observed-node-inconsistent-when-binomial-success-rate-exactly-one?noredirect=1#comment250046_130978.

回答1:

I'm still not sure what model you are trying to estimate (it would be best if you post the JAGS code) but what you have above cannot work in Stan. Stan is closer to C++ in the sense that you have to declare and then define objects. In your Stan program, you have the two declarations real t1[11,96]; real t2[11,96]; but no definitions of t1 or t2. Consequently, they are initalized to NaN and when you do t1[i,r] ~ exponential(v1); that gets parsed as something like for(i in 1:11) for(r in 1:reps[i]) lp__ += log(v1) - v1 * NaN where lp__ is an internal symbol that holds value of the log-posterior, which becomes NaN and it cannot do Metropolis-style updates of the parameters.

Perhaps you meant for t1 and t2 to be unknown parameters, in which case they should be declared in the parameters block. The following [EDITED] Stan program is valid and should work, but it might not be the program you had in mind (it does not make a lot of sense to me and the discontinuity in dif will probably preclude Stan from sampling from this posterior distribution efficiently). data { int<lower=1> N; int y[N]; int reps[N]; real soas[N]; } parameters { real<lower=0.001,upper=0.200> v1; real<lower=0.001,upper=0.200> v2; real t1[N,max(reps)]; real t2[N,max(reps)]; } model { for (i in 1:N) { real dif[reps[i]]; for (r in 1:reps[i]) { dif[r] <- (t1[i,r]+soas[i]) <= t2[i,r]; } y[i] ~ binomial(reps[i], (1.0 + sum(dif)) / (1.0 + reps[i])); } to_array_1d(t1) ~ exponential(v1); to_array_1d(t2) ~ exponential(v2); }

来源：https://stackoverflow.com/questions/27963759/stan-version-of-a-jags-model-which-includes-a-sum-of-discrete-values-is-it-pos

标签

bayesian

mcmc

stan