问题
I'm trying rnbinom
as below
x<- rnbinom(500, mu = 4, size = .1)
xtrunc <- x[x>0]
then I just get 125 observations.
However, I want to make 500 observations excluding 0 (zero) with same condition (mu = 4, size =.1
).
回答1:
This does the job:
N <- 500 ## target number of samples
## set seed for reproducibility
set.seed(0)
## first try
x <- rnbinom(N, mu = 4, size = .1)
p_accept <- mean(success <- x > 0) ## estimated probability of accepting samples
xtrunc <- x[success]
## estimated further sampling times
n_further <- (N - length(xtrunc)) / p_accept
## further sampling
alpha <- 1.5 ## inflation factor
x_further <- rnbinom(round(alpha * n_further), mu = 4, size = .1)
## filter and combine
xtrunc <- c(xtrunc, (x_further[x_further > 0])[seq_len(N - length(xtrunc))])
## checking
length(xtrunc)
# [1] 500
summary(xtrunc)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.00 2.00 5.00 12.99 16.00 131.00
In above, sampling takes two stages. The result of the initial stage is used to estimate probability of acceptance rate to guide the second stage sampling.
However, since the underlying distribution is explicitly known, the theoretical probability of acceptance rate is know. Therefore, there is no need to perform a two-stage approach in this case. Try:
p <- 1 - pnbinom(0, mu = 4, size = .1) ## theoretical probability
alpha <- 1.5
n_try <- round(alpha * N / p)
set.seed(0)
x <- rnbinom(n_try, mu = 4, size = .1)
xtrunc <- (x[x > 0])[1:N]
## checking
length(xtrunc)
# [1] 500
summary(xtrunc)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.00 2.00 5.00 12.99 16.00 131.00
The idea behind is the theory of geometric distribution. My answer here is closely related. Read the "More efficient vectorized method" section for detailed explanation.
来源:https://stackoverflow.com/questions/39051760/how-to-generate-target-number-of-samples-from-a-distribution-under-a-rejection-c