Generate correlated random numbers from binomial distributions

时光取名叫无心 2020-11-30 08:23

I am trying to find a way to generate correlated random numbers from several binomial distributions.

I know how to do it with normal distributions (using MASS:

2 answers
  • 2020-11-30 08:30

    A binomial variable with n trials and probability p of success in each trial can be viewed as the sum of n Bernoulli trials each also having probability p of success.

    Similarly, you can construct pairs of correlated binomial variates by summing up pairs of Bernoulli variates having the desired correlation r.

    library(bindata)  # provides rmvbin() for correlated Bernoulli draws
    
    # Parameters of joint distribution
    size <- 20
    p1 <- 0.5
    p2 <- 0.3
    rho <- 0.2
    
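    # Create one pair of correlated binomial values:
    # each row of `trials` is one pair of correlated Bernoulli outcomes, and
    # (1-rho)*diag(2)+rho is the 2x2 correlation matrix with rho off the diagonal
    trials <- rmvbin(size, c(p1,p2), bincorr=(1-rho)*diag(2)+rho)
    colSums(trials)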
    
    # A function to create n correlated pairs
    rmvBinomial <- function(n, size, p1, p2, rho) {
        X <- replicate(n, {
                 colSums(rmvbin(size, c(p1,p2), bincorr=(1-rho)*diag(2)+rho))
             })
        t(X)
    }
    # Try it out, creating 1000 pairs
    X <- rmvBinomial(1000, size=size, p1=p1, p2=p2, rho=rho)
    cor(X[,1], X[,2])
    # [1] 0.1935928  # (In ~8 trials, sample correlations ranged between 0.15 & 0.25)
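
The same idea can be sketched without bindata, to make the mechanism visible. This is only an illustration under stated assumptions: `rcorr_binom_pairs` is a hypothetical helper name, not part of any package, and it assumes both variables share the same number of trials. It converts rho into the joint probability of a double success, draws the four possible Bernoulli-pair outcomes as multinomial cell counts, and adds up the cells. Because the pairs are independent across trials, the correlation of the sums equals the Bernoulli-level rho.

```r
# Hypothetical helper (not from bindata): draw n pairs of correlated
# Binomial(size, p1) and Binomial(size, p2) counts using base R only.
rcorr_binom_pairs <- function(n, size, p1, p2, rho) {
  # Joint success probability implied by the Bernoulli-level correlation rho
  p11 <- p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))
  # Cell probabilities for the outcomes (1,1), (1,0), (0,1), (0,0)
  cells <- c(p11, p1 - p11, p2 - p11, 1 - p1 - p2 + p11)
  if (any(cells < 0)) stop("rho is not attainable for these p1 and p2")
  # Each column of `counts` says how many of the `size` trials fell in each cell
  counts <- rmultinom(n, size = size, prob = cells)
  cbind(x1 = counts[1, ] + counts[2, ],   # successes on the first variable
        x2 = counts[1, ] + counts[3, ])   # successes on the second variable
}

set.seed(1)
X <- rcorr_binom_pairs(5000, size = 20, p1 = 0.5, p2 = 0.3, rho = 0.2)
cor(X[, 1], X[, 2])  # should be close to 0.2
```

The check on negative cell probabilities matters: not every rho is attainable for a given p1 and p2, and the helper stops rather than silently producing something else.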
    

    It's important to note that there are many different joint distributions that share the desired correlation coefficient. The simulation method in rmvBinomial() produces one of them, but whether or not it's the appropriate one will depend on the process that's generating your data.

    As noted in this R-help answer to a similar question (which then goes on to explain the idea in more detail):

    while a bivariate normal (given means and variances) is uniquely defined by the correlation coefficient, this is not the case for a bivariate binomial
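
A small simulation can make this non-uniqueness concrete. The sketch below is an illustration only, and both constructions are assumptions chosen for the example, not methods from the answer or from R-help. It builds two pairs of Binomial(20, 0.5) variables with the same correlation: construction A sums correlated Bernoulli pairs, while construction B copies the first count with probability rho and otherwise draws independently. The margins and the correlation match, but the chance that the two counts are exactly equal does not.

```r
set.seed(42)
n <- 20000; size <- 20; p <- 0.5; rho <- 0.5

# Construction A: sum `size` correlated Bernoulli pairs (via multinomial cell counts)
p11   <- p * p + rho * p * (1 - p)
cells <- c(p11, p - p11, p - p11, 1 - 2 * p + p11)
cnt   <- rmultinom(n, size = size, prob = cells)
A <- cbind(cnt[1, ] + cnt[2, ], cnt[1, ] + cnt[3, ])

# Construction B: with probability rho copy the first count,
# otherwise draw the second count independently
b1   <- rbinom(n, size, p)
copy <- runif(n) < rho
B <- cbind(b1, ifelse(copy, b1, rbinom(n, size, p)))

# Both sample correlations should be near rho
c(cor_A = cor(A[, 1], A[, 2]), cor_B = cor(B[, 1], B[, 2]))
# Share of exact ties: much larger under B than under A
c(tie_A = mean(A[, 1] == A[, 2]), tie_B = mean(B[, 1] == B[, 2]))
```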

  • 2020-11-30 08:42

    You can generate correlated uniforms using the copula package, then use the qbinom function to convert those to binomial variables. Here is one quick example:

    library(copula)
    
    tmp <- normalCopula( 0.75, dim=2 )
    x <- rCopula(1000, tmp)  # rCopula(n, copula) replaces the older rcopula(copula, n)
    x2 <- cbind( qbinom(x[,1], 10, 0.5), qbinom(x[,2], 15, 0.7) )
    

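    Now x2 is a two-column matrix whose columns are correlated binomial variables, Binomial(10, 0.5) and Binomial(15, 0.7). Note that 0.75 is the parameter of the normal copula, so the Pearson correlation between the two columns will not be exactly 0.75.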
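
To show what the normal copula is doing here, the following base-R sketch reproduces the same steps by hand. It is an illustration under assumptions, not the copula package's implementation: correlated standard normals are built directly, mapped to uniforms with pnorm, and then pushed through qbinom. The seed and sample size are arbitrary choices for the example.

```r
set.seed(123)
n   <- 10000
rho <- 0.75   # correlation of the latent normals, not of the resulting counts

# Two standard normals with correlation rho, built from independent draws
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Probability integral transform: correlated Uniform(0, 1) variates
u <- cbind(pnorm(z1), pnorm(z2))

# Quantile transform to the binomial margins used in the answer
x2 <- cbind(qbinom(u[, 1], 10, 0.5), qbinom(u[, 2], 15, 0.7))

cor(x2[, 1], x2[, 2])   # sample Pearson correlation of the counts
colMeans(x2)            # should be near 10 * 0.5 = 5 and 15 * 0.7 = 10.5
```

If a specific correlation between the counts is required, the copula parameter has to be tuned (for example by trial and error on simulated data) until the sample correlation lands where it is needed.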
