Question
I would like to generate large random multivariate normal samples (more than 6 dimensions). In R, many functions can do this, such as rmnorm, rmvn... But the problem is the speed! So I tried to write some C++ code through Rcpp. I went through some online tutorials, but there seems to be no "sugar" for multivariate distributions, nor anything in the STL.
Any help is appreciated!
Thanks!
Answer 1:
I'm not sure that Rcpp will help unless you pick a good algorithm for sampling from your multivariate distribution (Cholesky, SVD, etc.) and program it using Eigen (RcppEigen) or Armadillo (RcppArmadillo).
Here is one approach using the Cholesky decomposition and (Rcpp)Armadillo:
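#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

using namespace arma;
using namespace Rcpp;

// Draw n samples from N(0, sigma); each row of the result is one sample.
// The export attribute must sit directly above the function it exports.
// [[Rcpp::export]]
mat mvrnormArma(int n, mat sigma) {
  int ncols = sigma.n_cols;
  mat Y = randn(n, ncols);  // n x ncols matrix of iid N(0, 1) draws
  return Y * chol(sigma);   // chol() gives upper-triangular R with R.t() * R == sigma
}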
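Why multiplying by the Cholesky factor gives the right covariance can be sketched as follows. This assumes the factor R is upper triangular with R^T R = Σ, which is the default convention of both Armadillo's chol() and R's chol(). The symbols z, y, R and I_d are notation introduced here for illustration, not taken from the answer.

```latex
\begin{aligned}
z &\sim \mathcal{N}(0, I_d), \qquad \Sigma = R^\top R, \\
y &= R^\top z \quad \bigl(\text{one row of } Y = Z R \text{ is } y^\top = z^\top R\bigr), \\
\operatorname{Cov}(y) &= R^\top \, \mathbb{E}\!\left[z z^\top\right] R = R^\top I_d \, R = \Sigma .
\end{aligned}
```

Adding a fixed mean vector to every row shifts the distribution without changing this covariance.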
Now a naive implementation in pure R:
mvrnormR <- function(n, sigma) {
  ncols <- ncol(sigma)
  matrix(rnorm(n * ncols), ncol = ncols) %*% chol(sigma)
}
You can also check that everything works:
sigma <- matrix(c(1, 0.9, -0.3, 0.9, 1, -0.4, -0.3, -0.4, 1), ncol = 3)
cor(mvrnormR(100, sigma))
cor(MASS::mvrnorm(100, mu = rep(0, 3), sigma))
cor(mvrnormArma(100, sigma))
Now let's benchmark it:
require(rbenchmark)
benchmark(mvrnormR(1e4, sigma),
MASS::mvrnorm(1e4, mu = rep(0, 3), sigma),
mvrnormArma(1e4, sigma),
columns=c('test', 'replications', 'relative', 'elapsed'))
##                                          test replications relative elapsed
## 2 MASS::mvrnorm(10000, mu = rep(0, 3), sigma)          100    3.135   2.295
## 3                   mvrnormArma(10000, sigma)          100    1.000   0.732
## 1                      mvrnormR(10000, sigma)          100    1.807   1.323
In this example I used a normal distribution with unit variances and zero mean, but you can easily generalize to a Gaussian distribution with a custom mean and covariance.
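A minimal sketch of that generalization, written in plain C++ with only the standard library so that it compiles on its own. It is not the Armadillo code from this answer. The names Matrix, choleskyLower and mvrnormStd are hypothetical, introduced here for illustration. The sketch uses the lower-triangular factor L with sigma = L L^T and computes x = mu + L z for each sample, which is the transposed form of the row-wise product used above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical helper type: a dense matrix stored as rows.
using Matrix = std::vector<std::vector<double>>;

// Lower-triangular Cholesky factor L with sigma = L * L^T.
// Throws if sigma is not (numerically) positive definite.
Matrix choleskyLower(const Matrix& sigma) {
    const std::size_t d = sigma.size();
    Matrix L(d, std::vector<double>(d, 0.0));
    for (std::size_t i = 0; i < d; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double s = sigma[i][j];
            for (std::size_t k = 0; k < j; ++k) s -= L[i][k] * L[j][k];
            if (i == j) {
                if (s <= 0.0) throw std::runtime_error("sigma is not positive definite");
                L[i][i] = std::sqrt(s);
            } else {
                L[i][j] = s / L[j][j];
            }
        }
    }
    return L;
}

// Draw n samples from N(mu, sigma); each row of the result is one sample.
// Assumes sigma is a square d x d matrix, where d = mu.size().
Matrix mvrnormStd(std::size_t n, const std::vector<double>& mu,
                  const Matrix& sigma, std::mt19937& rng) {
    const std::size_t d = mu.size();
    assert(sigma.size() == d);  // precondition: dimensions must agree
    const Matrix L = choleskyLower(sigma);
    std::normal_distribution<double> stdNormal(0.0, 1.0);
    Matrix out(n, std::vector<double>(d, 0.0));
    std::vector<double> z(d);
    for (std::size_t r = 0; r < n; ++r) {
        // z holds d independent N(0, 1) draws for this sample.
        for (std::size_t k = 0; k < d; ++k) z[k] = stdNormal(rng);
        // x = mu + L * z, using only the lower triangle of L.
        for (std::size_t i = 0; i < d; ++i) {
            double x = mu[i];
            for (std::size_t k = 0; k <= i; ++k) x += L[i][k] * z[k];
            out[r][i] = x;
        }
    }
    return out;
}
```

Calling mvrnormStd(10000, mu, sigma, rng) with a seeded std::mt19937 returns 10000 rows of length d. In the Rcpp setting, the same idea amounts to adding the mean vector to every row of Y * chol(sigma).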
Hope this helps
Source: https://stackoverflow.com/questions/15263996/rcpp-how-to-generate-random-multivariate-normal-vector-in-rcpp