Question
I have a variable with a given distribution (normal in the example below).
set.seed(32)
var1 = rnorm(100,mean=0,sd=1)
I want to create a variable (var2) that is correlated with var1, with a linear correlation coefficient (roughly or exactly) equal to "Corr". The slope of the regression between var1 and var2 should (roughly or exactly) equal 1.
Corr = 0.3
How can I achieve this?
I wanted to do something like this:
decorelation = rnorm(100,mean=0,sd=1-Corr)
var2 = var1 + decorelation
But of course when running:
cor(var1,var2)
The result is not close to Corr!
Answer 1:
I did something similar a while ago. I am pasting some code that is for 3 correlated variables but it can be easily generalized to something more complex.
Create an F matrix first:
cor_Matrix <- matrix(c (1.00, 0.90, 0.20 ,
0.90, 1.00, 0.40 ,
0.20, 0.40, 1.00),
nrow=3,ncol=3,byrow=TRUE)
This can be any valid correlation matrix (symmetric, with ones on the diagonal, and positive semi-definite).
library(psych)
fit<-principal(cor_Matrix, nfactors=3, rotate="none")
fit$loadings
loadings<-matrix(fit$loadings[1:3, 1:3],nrow=3,ncol=3,byrow=F)
loadings
# create three independent standard normal variables
cases <- t(replicate(3, rnorm(3000)) ) #edited, changed to 3000 cases from 150 cases
multivar <- loadings %*% cases
T_multivar <- t(multivar)
var<-as.data.frame(T_multivar)
cor(var)
Again, this can be generalized. Your approach listed above does not create a multivariate data set.
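For the two-variable case in the question, a more direct route may be enough. This is a minimal sketch, not part of the answer's code above. It relies on the fact that if var2 = var1 + noise and the noise is uncorrelated with var1, the regression slope of var2 on var1 is 1 and the correlation is sd(var1) / sqrt(sd(var1)^2 + sd(noise)^2). Solving for the noise standard deviation gives sd(var1) * sqrt(1 / Corr^2 - 1). The names noise_sd, noise, var2_approx, resid_noise and var2_exact are introduced here for illustration only.

```r
set.seed(32)
n    <- 100
Corr <- 0.3
var1 <- rnorm(n, mean = 0, sd = 1)

# Noise sd that gives the target correlation when the noise is uncorrelated with var1:
# Corr = sd1 / sqrt(sd1^2 + noise_sd^2)  =>  noise_sd = sd1 * sqrt(1 / Corr^2 - 1)
noise_sd <- sd(var1) * sqrt(1 / Corr^2 - 1)

# Approximate version: correlation and slope match only up to sampling error
noise       <- rnorm(n, mean = 0, sd = noise_sd)
var2_approx <- var1 + noise

# Exact version: remove the part of the noise that is linearly related to var1,
# then rescale the remainder so its sample sd is exactly noise_sd
resid_noise <- as.numeric(residuals(lm(noise ~ var1)))
resid_noise <- resid_noise / sd(resid_noise) * noise_sd
var2_exact  <- var1 + resid_noise

cor(var1, var2_approx)           # close to Corr, varies with the seed
cor(var1, var2_exact)            # equals Corr up to floating-point error
coef(lm(var2_exact ~ var1))[2]   # slope equals 1 up to floating-point error
```

Note that with Corr = 0.3 the noise has to be about three times as variable as var1, which is why using sd = 1 - Corr in the question gives a correlation far above 0.3.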
Source: https://stackoverflow.com/questions/17047033/constructing-correlated-variables