Calculate correlation for more than two variables?

眼角桃花 2021-01-30 02:36

I use the following method to calculate a correlation of my dataset:

cor(var1, var2, method = "method")

But I like to create a correlation m

5 answers
  • 2021-01-30 02:58

    See the corr.test function in the psych package:

    > corr.test(mtcars[1:4])
    Call:corr.test(x = mtcars[1:4])
    Correlation matrix 
           mpg   cyl  disp    hp
    mpg   1.00 -0.85 -0.85 -0.78
    cyl  -0.85  1.00  0.90  0.83
    disp -0.85  0.90  1.00  0.79
    hp   -0.78  0.83  0.79  1.00
    Sample Size 
         mpg cyl disp hp
    mpg   32  32   32 32
    cyl   32  32   32 32
    disp  32  32   32 32
    hp    32  32   32 32
    Probability value 
         mpg cyl disp hp
    mpg    0   0    0  0
    cyl    0   0    0  0
    disp   0   0    0  0
    hp     0   0    0  0
    

    And yet another shameless self-advert: https://gist.github.com/887249
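
    The probability values in the output above print as 0, presumably because the printed table is rounded. If the unrounded values matter, a base-R sketch along the following lines builds a p-value matrix with cor.test(). The helper name cor_pvalues is made up for this illustration and is not part of psych or base R.

```r
# Base-R sketch: matrix of pairwise correlation p-values via cor.test().
# `cor_pvalues` is a hypothetical helper name, not a function from any package.
cor_pvalues <- function(df, method = "pearson") {
  vars <- colnames(df)
  # Diagonal stays NA: testing a variable against itself is not meaningful here.
  p <- matrix(NA_real_, nrow = length(vars), ncol = length(vars),
              dimnames = list(vars, vars))
  for (i in seq_along(vars)) {
    for (j in seq_len(i - 1)) {
      # Compute each pair once and mirror it across the diagonal.
      p[i, j] <- p[j, i] <- cor.test(df[[i]], df[[j]], method = method)$p.value
    }
  }
  p
}

p_mat <- cor_pvalues(mtcars[1:4])
signif(p_mat, 3)  # shows the p-values to 3 significant digits
```

    These are unadjusted pairwise p-values; no correction for multiple testing is applied in this sketch.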

  • 2021-01-30 03:05

    Use the same function (cor) on a matrix or data frame, e.g.:

    > cor(VADeaths)
                 Rural Male Rural Female Urban Male Urban Female
    Rural Male    1.0000000    0.9979869  0.9841907    0.9934646
    Rural Female  0.9979869    1.0000000  0.9739053    0.9867310
    Urban Male    0.9841907    0.9739053  1.0000000    0.9918262
    Urban Female  0.9934646    0.9867310  0.9918262    1.0000000
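
    The method argument from the question works the same way in the matrix form, and the use argument controls how missing values are handled. A small sketch, where the missing values are injected only for illustration:

```r
# Correlation matrix with a non-default method and missing-value handling.
df <- mtcars[, c("mpg", "hp", "wt")]
df$hp[c(3, 7)] <- NA  # inject missing values purely for illustration

# Default (use = "everything"): any entry involving a column with NA is NA.
m_default <- cor(df)

# Rank correlation, dropping incomplete observations pair by pair.
m_pairwise <- cor(df, method = "spearman", use = "pairwise.complete.obs")
m_pairwise
```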
    

    Or, on a data frame that also holds discrete variables (sometimes referred to as factors), restrict it to the numeric columns first, for example:

    > cor(mtcars[,unlist(lapply(mtcars, is.numeric))])
                mpg        cyl       disp         hp        drat         wt        qsec         vs          am       gear        carb
    mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684  0.68117191 -0.8676594  0.41868403  0.6640389  0.59983243  0.4802848 -0.55092507
    cyl  -0.8521620  1.0000000  0.9020329  0.8324475 -0.69993811  0.7824958 -0.59124207 -0.8108118 -0.52260705 -0.4926866  0.52698829
    disp -0.8475514  0.9020329  1.0000000  0.7909486 -0.71021393  0.8879799 -0.43369788 -0.7104159 -0.59122704 -0.5555692  0.39497686
    hp   -0.7761684  0.8324475  0.7909486  1.0000000 -0.44875912  0.6587479 -0.70822339 -0.7230967 -0.24320426 -0.1257043  0.74981247
    drat  0.6811719 -0.6999381 -0.7102139 -0.4487591  1.00000000 -0.7124406  0.09120476  0.4402785  0.71271113  0.6996101 -0.09078980
    wt   -0.8676594  0.7824958  0.8879799  0.6587479 -0.71244065  1.0000000 -0.17471588 -0.5549157 -0.69249526 -0.5832870  0.42760594
    qsec  0.4186840 -0.5912421 -0.4336979 -0.7082234  0.09120476 -0.1747159  1.00000000  0.7445354 -0.22986086 -0.2126822 -0.65624923
    vs    0.6640389 -0.8108118 -0.7104159 -0.7230967  0.44027846 -0.5549157  0.74453544  1.0000000  0.16834512  0.2060233 -0.56960714
    am    0.5998324 -0.5226070 -0.5912270 -0.2432043  0.71271113 -0.6924953 -0.22986086  0.1683451  1.00000000  0.7940588  0.05753435
    gear  0.4802848 -0.4926866 -0.5555692 -0.1257043  0.69961013 -0.5832870 -0.21268223  0.2060233  0.79405876  1.0000000  0.27407284
    carb -0.5509251  0.5269883  0.3949769  0.7498125 -0.09078980  0.4276059 -0.65624923 -0.5696071  0.05753435  0.2740728  1.00000000
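
    Every column of mtcars is already numeric, so the filter above keeps all of them. The effect is easier to see on a data set that really contains a factor, such as the built-in iris data, where calling cor(iris) directly stops with an error. A minimal sketch:

```r
# iris has four numeric columns plus the factor column Species.
num_cols <- sapply(iris, is.numeric)  # named logical vector, FALSE for Species

# Keep only the numeric columns before computing the correlation matrix.
iris_cor <- cor(iris[, num_cols])
round(iris_cor, 2)
```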
    
  • 2021-01-30 03:06

    If you would like to combine the matrix with some visualisations, I can recommend the following (using the built-in iris dataset):

    library(psych)
    pairs.panels(iris[1:4])  # select columns 1-4
    

    The PerformanceAnalytics package basically does the same but includes significance indicators by default.

    library(PerformanceAnalytics)
    chart.Correlation(iris[1:4])
    

    Or this nice and simple visualisation:

    library(corrplot)
    x <- cor(iris[1:4])
    corrplot(x, type="upper", order="hclust")
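
    If installing extra packages is not an option, a rough base-graphics approximation is possible with pairs() and a custom panel function. This is only a sketch, not a replacement for the packages above; panel_cor is a name chosen for this example.

```r
# Base-graphics sketch: scatterplot matrix with correlations in the upper panels.
# `panel_cor` is a hypothetical helper name used only in this example.
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1))  # switch the panel to unit coordinates
  on.exit(par(old))                # restore the previous coordinates on exit
  text(0.5, 0.5, format(cor(x, y), digits = 2))
}

# Lower panels keep the default scatterplots.
pairs(iris[1:4], upper.panel = panel_cor)
```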
    

  • 2021-01-30 03:17

    You can also calculate correlations for all variables but exclude selected ones, for example:

    mtcars <- data.frame(mtcars)
    # here we exclude gear and carb variables
    cors <- cor(subset(mtcars, select = c(-gear,-carb)))
    

    Also, to calculate the correlation between each variable and one column, you can use sapply():

    # sapply calls cor() once for each column of mtcars, passing y = mtcars$mpg each time
    cors2 <- sapply(mtcars, cor, y=mtcars$mpg)
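
    The result of the sapply() call is a named numeric vector, so it can be filtered and sorted like any other vector. A short sketch that drops the trivial correlation of mpg with itself and ranks the rest by strength (the variable names here are illustrative):

```r
# Correlation of every column with mpg, as a named numeric vector.
cors_mpg <- sapply(mtcars, cor, y = mtcars$mpg)

# The same numbers are also available as one column of the full matrix.
from_matrix <- cor(mtcars)[, "mpg"]

# Drop mpg itself (always 1) and rank by absolute correlation.
cors_mpg <- cors_mpg[names(cors_mpg) != "mpg"]
ranked <- cors_mpg[order(abs(cors_mpg), decreasing = TRUE)]
head(ranked, 3)
```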
    
  • 2021-01-30 03:19

    You might want to look at Quick-R, which has a lot of nice little tutorials on how to do basic statistics in R. For example, on correlations:

    http://www.statmethods.net/stats/correlations.html
