How do I calculate correlations between one column and all other columns in a data frame in R without using column names? I tried to use ddply and it works if I use just two col
As of
packageVersion("dplyr")
[1] ‘1.0.2’
the code suggested in one of the answers returns a rowwise tibble in which cormat is a list column holding one correlation matrix per species:
iris %>%
group_by(Species) %>%
do(cormat = cor(select(., -matches("Species"))))
# A tibble: 3 x 2
# Rowwise:
Species cormat
1 setosa
2 versicolor
3 virginica
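To get the data into a rectangular (long) shape, you can pull out the list column and pass it to melt() from the reshape2 package:
library(dplyr)
library(reshape2)  # provides melt()
iris_cor <- iris %>%
  group_by(Species) %>%
  do(cormat = cor(select(., -matches("Species")))) %>%
  pull(cormat) %>%  # unnamed list of three 4 x 4 correlation matrices
  melt()            # one row per (Var1, Var2) pair, plus the list index L1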
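The levels of Species end up encoded as integers in the L1 variable, which holds the position of each matrix in the list (1 = setosa, 2 = versicolor, 3 = virginica):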
          Var1         Var2     value L1
1 Sepal.Length Sepal.Length 1.0000000  1
2  Sepal.Width Sepal.Length 0.7425467  1
3 Petal.Length Sepal.Length 0.2671758  1
4  Petal.Width Sepal.Length 0.2780984  1
...
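Since L1 is only a positional index, one way to get the species names back is to index the factor levels with it. This is a minimal sketch, assuming the groups come out in factor-level order (which is how group_by() orders a factor); the extra Species column is my addition and not part of the code above:

```r
library(dplyr)
library(reshape2)

# Same pipeline as above: one correlation matrix per species, melted to long form
iris_cor <- iris %>%
  group_by(Species) %>%
  do(cormat = cor(select(., -matches("Species")))) %>%
  pull(cormat) %>%
  melt()

# L1 is the position of each matrix in the list; map it back to the factor levels
iris_cor$Species <- levels(iris$Species)[iris_cor$L1]

head(iris_cor, 4)
```

After this, L1 can be dropped if the numeric index is no longer needed.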
I am sure there's a cleaner way of doing this with unnest()
and its friends, but I couldn't figure it out yet. Hopefully someone notices this
and posts a better solution.
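For what it's worth, here is one possible nest()/unnest() route. Treat it as a sketch rather than a definitive answer: it assumes tidyr >= 1.0 is installed, and it swaps melt() for as.data.frame(as.table(...)), which names the value column Freq (renamed to value below). The name iris_cor_long is made up for illustration.

```r
library(dplyr)
library(tidyr)

iris_cor_long <- iris %>%
  group_by(Species) %>%
  nest() %>%                        # one row per species; numeric columns go into `data`
  mutate(cormat = lapply(data, function(d) {
    # correlation matrix of the group's columns, flattened to Var1 / Var2 / Freq
    as.data.frame(as.table(cor(d)))
  })) %>%
  select(-data) %>%
  unnest(cormat) %>%                # expand the list column into regular rows
  ungroup() %>%
  rename(value = Freq)

head(iris_cor_long, 4)
```

This keeps Species as a real column, so there is no L1 index to decode, and it never refers to the measurement columns by name.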