R: calculate variance for data$V1 for each different value in data$V2

南笙 2020-12-19 14:53

I have a data frame that looks like this:

V1   V2
..   1
..   2
..   1
..   3

etc.

For each distinct V2 value I would like to calculate the variance of V1.

5 Answers
  • 2020-12-19 15:28

    And the old standby, tapply:

    dat <- data.frame(x = runif(50), y = rep(letters[1:5],each = 10))
    tapply(dat$x,dat$y,FUN = var)
    
             a          b          c          d          e 
    0.03907351 0.10197081 0.08036828 0.03075195 0.08289562 
    
  • 2020-12-19 15:37

    Another solution uses data.table. It is typically much faster, which is especially useful with large data sets.

    library(data.table)
    # dat is assumed to be a data frame with columns V1 and V2
    dat2 <- as.data.table(dat)
    ans  <- dat2[, list(variance = var(V1)), by = V2]
    
  • 2020-12-19 15:39

    There are a few ways to do this; I prefer aggregate:

    dat <- data.frame(V1 = rnorm(50), V2 = rep(1:5, 10))
    dat
    
    # The formula V1 ~ V2 groups V1 by the values of V2;
    # the last argument is the function applied to each group.
    aggregate(V1 ~ V2, data = dat, FUN = var)
    
    > aggregate(V1 ~ V2, data = dat, FUN = var)
      V2        V1
    1  1 0.9139360
    2  2 1.6222236
    3  3 1.2429743
    4  4 1.1889356
    5  5 0.7000294
    

    Also look into ddply, daply, etc. in the plyr package.

  • 2020-12-19 15:42
    library(plyr)  # ddply() is provided by plyr
    ddply(data, .(V2), summarise, variance = var(V1))
    
  • 2020-12-19 15:42

    Using dplyr, you can do:

    library(dplyr)
    data %>%
      group_by(V2) %>%
      summarize(var = var(V1))
    

    Here we group by the unique values of V2 and find the variance of V1 for each group.
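
To make the grouping idea in the answers above concrete, here is a minimal base R sketch that splits V1 by V2 by hand and checks the result against tapply and aggregate. The data are made up for illustration (the seed, sample size, and the names by_hand, via_tapply, and via_aggregate are assumptions, not part of the original question).

```r
# Hypothetical example data shaped like the question's data frame
set.seed(42)
dat <- data.frame(V1 = rnorm(50), V2 = rep(1:5, 10))

# Split V1 into one vector per distinct V2 value, then take the variance of each
by_hand <- sapply(split(dat$V1, dat$V2), var)

# The same computation via tapply and aggregate
via_tapply    <- tapply(dat$V1, dat$V2, FUN = var)
via_aggregate <- aggregate(V1 ~ V2, data = dat, FUN = var)

# Compare the values only, ignoring names and array attributes
all.equal(unname(by_hand), as.vector(via_tapply))
all.equal(unname(by_hand), via_aggregate$V1)
```

The split/sapply version is what the grouped approaches do conceptually; the packaged functions mainly differ in the shape of the result (named vector, 1-d array, or data frame) and in speed.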
