Within-group differences from group member

遥遥无期 2021-01-16 12:59

I have measurements for different treatments of an experiment that ran over several rounds, like so:

set.seed(1)
df <- data.frame(treatment = rep(c('base

2 Answers
  • 2021-01-16 13:25

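    You can use mutate_each for that (mydf below stands for the data frame from the question; note that mutate_each() and funs() have since been deprecated in dplyr in favour of across()):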

    mydf %>%
      group_by(round) %>%
      mutate_each(funs(. - .[treatment=="baseline"]), -treatment) %>%
      filter(treatment!="baseline")
    

    which gives:

    Source: local data frame [10 x 4]
    Groups: round [5]
    
        treatment round measurement1 measurement2
           (fctr) (int)        (dbl)        (dbl)
    1  treatment1     1     1.558820   -0.6584485
    2  treatment2     1    -0.068677    1.3364462
    3  treatment1     2     1.769312   -0.2732490
    4  treatment2     2     0.801357   -1.4852449
    5  treatment1     3    -1.064394   -1.1513703
    6  treatment2     3     2.433222   -0.7939903
    7  treatment1     4     0.448744    0.1394982
    8  treatment2     4    -1.066922   -1.1410085
    9  treatment1     5     1.182761   -0.8311095
    10 treatment2     5     0.138005    0.2622119
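
    Since mutate_each() and funs() are deprecated in current dplyr, the same within-round subtraction can be sketched with across(). This is only a minimal sketch: the toy data frame and the names toy and res are made up for illustration (they are not the question's data), and it assumes each round has exactly one "baseline" row.

    ```r
    library(dplyr)

    # Hypothetical toy data (not the question's data): 2 rounds x 3 treatments
    toy <- data.frame(
      treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 2),
      round        = rep(1:2, each = 3),
      measurement1 = c(1, 3, 4, 2, 2, 5),
      measurement2 = c(10, 9, 12, 20, 25, 20)
    )

    res <- toy %>%
      group_by(round) %>%
      # within each round, subtract the baseline row's value from every row
      mutate(across(starts_with("measurement"),
                    ~ .x - .x[treatment == "baseline"])) %>%
      ungroup() %>%
      # the baseline rows are now all zero, so drop them
      filter(treatment != "baseline")

    print(res)
    ```

    Selecting the columns with starts_with("measurement") instead of -treatment avoids relying on which columns happen to be grouping variables.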
    

    If you want to add the differences to your dataframe (just as @akrun did in his dplyr / tidyr alternative), you could also do:

    mydf %>%
      group_by(round) %>%
      mutate(diff1 = measurement1 - measurement1[treatment=="baseline"],
             diff2 = measurement2 - measurement2[treatment=="baseline"]) %>%
      filter(treatment!="baseline")
    

    which gives:

    Source: local data table [10 x 6]
    
        treatment round measurement1 measurement2     diff1      diff2
           (fctr) (int)        (dbl)        (dbl)     (dbl)      (dbl)
    1  treatment1     1     2.630392    -0.104258  1.558820 -0.6584485
    2  treatment2     1     1.002895     1.890637 -0.068677  1.3364462
    3  treatment1     2     3.822473     3.147443  1.769312 -0.2732490
    4  treatment2     2     2.854518     1.935447  0.801357 -1.4852449
    5  treatment1     3     1.520553     3.291122 -1.064394 -1.1513703
    6  treatment2     3     5.018169     3.648502  2.433222 -0.7939903
    7  treatment1     4     4.956380     4.544908  0.448744  0.1394982
    8  treatment2     4     3.440714     3.264401 -1.066922 -1.1410085
    9  treatment1     5     4.672056     5.082310  1.182761 -0.8311095
    10 treatment2     5     3.627300     6.175631  0.138005  0.2622119
    
  • 2021-01-16 13:31

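    We can use data.table:

    library(data.table)
    # Sort so that "baseline" is the first row of each round, then subtract that
    # row (repeated twice) from the two treatment rows; columns 3:4 are the measurements.
    setDT(df)[order(round, treatment), tail(.SD, 2) - head(.SD, 1)[rep(1, 2)],
              by = round, .SDcols = 3:4]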
    

    Or another option with data.table is

    setDT(df)[, lapply(.SD[, grep("^measurement", names(.SD)), with = FALSE],
                       function(x) x[treatment != "baseline"] - x[treatment == "baseline"]),
              by = round]
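
    The second option matches on the treatment label rather than on row position, so it does not depend on sort order. A compact sketch of the same idea, assuming a data.table version that supports patterns() in .SDcols and exactly one "baseline" row per round; the toy data and the names toy, dt and res are hypothetical:

    ```r
    library(data.table)

    # Hypothetical toy data (not the question's data): 2 rounds x 3 treatments
    toy <- data.frame(
      treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 2),
      round        = rep(1:2, each = 3),
      measurement1 = c(1, 3, 4, 2, 2, 5),
      measurement2 = c(10, 9, 12, 20, 25, 20)
    )
    dt <- as.data.table(toy)

    # For every measurement column, subtract the round's baseline value
    # from the non-baseline values of that round
    res <- dt[, lapply(.SD, function(x) x[treatment != "baseline"] -
                                        x[treatment == "baseline"]),
              by = round, .SDcols = patterns("^measurement")]

    print(res)
    ```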
    

    Or using dplyr/tidyr

    library(dplyr)
    library(tidyr)
    gather(df, var, val, measurement1:measurement2) %>%
      spread(treatment, val) %>%
      mutate(diff1 = `treatment 1` - baseline,
             diff2 = `treatment 2` - baseline)
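
    gather() and spread() are superseded in current tidyr, so the same reshape can be sketched with pivot_longer() and pivot_wider(). The toy data and the names toy and wide are hypothetical, and the treatment labels here are assumed to contain no space ("treatment1", "treatment2"); adjust the column names in mutate() to whatever labels the real data uses.

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical toy data (not the question's data): 2 rounds x 3 treatments
    toy <- data.frame(
      treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 2),
      round        = rep(1:2, each = 3),
      measurement1 = c(1, 3, 4, 2, 2, 5),
      measurement2 = c(10, 9, 12, 20, 25, 20)
    )

    wide <- toy %>%
      # one row per (treatment, round, measurement variable)
      pivot_longer(starts_with("measurement"),
                   names_to = "var", values_to = "val") %>%
      # one column per treatment, so baseline sits next to the treatments
      pivot_wider(names_from = treatment, values_from = val) %>%
      mutate(diff1 = treatment1 - baseline,
             diff2 = treatment2 - baseline) %>%
      arrange(round, var)

    print(wide)
    ```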
    