I have measurements for different treatments of an experiment that ran over several rounds, like so:
set.seed(1)
df <- data.frame(treatment = rep(c('base
You can use mutate_each for that:
mydf %>%
  group_by(round) %>%
  mutate_each(funs(. - .[treatment == "baseline"]), -treatment) %>%
  filter(treatment != "baseline")
which gives:
Source: local data frame [10 x 4]
Groups: round [5]

    treatment round measurement1 measurement2
       (fctr) (int)        (dbl)        (dbl)
1  treatment1     1     1.558820   -0.6584485
2  treatment2     1    -0.068677    1.3364462
3  treatment1     2     1.769312   -0.2732490
4  treatment2     2     0.801357   -1.4852449
5  treatment1     3    -1.064394   -1.1513703
6  treatment2     3     2.433222   -0.7939903
7  treatment1     4     0.448744    0.1394982
8  treatment2     4    -1.066922   -1.1410085
9  treatment1     5     1.182761   -0.8311095
10 treatment2     5     0.138005    0.2622119
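In recent dplyr releases, mutate_each() and funs() are deprecated in favour of across(). A minimal sketch of the same per-round baseline subtraction, assuming dplyr 1.0 or later. The data frame below is made-up stand-in data (the question's own example is cut off), so its numbers will not match the output above:

```r
library(dplyr)

# Hypothetical stand-in data: 5 rounds x 3 treatments, two measurement columns
set.seed(1)
mydf <- data.frame(
  treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 5),
  round        = rep(1:5, each = 3),
  measurement1 = rnorm(15),
  measurement2 = rnorm(15)
)

res <- mydf %>%
  group_by(round) %>%
  # subtract each round's baseline value from every measurement column
  mutate(across(starts_with("measurement"),
                ~ .x - .x[treatment == "baseline"])) %>%
  ungroup() %>%
  filter(treatment != "baseline")
```

This relies on there being exactly one baseline row per round; with zero or several, the subtraction would either yield empty values or recycle in unintended ways.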
If you want to add the differences to your dataframe (just as @akrun did in his dplyr / tidyr alternative), you could also do:
mydf %>%
  group_by(round) %>%
  mutate(diff1 = measurement1 - measurement1[treatment == "baseline"],
         diff2 = measurement2 - measurement2[treatment == "baseline"]) %>%
  filter(treatment != "baseline")
which gives:
Source: local data table [10 x 6]

    treatment round measurement1 measurement2     diff1      diff2
       (fctr) (int)        (dbl)        (dbl)     (dbl)      (dbl)
1  treatment1     1     2.630392    -0.104258  1.558820 -0.6584485
2  treatment2     1     1.002895     1.890637 -0.068677  1.3364462
3  treatment1     2     3.822473     3.147443  1.769312 -0.2732490
4  treatment2     2     2.854518     1.935447  0.801357 -1.4852449
5  treatment1     3     1.520553     3.291122 -1.064394 -1.1513703
6  treatment2     3     5.018169     3.648502  2.433222 -0.7939903
7  treatment1     4     4.956380     4.544908  0.448744  0.1394982
8  treatment2     4     3.440714     3.264401 -1.066922 -1.1410085
9  treatment1     5     4.672056     5.082310  1.182761 -0.8311095
10 treatment2     5     3.627300     6.175631  0.138005  0.2622119
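With more than a couple of measurement columns, writing one diff line per column gets repetitive. A sketch of a generic variant, assuming dplyr 1.0 or later: across() with its .names argument keeps the original columns and adds one diff_ column per measurement. The diff_ prefix and the stand-in data are illustrative, not taken from the question:

```r
library(dplyr)

# Made-up data in the same shape as the question's (values are arbitrary)
set.seed(42)
mydf <- data.frame(
  treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 5),
  round        = rep(1:5, each = 3),
  measurement1 = runif(15),
  measurement2 = runif(15)
)

with_diffs <- mydf %>%
  group_by(round) %>%
  # .names controls the names of the new columns; originals are kept
  mutate(across(starts_with("measurement"),
                ~ .x - .x[treatment == "baseline"],
                .names = "diff_{.col}")) %>%
  ungroup() %>%
  filter(treatment != "baseline")

names(with_diffs)
```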
We can use data.table:
library(data.table)
setDT(df)[order(round, treatment),
          tail(.SD, 2) - head(.SD, 1)[rep(1, 2)],
          by = round, .SDcols = 3:4]
Or another option with data.table is:
setDT(df)[, lapply(.SD[, grep("^measurement", names(.SD)), with = FALSE],
                   function(x) x[treatment != "baseline"] -
                               x[treatment == "baseline"]),
          by = round]
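Both data.table calls return only round and the measurement differences, so the treatment label is lost. A sketch that keeps it, assuming exactly one baseline row per round. The data and the names dt, cols and is_base are illustrative:

```r
library(data.table)

# Hypothetical stand-in data (the question's example is truncated)
set.seed(1)
dt <- data.table(
  treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 5),
  round        = rep(1:5, each = 3),
  measurement1 = rnorm(15),
  measurement2 = rnorm(15)
)

# columns to difference, selected by name rather than by position
cols <- grep("^measurement", names(dt), value = TRUE)

res <- dt[, {
  is_base <- treatment == "baseline"
  # keep the non-baseline labels next to the differences
  c(list(treatment = treatment[!is_base]),
    lapply(.SD, function(x) x[!is_base] - x[is_base]))
}, by = round, .SDcols = cols]
```

Selecting the columns by name avoids the dependence on column order that .SDcols = 3:4 has.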
Or using dplyr/tidyr
library(dplyr)
library(tidyr)
gather(df, var, val, measurement1:measurement2) %>%
  spread(treatment, val) %>%
  # column names follow the treatment labels in the output above (no space)
  mutate(diff1 = treatment1 - baseline,
         diff2 = treatment2 - baseline)
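gather() and spread() still work, but from tidyr 1.0 onwards they are superseded by pivot_longer() and pivot_wider(). A sketch of the same reshape with the newer functions, on made-up stand-in data whose treatment levels are baseline, treatment1 and treatment2 (an assumption based on the output shown earlier):

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data (the question's example is truncated)
set.seed(1)
df <- data.frame(
  treatment    = rep(c("baseline", "treatment1", "treatment2"), times = 5),
  round        = rep(1:5, each = 3),
  measurement1 = rnorm(15),
  measurement2 = rnorm(15)
)

res <- df %>%
  # long format: one row per round, treatment and measurement
  pivot_longer(starts_with("measurement"),
               names_to = "var", values_to = "val") %>%
  # one column per treatment, so the differences are plain column arithmetic
  pivot_wider(names_from = treatment, values_from = val) %>%
  mutate(diff1 = treatment1 - baseline,
         diff2 = treatment2 - baseline)
```

Each row of the result is one round/measurement pair, with diff1 and diff2 holding the differences for treatment1 and treatment2 respectively.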