The idea here is to gather() all the time variables (all variables but Subject), use separate() on key to split them into a label and a time, and then spread() the label and value to obtain your desired output.
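library(dplyr)
library(tidyr)

sample.df %>%
  # stack every column except Subject into key/value pairs
  gather(key, value, -Subject) %>%
  # split each key between its last lowercase letter and the digit that follows
  separate(key, into = c("label", "time"), sep = "(?<=[a-z])(?=[0-9])") %>%
  # turn each label back into its own column
  spread(label, value)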
Which gives:
#  Subject time BlueTime GreenTime RedTime
#1       1    1        2         2       2
#2       1    2        4         4       4
#3       1    3        1         1       1
#4       2    1        5         5       5
#5       2    2        6         6       6
#6       2    3        2         2       2
#7       3    1        6         6       6
#8       3    2        7         7       7
#9       3    3        3         3       3
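If you want to run the pipeline above end to end, here is a minimal sketch. The original sample.df is not shown in this answer, so the data frame below is an assumption: it is reconstructed from the output, with hypothetical column names of the form BlueTime1, BlueTime2, BlueTime3 and the same values for every color.

```r
library(dplyr)
library(tidyr)

# Hypothetical data, reconstructed from the output shown above.
# The real sample.df may have different column names or values.
sample.df <- data.frame(
  Subject    = 1:3,
  BlueTime1  = c(2, 5, 6), BlueTime2  = c(4, 6, 7), BlueTime3  = c(1, 2, 3),
  GreenTime1 = c(2, 5, 6), GreenTime2 = c(4, 6, 7), GreenTime3 = c(1, 2, 3),
  RedTime1   = c(2, 5, 6), RedTime2   = c(4, 6, 7), RedTime3   = c(1, 2, 3)
)

result <- sample.df %>%
  gather(key, value, -Subject) %>%                       # wide -> long
  separate(key, into = c("label", "time"),
           sep = "(?<=[a-z])(?=[0-9])") %>%              # "BlueTime1" -> "BlueTime", "1"
  spread(label, value)                                   # one column per label

print(result)
```

Note that time comes out of separate() as a character column; passing convert = TRUE to separate() should turn it into an integer if you need to do arithmetic on it.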
Note
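Here we use the regex in separate() from this answer by @RichardScriven to split the column at the first encountered digit. The pattern matches the zero-width position between a lowercase letter and a digit, so no characters are lost in the split.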
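To see where that regex matches, here is a small base R sketch that marks the split position with a "|" instead of splitting. The key names are made up for illustration, and sub() is used only to visualise the match; it is not what separate() does internally.

```r
# perl = TRUE is needed for lookbehind (?<=...) and lookahead (?=...)
pattern <- "(?<=[a-z])(?=[0-9])"

# Hypothetical column names
keys <- c("BlueTime1", "GreenTime2", "RedTime10")

# Insert a marker at the first position preceded by a lowercase
# letter and followed by a digit
marked <- sub(pattern, "|", keys, perl = TRUE)
print(marked)
# "BlueTime|1"  "GreenTime|2" "RedTime|10"
```

Because the lookbehind requires a lowercase letter, a multi-digit suffix such as 10 stays in one piece.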
Edit
I understand from your comments that your dataset column names are actually in the form ColorTime_Pre, ColorTime_Post, ColorTime_Final. If that is the case, you don't have to specify a regex in separate(), as the default one, sep = "[^[:alnum:]]+", will match your _ and split the key into label and time accordingly:
sample.df %>%
  gather(key, value, -Subject) %>%
  separate(key, into = c("label", "time")) %>%
  spread(label, value)
This will give:
#  Subject  time BlueTime GreenTime RedTime
#1       1 Final        1         1       1
#2       1  Post        4         4       4
#3       1   Pre        2         2       2
#4       2 Final        2         2       2
#5       2  Post        6         6       6
#6       2   Pre        5         5       5
#7       3 Final        3         3       3
#8       3  Post        7         7       7
#9       3   Pre        6         6       6
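The rows above come out sorted alphabetically by time (Final, Post, Pre). If you would rather have them in chronological order, one option is to turn time into a factor before spreading. This is only a sketch: the sample.df below is hypothetical, reconstructed from the output, and the level order Pre, Post, Final is my assumption about what you want.

```r
library(dplyr)
library(tidyr)

# Hypothetical data, reconstructed from the output shown above
sample.df <- data.frame(
  Subject = 1:3,
  BlueTime_Pre  = c(2, 5, 6), BlueTime_Post  = c(4, 6, 7), BlueTime_Final  = c(1, 2, 3),
  GreenTime_Pre = c(2, 5, 6), GreenTime_Post = c(4, 6, 7), GreenTime_Final = c(1, 2, 3),
  RedTime_Pre   = c(2, 5, 6), RedTime_Post   = c(4, 6, 7), RedTime_Final   = c(1, 2, 3)
)

result <- sample.df %>%
  gather(key, value, -Subject) %>%
  separate(key, into = c("label", "time")) %>%           # default sep splits on "_"
  # assumed ordering of the measurement occasions
  mutate(time = factor(time, levels = c("Pre", "Post", "Final"))) %>%
  spread(label, value)

print(result)
```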