I want to gather two separate groups of columns into two key-value pairs. Here's some example data:
library(dplyr)
library(tidyr)
ID = c(1:5)
measure1 = c(1:5)
measure2 = c(6:10)
letter1 = c("a", "b", "c", "d", "e")
letter2 = c("f", "g", "h", "i", "j")
df = data.frame(ID, measure1, measure2, letter1, letter2)
df = as_tibble(df)  # tbl_df() is deprecated in current dplyr
df$letter1 <- as.character(df$letter1)
df$letter2 <- as.character(df$letter2)
I want the values of the two measure columns (measure1 and measure2) to be in one column with a key column next to it (the key-value pair). I also want the same for letter1 and letter2. I figured that I could use select() to create two different datasets, use gather() separately on both datasets and then join them (this worked):
df_measure = df %>%
  select(ID, measure1, measure2) %>%
  gather(measure_time, measure, -ID) %>%
  mutate(id.extra = c(1:10))
df_letter = df %>%
  select(ID, letter1, letter2) %>%
  gather(letter_time, letter, -ID) %>%
  mutate(id.extra = c(1:10))
df_long = df_measure %>%
  left_join(df_letter, by = "id.extra")
So this works perfectly (in this case), but I guess this could be done more elegantly (without stuff like splitting or creating 'id.extra'). So please shed some light on it!
You can use something like the following. I'm not sure whether this is exactly your desired output, since the result of your current approach seems to contain a lot of redundant information.
df %>%
  gather(val, var, -ID) %>%
  extract(val, c("value", "time"), regex = "([a-z]+)([0-9]+)") %>%
  spread(value, var)
# # A tibble: 10 × 4
# ID time letter measure
# * <int> <chr> <chr> <chr>
# 1 1 1 a 1
# 2 1 2 f 6
# 3 2 1 b 2
# 4 2 2 g 7
# 5 3 1 c 3
# 6 3 2 h 8
# 7 4 1 d 4
# 8 4 2 i 9
# 9 5 1 e 5
# 10 5 2 j 10
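Note that in the output above measure comes back as <chr>, because gather() stacks the numbers and the letters into a single column first. If you are on tidyr 1.0.0 or later (an assumption about your setup, the original answer predates it), pivot_longer() with the special ".value" name should do the same reshape in one step while keeping each column's type. A minimal sketch, rebuilding the example data as a plain data frame and reusing the same regex:

```r
library(tidyr)

# Same example data as in the question, rebuilt here so the snippet is self-contained
df <- data.frame(
  ID = 1:5,
  measure1 = 1:5,
  measure2 = 6:10,
  letter1 = c("a", "b", "c", "d", "e"),
  letter2 = c("f", "g", "h", "i", "j"),
  stringsAsFactors = FALSE
)

# ".value" tells pivot_longer() that the first regex group ("measure" / "letter")
# names the output column, while the second group (the digit) goes into "time"
df_long <- pivot_longer(
  df,
  cols = -ID,
  names_to = c(".value", "time"),
  names_pattern = "([a-z]+)([0-9]+)"
)

df_long
```

This should give one row per ID and time, with measure still an integer column and letter a character column, and without needing a helper like id.extra.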
This is much more easily done with melt + patterns from "data.table":
library(data.table)
melt(as.data.table(df), measure.vars = patterns("measure", "letter"))
Or you can be old-school and just use reshape from base R. Note, however, that base R's reshape does not like "tibbles", so you have to convert the tibble with as.data.frame first.
reshape(as.data.frame(df), direction = "long", idvar = "ID",
varying = 2:ncol(df), sep = "")
We can use melt from data.table, which can take multiple measure patterns:
library(data.table)
melt(setDT(df), measure = patterns("^measure", "^letter"),
value.name = c("measure", "letter"))
# ID variable measure letter
# 1: 1 1 1 a
# 2: 2 1 2 b
# 3: 3 1 3 c
# 4: 4 1 4 d
# 5: 5 1 5 e
# 6: 1 2 6 f
# 7: 2 2 7 g
# 8: 3 2 8 h
# 9: 4 2 9 i
#10: 5 2 10 j
Source: https://stackoverflow.com/questions/43293951/using-gather-to-gather-two-or-more-groups-of-columns-into-two-or-more-key