Question
I would like to use nested full_join calls to merge several data frames together. In addition, I am hoping to add suffixes to all of the columns so that, once the data frames are merged, each column name indicates which data frame it came from (e.g., a unique time identifier like T1, T2, ...).
x <- data.frame(i = c("a","b","c"), j = 1:3, h = 1:3, stringsAsFactors=FALSE)
y <- data.frame(i = c("b","c","d"), k = 4:6, h = 1:3, stringsAsFactors=FALSE)
z <- data.frame(i = c("c","d","a"), l = 7:9, h = 1:3, stringsAsFactors=FALSE)
full_join(x, y, by='i') %>% left_join(., z, by='i')
Is there a way to integrate the default suffix option so that I get a dataset with column names that look like:
column_names <- c("i", "j_T1", "h_T1", "k_T2", "h_T2", "l_T3", "h_T3")
Answer 1:
I think this can be done by working on the column headers with purrr, but I've used pivot_longer and pivot_wider to change the header names:
library(dplyr)
library(tidyr)
library(stringr)

df <- x %>%
  full_join(y, by = "i") %>%
  full_join(z, by = "i") %>%
  # move the column headers into a column so they can be edited as strings
  pivot_longer(cols = -i,
               names_to = "columns",
               values_to = "values") %>%
  # escape the dot and anchor at the end so only the join suffixes match
  mutate(columns = str_replace(columns, "\\.x$", "_T2"),
         columns = str_replace(columns, "\\.y$", "_T3"),
         columns = case_when(!str_detect(columns, "T") ~ paste0(columns, "_T1"),
                             TRUE ~ columns)) %>%
  # spread the edited headers back out into columns
  pivot_wider(names_from = columns,
              values_from = values)
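The resulting headers (j_T1, h_T2, k_T1, h_T3, l_T1, h_T1) don't match the ones listed in the question, because full_join only adds its .x/.y suffixes to columns whose names clash in a given join. Hopefully this code will still help to get you started if the order is important and column l should be T3 (there was only one such column in this example).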
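As a possible alternative along the purrr lines mentioned above, here is a minimal sketch that tags every non-key column before joining, so no suffix handling is needed afterwards. It assumes dplyr 1.0.0 or later (for rename_with) and that the data frames are collected in a named list; the names dfs, suffixed, and result are only illustrative and not from the original post.

```r
library(dplyr)
library(purrr)

x <- data.frame(i = c("a","b","c"), j = 1:3, h = 1:3, stringsAsFactors = FALSE)
y <- data.frame(i = c("b","c","d"), k = 4:6, h = 1:3, stringsAsFactors = FALSE)
z <- data.frame(i = c("c","d","a"), l = 7:9, h = 1:3, stringsAsFactors = FALSE)

# The list names double as the suffixes (an assumption for this sketch)
dfs <- list(T1 = x, T2 = y, T3 = z)

# Append "_<tag>" to every column except the join key i
suffixed <- imap(dfs, function(df, tag) {
  rename_with(df, function(nm) paste0(nm, "_", tag), .cols = -i)
})

# Join the renamed data frames pairwise, left to right
result <- reduce(suffixed, function(a, b) full_join(a, b, by = "i"))

names(result)
# "i" "j_T1" "h_T1" "k_T2" "h_T2" "l_T3" "h_T3"
```

Because the columns are already unique when the joins run, the order of the headers follows the order of the list, which matches the column_names vector in the question.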
Source: https://stackoverflow.com/questions/65152352/suffixes-when-merging-more-than-two-data-frames-with-full-join