问题

My data looks like this:

I am trying to make it look like this:

I would like to do this in tidyverse using %>%-chaining.

df <- 
structure(list(id = c(2L, 2L, 4L, 5L, 5L, 5L, 5L), start_end = structure(c(2L, 
1L, 2L, 2L, 1L, 2L, 1L), .Label = c(\"end\", \"start\"), class = \"factor\"), 
    date = structure(c(6L, 7L, 3L, 8L, 9L, 10L, 11L), .Label = c(\"1979-01-03\", 
    \"1979-06-21\", \"1979-07-18\", \"1989-09-12\", \"1991-01-04\", \"1994-05-01\", 
    \"1996-11-04\", \"2005-02-01\", \"2009-09-17\", \"2010-10-01\", \"2012-10-06\"
    ), class = \"factor\")), .Names = c(\"id\", \"start_end\", \"date\"
), row.names = c(3L, 4L, 7L, 8L, 9L, 10L, 11L), class = \"data.frame\")

What I have tried:

data.table::dcast( df, formula = id ~ start_end, value.var = \"date\", drop = FALSE )  # does not work because it summarises the data

tidyr::spread( df, start_end, date )  # does not work because of duplicate values


df$id2 <- 1:nrow(df)
tidyr::spread( df, start_end, date ) # does not work because the dataset now has too many rows.

These questions do not answer my question:

Using spread with duplicate identifiers for rows (because they summarise)

R: spread function on data frame with duplicates (because they paste the values together)

Reshaping data in R with "login" "logout" times (because not specifically asking for/answered using tidyverse and chaining)

回答1:

We can use tidyverse. After grouping by 'start_end', 'id', create a sequence column 'ind' , then spread from 'long' to 'wide' format

library(dplyr)
library(tidyr)
df %>%
   group_by(start_end, id) %>%
   mutate(ind = row_number()) %>%
   spread(start_end, date) %>% 
   select(start, end)
#     id      start        end
#* <int>     <fctr>     <fctr>
#1     2 1994-05-01 1996-11-04
#2     4 1979-07-18         NA
#3     5 2005-02-01 2009-09-17
#4     5 2010-10-01 2012-10-06

Or using tidyr_1.0.0

chop(df, date) %>%
     spread(start_end, date) %>%
     unnest(c(start, end))

来源：https://stackoverflow.com/questions/43259380/spread-with-duplicate-identifiers-using-tidyverse-and

标签

dplyr

tidyr

tidyverse

dcast

Spread with duplicate identifiers (using tidyverse and %>%) [duplicate]

问题

My data looks like this:

I am trying to make it look like this:

What I have tried:

回答1: