Expand ranges defined by “from” and “to” columns

悲哀的现实 2020-11-22 07:02

I have a data frame containing the "name" of U.S. Presidents and the years when they started and ended their terms in office ("from" and "to" columns).
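
For readers who want to run the answers, a data frame of this shape can be sketched as follows. The three rows are an assumption chosen to be consistent with the 22-row output shown in the answers; they are not copied from the question itself.

```r
# Hypothetical sample input: one row per president, with an inclusive
# range of years given by the "from" and "to" columns.
presidents <- data.frame(
  name = c("Bill Clinton", "George W. Bush", "Barack Obama"),
  from = c(1993, 2001, 2009),
  to   = c(2001, 2009, 2012)
)

# Expanding each range yields one row per year, endpoints included,
# so the expanded result should have this many rows:
expected_rows <- sum(presidents$to - presidents$from + 1)
expected_rows  # 22
```

The row count is a handy sanity check for any of the approaches in the answers.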

9 Answers
  • 2020-11-22 07:26

    Another solution using dplyr and tidyr:

    library(magrittr) # for pipes
    df <- data.frame(tata = c('toto1', 'toto2'), from = c(2000, 2004), to = c(2001, 2009))
    
    #    tata from   to
    # 1 toto1 2000 2001
    # 2 toto2 2004 2009
    
    df %>% 
      tibble::as_tibble() %>%  # dplyr::as.tbl() is deprecated as of dplyr 1.0.0
      dplyr::rowwise() %>%
      dplyr::mutate(combined = list(seq(from, to))) %>%
      dplyr::select(-from, -to) %>%
      tidyr::unnest(combined)
    
    #   tata  combined
    #   <fct>    <int>
    # 1 toto1     2000
    # 2 toto1     2001
    # 3 toto2     2004
    # 4 toto2     2005
    # 5 toto2     2006
    # 6 toto2     2007
    # 7 toto2     2008
    # 8 toto2     2009
    
  • 2020-11-22 07:29

    An alternate tidyverse approach using unnest and map2.

    library(tidyverse)
    
    presidents %>%
      unnest(year = map2(from, to, seq)) %>%
      select(-from, -to)
    
    #              name  year
    # 1    Bill Clinton  1993
    # 2    Bill Clinton  1994
    ...
    # 21   Barack Obama  2011
    # 22   Barack Obama  2012
    

    Edit: As of tidyr v1.0.0, new variables can no longer be created inside unnest(), so build the list column with mutate() first:

    presidents %>%
      mutate(year = map2(from, to, seq)) %>%
      unnest(year) %>%
      select(-from, -to)
    
  • 2020-11-22 07:31

    Use by() to build a list L of data frames, one per president, and then rbind them together. No packages are needed.

    L <- by(presidents, presidents$name, with, data.frame(name, year = from:to))
    do.call("rbind", setNames(L, NULL))
    

    If you don't mind row names then the last line could be reduced to just:

    do.call("rbind", L)
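
    A self-contained run of the two variants above might look like the sketch below. The `presidents` data frame is a hypothetical sample (assumed here, not taken from the question), and `expanded` and `with_row_names` are illustrative names.

    ```r
    # Hypothetical sample data, assumed for illustration only.
    presidents <- data.frame(
      name = c("Bill Clinton", "George W. Bush", "Barack Obama"),
      from = c(1993, 2001, 2009),
      to   = c(2001, 2009, 2012)
    )

    # One data frame per president, each holding the expanded years.
    L <- by(presidents, presidents$name, with, data.frame(name, year = from:to))

    # Variant 1: drop the group names first, so rbind numbers the rows.
    expanded <- do.call("rbind", setNames(L, NULL))

    # Variant 2: keep the group names, which rbind folds into the row names.
    with_row_names <- do.call("rbind", L)

    nrow(expanded)  # one row per president-year
    ```

    Both variants should hold the same rows and differ only in their row names. Note that by() groups on the values of name, so the rows are expected to come back grouped by name rather than in the original input order.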
    