I want to combine two variables into one with a date format

再見小時候 2021-01-26 22:59

I have a data set with a character column for months (MONTH) and a numeric column indicating years (YEAR). In order to work with it as panel data, I ne

5 Answers
  • 2021-01-26 23:18

    You could simplify the code below, but the longer form makes it easier to see what's going on:

    library(lubridate)
    library(tidyverse)
    
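    df2 <- df %>% 
      # Parse e.g. "JAN2018" into a date-time (first day of that month)
      mutate(TIME = parse_date_time(paste0(MONTH, YEAR), orders = "%b%Y"),
             # Characters 6-7 of "2018-01-01" are the two-digit month
             TIME = as.character(substr(TIME, 6, 7)),
             # Append the year to get MM-YYYY
             TIME = paste0(TIME, "-", YEAR))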
    

    This uses lubridate (IMO the easiest way to parse dates in R), dplyr from the tidyverse, and substr from base R.

    If you want to keep the date column then just pipe in another mutate and call the new column something different.
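
Keeping the parsed date next to the formatted string could look like the sketch below. The column name DATE and the three-row example frame are assumptions made for illustration, and month abbreviations are parsed according to the current locale, so this assumes English month names.

```r
library(dplyr)
library(lubridate)

# Illustrative frame with the same MONTH/YEAR layout as the question
df <- data.frame(
  MONTH = c("JAN", "FEB", "MAR"),
  YEAR  = c(2018, 2018, 2018)
)

df2 <- df %>%
  # DATE (hypothetical name) keeps a real Date: first day of each month
  mutate(DATE = as.Date(parse_date_time(paste(MONTH, YEAR), orders = "bY"))) %>%
  # TIME is the MM-YYYY string, derived from DATE in a second mutate
  mutate(TIME = format(DATE, "%m-%Y"))
```

DATE stays usable for sorting and date arithmetic, while TIME is only a display string.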

  • 2021-01-26 23:27

    If you wish to use a full-on Tidyverse solution, consider this combination of tidyr's unite and lubridate's parse_date_time:

    library(tidyverse)
    df <- tibble::tribble(
      ~STATE,      ~MONTH,      ~YEAR,   ~VALUE,
    "California",     "JAN",      2018,      800,
    "California",     "FEB",      2018,      780,
    "California",     "MAR",      2018,      600,
    "Minesota",       "JAN",      2018,      800,
    "Minesota",       "FEB",      2018,      780,
    "Minesota",       "MAR",      2018,      600)
    
    df %>%
       tidyr::unite(TIME, c(MONTH, YEAR), sep = "-") %>%
       dplyr::mutate(TIME = lubridate::parse_date_time(TIME, "my"))
    #> # A tibble: 6 x 3
    #>   STATE      TIME                VALUE
    #>   <chr>      <dttm>              <dbl>
    #> 1 California 2018-01-01 00:00:00   800
    #> 2 California 2018-02-01 00:00:00   780
    #> 3 California 2018-03-01 00:00:00   600
    #> 4 Minesota   2018-01-01 00:00:00   800
    #> 5 Minesota   2018-02-01 00:00:00   780
    #> 6 Minesota   2018-03-01 00:00:00   600
    

    Also check out the following related question: Converting year and month ("yyyy-mm" format) to a date?

  • 2021-01-26 23:30

    Combining Tim's answer with lubridate, an easy-to-use date package, we get:

    # Handles month abbreviations (JAN, FEB, etc.) as well as numeric months (01, 02, etc.).
    df$TIME <- lubridate::ymd(paste0(df$YEAR,df$MONTH,"01")) 
    
    # or if you need it in MM-YYYY format:
    df$TIME <- format(lubridate::ymd(paste0(df$YEAR,df$MONTH,"01")), "%m-%Y")
    
  • 2021-01-26 23:32

    I would recommend handling this by going through bona fide R dates: use as.Date to generate an R date, then use format to render the string output you want. Something like this:

    df$TIME <- format(as.Date(paste0(df$MONTH, df$YEAR, "01"), format="%b%Y%d"), "%m-%Y")
    

    I arbitrarily assign the first of the month to each date in your data set, but this doesn't matter, because the call to format only includes the month and year.
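
A quick way to check that the placeholder day has no effect is to build the dates with two different days and compare the formatted results. This is a minimal sketch: the vectors months and years are made up for the example, and %b depends on the locale, so English month abbreviations are assumed.

```r
# Illustrative inputs standing in for df$MONTH and df$YEAR
months <- c("JAN", "FEB", "MAR")
years  <- c(2018, 2018, 2018)

# Same month and year, two different placeholder days
first_day <- as.Date(paste0(months, years, "01"), format = "%b%Y%d")
mid_month <- as.Date(paste0(months, years, "15"), format = "%b%Y%d")

# The underlying dates differ, but the MM-YYYY strings do not
same_strings <- identical(format(first_day, "%m-%Y"),
                          format(mid_month, "%m-%Y"))
```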

  • 2021-01-26 23:42

    In base R you could do something like:

    transform(df,TIME = paste(sprintf('%02d',match(MONTH,toupper(month.abb))),YEAR,sep = '-'))[c(1,5,4)]
           STATE    TIME VALUE
    1 California 01-2018   800
    2 California 02-2018   780
    3 California 03-2018   600
    4        ...  NA-...   ...
    5   Minesota 01-2018   800
    6   Minesota 02-2018   780
    7   Minesota 03-2018   600
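
The one-liner above can be unpacked step by step. This is a sketch with stand-alone vectors (MONTH and YEAR here are illustrative, not the question's full data). Because month.abb is a built-in constant holding English abbreviations, this approach does not depend on the locale.

```r
MONTH <- c("JAN", "FEB", "MAR")
YEAR  <- c(2018, 2018, 2018)

# month.abb is "Jan", "Feb", ..., "Dec"; upper-case it to match the data.
# The position of each abbreviation in that vector is the month number.
month_num <- match(MONTH, toupper(month.abb))

# Zero-pad to two digits and join with the year
TIME <- paste(sprintf("%02d", month_num), YEAR, sep = "-")

# A value that is not a month abbreviation has no match and yields NA
unmatched <- match("...", toupper(month.abb))
```

An unmatched MONTH value comes out as NA, which is presumably where the NA-... row in the output above comes from.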
    