Convert an integer such as “20160119” into separate “day”, “month” and “year” columns

走了就别回头了 2020-12-06 15:30

How can I split a column of integers that encode dates (in YYYYMMDD form) into separate day, month and year columns? For example:

       DATE PRCP
1: 19490101   25
2: 19490102    5
3: 19490118   18
4: 19490119  386
5: 19490202   38


        
5 answers
  • 2020-12-06 16:03

    First I would convert the DATE column to Date type using as.Date(), then build the new data.frame using calls to format():

    df <- data.frame(DATE = c(19490101, 19490102, 19490118, 19490119, 19490202),
                     PRCP = c(25, 5, 18, 386, 38))
    df$DATE <- as.Date(as.character(df$DATE), format = '%Y%m%d')
    data.frame(day   = as.integer(format(df$DATE, '%d')),
               month = as.integer(format(df$DATE, '%m')),
               year  = as.integer(format(df$DATE, '%Y')),
               PRCP  = df$PRCP)
    ##   day month year PRCP
    ## 1   1     1 1949   25
    ## 2   2     1 1949    5
    ## 3  18     1 1949   18
    ## 4  19     1 1949  386
    ## 5   2     2 1949   38
    
  • 2020-12-06 16:14

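    I would advise you to use the lubridate package. The code below uses data.table syntax, so df must be a data.table:

    library(data.table)
    library(lubridate)
    setDT(df)                  # converts a plain data.frame to a data.table in place
    df[, DATE := ymd(DATE)]    # parse the YYYYMMDD integers as dates
    df[, c("Day", "Month", "Year") := list(day(DATE), month(DATE), year(DATE))]
    df[, DATE := NULL]         # drop the original column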
    
  • 2020-12-06 16:27

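    We can use extract() from the tidyr package. The new columns are character strings; pass convert = TRUE if integers are needed: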

    library(tidyr)
    extract(df, DATE, into=c('YEAR', 'MONTH', 'DAY'), 
             '(.{4})(.{2})(.{2})', remove=FALSE)
    #       DATE YEAR MONTH DAY PRCP
    #1 19490101 1949    01  01   25
    #2 19490102 1949    01  02    5
    #3 19490118 1949    01  18   18
    #4 19490119 1949    01  19  386
    #5 19490202 1949    02  02   38
    
  • 2020-12-06 16:27

    Here's another way using regular expressions:

    df <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
    DATE PRCP
    19490101   25
    19490102    5
    19490118   18
    19490119  386
    19490202   38")
    dates <- as.character(df$DATE)
    res <- t(sapply(regmatches(dates, regexec("(\\d{4})(\\d{2})(\\d{2})", dates)), "[", -1))
    res <- structure(as.integer(res), .Dim=dim(res)) # make them integer values
    cbind(df, setNames(as.data.frame(res), c("Y", "M", "D"))) # combine with original data frame
    #       DATE PRCP    Y M  D
    # 1 19490101   25 1949 1  1
    # 2 19490102    5 1949 1  2
    # 3 19490118   18 1949 1 18
    # 4 19490119  386 1949 1 19
    # 5 19490202   38 1949 2  2
    
  • 2020-12-06 16:27

    Another option would be to use separate from the tidyr package:

    library(tidyr)
    separate(df, DATE, c('year','month','day'), sep = c(4,6), remove = FALSE)
    

    which results in:

           DATE year month day PRCP
    1: 19490101 1949    01  01   25
    2: 19490102 1949    01  02    5
    3: 19490118 1949    01  18   18
    4: 19490119 1949    01  19  386
    5: 19490202 1949    02  02   38
    

    Two options in base R:

    1) with substr, as suggested by @coffeinjunky in the comments:

    df$year <- substr(df$DATE,1,4)
    df$month <- substr(df$DATE,5,6)
    df$day <- substr(df$DATE,7,8)
    

    2) with as.Date and format:

    df$DATE <- as.Date(as.character(df$DATE),'%Y%m%d')
    df$year <- format(df$DATE, '%Y')
    df$month <- format(df$DATE, '%m')
    df$day <- format(df$DATE, '%d')
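
    Both base R options above produce character columns (for example "01" rather than 1). If numeric columns are needed, a minimal sketch, assuming DATE is still stored as a number in YYYYMMDD form, is to wrap substr() in as.integer(), or to use integer division and modulo. The second variant is not taken from the answers here, and the name parts is only illustrative:

    ```r
    # Sample data from the question; DATE is numeric in YYYYMMDD form
    df <- data.frame(DATE = c(19490101, 19490102, 19490118, 19490119, 19490202),
                     PRCP = c(25, 5, 18, 386, 38))

    # Variant A: substr() returns character, so convert explicitly
    df$year  <- as.integer(substr(df$DATE, 1, 4))
    df$month <- as.integer(substr(df$DATE, 5, 6))
    df$day   <- as.integer(substr(df$DATE, 7, 8))

    # Variant B (assumption: values are always 8-digit YYYYMMDD numbers):
    # peel off the parts with integer division and modulo
    parts <- data.frame(year  = df$DATE %/% 10000,
                        month = (df$DATE %/% 100) %% 100,
                        day   = df$DATE %% 100)

    # Both variants should agree on every row
    stopifnot(all(parts$year == df$year),
              all(parts$month == df$month),
              all(parts$day == df$day))
    ```

    Neither variant checks that the values are real calendar dates; the as.Date() route in option 2 is the one that would turn an impossible date into NA.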
    