Earliest Date for each id in R

余生分开走 2020-12-31 18:54

I have a dataset where each individual (id) has an e_date, and since each individual could have more than one e_date, I'm trying to get the earli

4 Answers
  • 2020-12-31 19:14

    We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(data_full), order the rows by 'e_date', and then, grouped by 'id', take the first row of each group (head(.SD, 1L)).

    library(data.table)
    setDT(data_full)[order(e_date), head(.SD, 1L), by = id]
    

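    Or using dplyr: arrange by 'e_date' (assuming it is of Date class), group by 'id', and get the first row of each group with slice. arrange ignores the grouping by default, so sorting before group_by gives the same result and makes the order of operations explicit.

    library(dplyr)
    data_full %>%
        arrange(e_date) %>%
        group_by(id) %>%
        slice(1L)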
    

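    If we need a base R option, ave can be used to flag the earliest row within each 'id'. ave returns a vector of the same type as its first argument, so the date is passed as a number and the 0/1 result is compared with 1 to get a logical row index.

    data_full[with(data_full, ave(as.numeric(e_date), id, FUN = function(x) rank(x, ties.method = "first") == 1) == 1), ]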
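Another base R route, shown here only as a minimal sketch, is to sort by id and date and then drop every repeated id with duplicated. The sample values are borrowed from the sqldf answer on this page, and converting e_date with as.Date is an assumption about how the column is stored.

```r
# Sample data (values taken from the sqldf answer); e_date is assumed to be a Date.
data_full <- data.frame(
  id = c("789", "123", "456", "123", "123", "456", "789"),
  e_date = as.Date(c("2016-05-01", "2016-07-02", "2016-08-25", "2015-12-11",
                     "2014-03-01", "2015-07-08", "2015-12-11"))
)

# Sort so that the earliest e_date comes first within each id.
sorted <- data_full[order(data_full$id, data_full$e_date), ]

# duplicated() is FALSE for the first occurrence of each id,
# which after sorting is the row with the earliest date.
earliest <- sorted[!duplicated(sorted$id), ]
print(earliest)
```

This keeps every column of the original row and returns exactly one row per id, even when two rows share the same earliest date.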
    
  • 2020-12-31 19:25

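    Another answer that uses dplyr's filter function. Here the data frame is called dta and the date column date; substitute your own names (for example e_date).

    library(dplyr)
    dta %>% 
      group_by(id) %>%
      filter(date == min(date))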
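One thing to be aware of with the filter approach is that it keeps every row tied for the earliest date. The sketch below compares it with slice_min and with_ties = FALSE, which returns a single row per id; it assumes dplyr 1.0.0 or later, and the events data frame is made up for illustration.

```r
library(dplyr)

# Hypothetical data: id 1 has two rows sharing the same earliest date.
events <- data.frame(
  id = c(1, 1, 1, 2),
  date = as.Date(c("2020-01-01", "2020-01-01", "2020-02-01", "2020-03-01"))
)

# filter() keeps both tied rows for id 1.
all_ties <- events %>%
  group_by(id) %>%
  filter(date == min(date)) %>%
  ungroup()

# slice_min() with with_ties = FALSE keeps exactly one row per id.
one_per_id <- events %>%
  group_by(id) %>%
  slice_min(date, n = 1, with_ties = FALSE) %>%
  ungroup()

print(nrow(all_ties))
print(nrow(one_per_id))
```

Which behaviour is preferable depends on whether duplicate earliest dates are meaningful in the data.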
    
  • 2020-12-31 19:28

    I made a reproducible example, supposing that you grouped some dates by which quarter they were in.

    library(lubridate)
    library(dplyr)
    rand_weeks <- now() + weeks(sample(100))
    which_quarter <- quarter(rand_weeks)
    df <- data.frame(rand_weeks, which_quarter)
    
    df %>%
      group_by(which_quarter) %>% summarise(sort(rand_weeks)[1])
    
    # A tibble: 4 x 2
      which_quarter sort(rand_weeks)[1]
              <dbl>              <time>
    1             1 2017-01-05 05:46:32
    2             2 2017-04-06 05:46:32
    3             3 2016-08-18 05:46:32
    4             4 2016-10-06 05:46:32
    
  • 2020-12-31 19:28

    You may use library(sqldf) to get the minimum date as follows:

    data1<-data.frame(id=c("789","123","456","123","123","456","789"),
                      e_date=c("2016-05-01","2016-07-02","2016-08-25","2015-12-11","2014-03-01","2015-07-08","2015-12-11"))  
    
    library(sqldf)
    data2 = sqldf("SELECT id,
                        min(e_date) as 'earliest_date'
                        FROM data1 GROUP BY 1", method = "name__class")    
    
    head(data2)   
    

    id earliest_date
    123 2014-03-01
    456 2015-07-08
    789 2015-12-11
