select observation by group with max date SQL via R

梦毁少年i 2021-01-29 11:44

I have data in a structure similar to the one shown below. The example is in R code, but if you can just write the query without the R parts, that's fine too.

I have multiple groups and

2 answers
  •  鱼传尺愫
    2021-01-29 12:16

    It's not clear whether you want an R or an SQL solution, so here are both. First, I'm assuming your dates column is of class Date, as in

    Df$dates <- as.Date(Df$dates)
    

    SQL

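    Using the sqldf package, you basically have two simple solutions. Either explicitly select the columns together with the maximum of dates per group (this relies on SQLite, sqldf's default backend, taking the other selected columns from the row that holds the maximum)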

    sqldf('select max(dates) as dates, "group", value from Df group by "group"')
    #        dates group value
    # 1 2012-08-20     a     2
    # 2 2013-07-31     b     3
    

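    Or you can select all the columns, keeping each row whose dates equals the maximum of its own group (a plain IN over all group maxima would also keep a row whose date happens to be another group's maximum)

    sqldf('select * from Df where dates = (select max(dates) from Df as d2 where d2."group" = Df."group")')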
    #        dates group value
    # 1 2012-08-20     a     2
    # 2 2013-07-31     b     3
    

    R

    In R there are many possible solutions, for example with data.table

    library(data.table)
    setDT(Df)[, .SD[which.max(dates)], by = group]
    #    group      dates value
    # 1:     a 2012-08-20     2
    # 2:     b 2013-07-31     3
    

    Or with dplyr

    library(dplyr)
    Df %>%
      group_by(group) %>%
      filter(dates == max(dates))
    
    # Source: local data table [2 x 3]
    # Groups: group
    # 
    #        dates group value
    # 1 2012-08-20     a     2
    # 2 2013-07-31     b     3
    

    Or in base R

    do.call(rbind, by(Df, Df$group, function(x) x[which.max(x$dates), ]))
    #         dates group value
    # 1: 2012-08-20     a     2
    # 2: 2013-07-31     b     3
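
For readers without the question's data at hand, the same "latest row per group" idea can be sketched in base R only. The data frame below is hypothetical sample data, not the asker's, and the names `group_max` and `latest` are made up for illustration. Unlike `which.max()`, this version keeps all rows when several share the maximum date within a group.

```r
# Hypothetical sample data (an assumption, not the asker's data):
# two groups with two dated observations each
Df <- data.frame(
  dates = as.Date(c("2012-08-01", "2012-08-20", "2013-07-01", "2013-07-31")),
  group = c("a", "a", "b", "b"),
  value = c(1, 2, 5, 3),
  stringsAsFactors = FALSE
)

# ave() returns, for every row, the maximum date of that row's group.
# Dates are converted to numeric so ave() works on plain numbers.
group_max <- ave(as.numeric(Df$dates), Df$group, FUN = max)

# Keep the rows whose date equals their own group's maximum (ties are kept)
latest <- Df[as.numeric(Df$dates) == group_max, ]
print(latest)
```

Comparing against a per-row group maximum avoids sorting or splitting the data, which may be convenient when the row order of the result should follow the original data frame.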
    
