select observation by group with max date SQL via R

梦毁少年i 2021-01-29 11:44

I have data in a structure similar to the one shown below. This is in R code, but if you can just write the query without the R stuff, that's fine too.

I have multiple groups and

2 Answers
  • 2021-01-29 12:04

    I don't know R but your SQL would be something like:

    SELECT * FROM YourTable as A 
    INNER JOIN (SELECT GROUPS, MAX(DATES) AS MAX_DATE FROM YourTable GROUP BY GROUPS) AS B 
    ON A.GROUPS = B.GROUPS AND B.MAX_DATE = A.DATES
    

    This would identify the max date for each group (derived table B) then match them with the records from the main table (table A).
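
    For readers on the R side, the same two steps (derive the maximum per group, then join back) can be sketched in base R. This is only an illustration under assumptions: the data frame below is made up, with just the per-group maxima chosen to match the output quoted elsewhere in this thread, and the names `Df`, `B` and `latest` are hypothetical.

    ```r
    # Hypothetical sample data; the original data frame is not shown in the question.
    Df <- data.frame(
      dates = as.Date(c("2012-08-01", "2012-08-20", "2013-07-01", "2013-07-31")),
      group = c("a", "a", "b", "b"),
      value = c(1, 2, 5, 3),
      stringsAsFactors = FALSE
    )

    # Step 1 (derived table B): one row per group holding that group's latest date.
    B <- do.call(rbind, lapply(split(Df, Df$group), function(g) {
      data.frame(group = g$group[1], dates = max(g$dates), stringsAsFactors = FALSE)
    }))

    # Step 2 (inner join): keep the rows of Df whose (group, dates) pair appears in B.
    latest <- merge(Df, B, by = c("group", "dates"))
    print(latest)
    ```

    Joining on both the group and the date, as the SQL above does, means a date that happens to be the maximum of a different group cannot produce a false match.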

  • 2021-01-29 12:16

    It's not clear whether you want an R or an SQL solution, so here are both. First, I'm assuming your dates column is of class Date, as in

    Df$dates <- as.Date(Df$dates)
    

    SQL

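    Using the sqldf package, you basically have two simple solutions. Either select the columns explicitly alongside max(dates), which relies on SQLite (sqldf's default backend) taking the other columns from the row that holds the maximum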

    sqldf('select max(dates) as dates, "group", value from Df group by "group"')
    #        dates group value
    # 1 2012-08-20     a     2
    # 2 2013-07-31     b     3
    

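    Or you can select all the columns (note that this matches on the date alone, so it assumes one group's maximum date never also occurs in another group)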

    sqldf('select * from Df where dates in (select max(dates) from Df group by "group")')
    #        dates group value
    # 1 2012-08-20     a     2
    # 2 2013-07-31     b     3
    

    R

    In R there are many possible solutions, for example

    library(data.table)
    setDT(Df)[, .SD[which.max(dates)], by = group]
    #    group      dates value
    # 1:     a 2012-08-20     2
    # 2:     b 2013-07-31     3
    

    Or

    library(dplyr)
    Df %>%
      group_by(group) %>%
      filter(dates == max(dates))
    
    # Source: local data table [2 x 3]
    # Groups: group
    # 
    #        dates group value
    # 1 2012-08-20     a     2
    # 2 2013-07-31     b     3
    

    Or

    do.call(rbind, by(Df, Df$group, function(x) x[which.max(x$dates), ]))
    #         dates group value
    # 1: 2012-08-20     a     2
    # 2: 2013-07-31     b     3
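
    One difference between the R variants above is worth a sketch: `which.max()` returns only the first maximum, so the data.table and `by()` versions keep one row per group, while the `dates == max(dates)` filter keeps every row tied for the latest date. The base R example below illustrates this on made-up data with a tie in group "a"; `Df_ties`, `first_only` and `all_ties` are hypothetical names, not part of the original question.

    ```r
    # Hypothetical data: group "a" has two rows sharing its latest date.
    Df_ties <- data.frame(
      dates = as.Date(c("2012-08-20", "2012-08-20", "2013-07-01", "2013-07-31")),
      group = c("a", "a", "b", "b"),
      value = c(1, 2, 5, 3),
      stringsAsFactors = FALSE
    )

    # which.max() picks the first maximum, so exactly one row per group survives.
    first_only <- do.call(rbind, lapply(split(Df_ties, Df_ties$group),
                                        function(x) x[which.max(x$dates), ]))

    # Comparing each date with its group's maximum keeps all tied rows.
    # The dates are compared as plain numbers (days since 1970-01-01).
    day_number <- as.numeric(Df_ties$dates)
    group_max  <- ave(day_number, Df_ties$group, FUN = max)
    all_ties   <- Df_ties[day_number == group_max, ]

    print(first_only)
    print(all_ties)
    ```

    Which behaviour is right depends on whether tied rows should all be reported or collapsed to a single row per group.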
    