I have data in a structure similar to the one shown below. This is in R code, but if you can just write the query without the R stuff, that's fine too.
I have multiple groups and
I don't know R, but your SQL would be something like:
SELECT * FROM YourTable AS A
INNER JOIN (SELECT GROUPS, MAX(DATES) AS MAX_DATE FROM YourTable GROUP BY GROUPS) AS B
ON A.GROUPS = B.GROUPS AND B.MAX_DATE = A.DATES
This identifies the maximum date for each group (derived table B), then matches it against the records in the main table (table A), so only the latest row of each group is returned.
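To try the join without a database server, here is a minimal sketch that runs the same query against an in-memory SQLite table from Python. The sample rows and the VALUE column are made up for illustration (the question's real data isn't shown), and dates are stored as ISO-8601 text so that MAX() orders them correctly. GROUPS is double-quoted only as a precaution, since it is a keyword in some SQL dialects.

```python
import sqlite3

# Hypothetical sample data; the question's real table is not shown.
rows = [
    ("a", "2012-08-01", 1),
    ("a", "2012-08-20", 2),
    ("b", "2013-07-01", 4),
    ("b", "2013-07-31", 3),
]

conn = sqlite3.connect(":memory:")
# Dates are ISO-8601 strings, so text comparison matches date order.
conn.execute('CREATE TABLE YourTable ("GROUPS" TEXT, DATES TEXT, VALUE INTEGER)')
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)

query = """
SELECT A."GROUPS", A.DATES, A.VALUE
FROM YourTable AS A
INNER JOIN (
    -- derived table B: one row per group with its latest date
    SELECT "GROUPS", MAX(DATES) AS MAX_DATE
    FROM YourTable
    GROUP BY "GROUPS"
) AS B
ON A."GROUPS" = B."GROUPS" AND B.MAX_DATE = A.DATES
ORDER BY A."GROUPS"
"""

latest = conn.execute(query).fetchall()
print(latest)  # [('a', '2012-08-20', 2), ('b', '2013-07-31', 3)]
```

Because the join condition includes the group as well as the date, a date that happens to be the maximum of one group never pulls in rows from another group. If two rows in the same group share the maximum date, both are returned.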
It's not clear which solution you want, R or SQL, so here are both.
First, I'm assuming your dates column is of class Date, as in
Df$dates <- as.Date(Df$dates)
SQL
Using the sqldf package, you basically have two simple solutions. Either explicitly select the columns, taking the maximum of dates within each group
sqldf('select max(dates) as dates, "group", value from Df group by "group"')
# dates group value
# 1 2012-08-20 a 2
# 2 2013-07-31 b 3
Or you can select all the columns
sqldf('select * from Df where dates in (select max(dates) from Df group by "group")')
# dates group value
# 1 2012-08-20 a 2
# 2 2013-07-31 b 3
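One thing to watch with the second query: the IN subquery only checks that a row's date equals some group's maximum, not its own group's. The sketch below shows the difference using Python's built-in sqlite3 module (sqldf uses SQLite by default). The rows are hypothetical and chosen so that group b contains an older row whose date equals group a's maximum; the correlated subquery is one possible alternative, not the only one.

```python
import sqlite3

# Hypothetical rows: group b has an older row dated like group a's maximum.
rows = [
    ("2012-08-01", "a", 1),
    ("2012-08-20", "a", 2),
    ("2012-08-20", "b", 9),
    ("2013-07-31", "b", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Df (dates TEXT, "group" TEXT, value INTEGER)')
conn.executemany("INSERT INTO Df VALUES (?, ?, ?)", rows)

# Query from the answer: matches a date against ANY group's maximum.
in_query = """
SELECT * FROM Df
WHERE dates IN (SELECT MAX(dates) FROM Df GROUP BY "group")
ORDER BY "group", dates
"""

# Alternative: compare each row only with the maximum of its own group.
correlated_query = """
SELECT * FROM Df AS outer_df
WHERE dates = (
    SELECT MAX(inner_df.dates)
    FROM Df AS inner_df
    WHERE inner_df."group" = outer_df."group"
)
ORDER BY "group"
"""

in_rows = conn.execute(in_query).fetchall()
per_group_rows = conn.execute(correlated_query).fetchall()

print(in_rows)         # three rows: the extra ('2012-08-20', 'b', 9) slips in
print(per_group_rows)  # two rows: one latest row per group
```

With data like the example in the question, where no date is shared across groups, both queries return the same rows.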
R
In R there are many possible solutions
library(data.table)
setDT(Df)[, .SD[which.max(dates)], by = group]
# group dates value
# 1: a 2012-08-20 2
# 2: b 2013-07-31 3
Or
library(dplyr)
Df %>%
group_by(group) %>%
filter(dates == max(dates))
# Source: local data table [2 x 3]
# Groups: group
#
# dates group value
# 1 2012-08-20 a 2
# 2 2013-07-31 b 3
Or
do.call(rbind, by(Df, Df$group, function(x) x[which.max(x$dates), ]))
# dates group value
# 1: 2012-08-20 a 2
# 2: 2013-07-31 b 3