I have a dataframe consisting of an ID, which is the same for each element in a group, two datetimes, and the time interval between the two. One of the datetime
Since you didn't provide any data, here is an example in base R with a sample data frame:
df <- data.frame(group=c("a", "b"), value=1:8)
## Order the data frame by the variable of interest
df <- df[order(df$value),]
## Aggregate
aggregate(df, list(df$group), FUN=head, 1)
EDIT: As Ananda suggests in his comment, the following call to aggregate is better:
aggregate(.~group, df, FUN=head, 1)
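Because head(x, 1) simply returns the first element it sees in each group, the sort order decides which row you get. Here is a minimal sketch of that idea, reusing the sample data frame from above; the names asc, desc, first_smallest and first_largest are only illustrative:

```r
# Same sample data frame as above (the two group labels are recycled)
df <- data.frame(group = c("a", "b"), value = 1:8)

# Ascending order: head(, 1) picks the smallest value in each group
asc <- df[order(df$value), ]
first_smallest <- aggregate(value ~ group, asc, FUN = head, 1)

# Descending order: head(, 1) picks the largest value in each group
desc <- df[order(df$value, decreasing = TRUE), ]
first_largest <- aggregate(value ~ group, desc, FUN = head, 1)

print(first_smallest)
print(first_largest)
```

So if you want the earliest datetime per group, sort ascending on that column before aggregating.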
If you prefer to use plyr, you can replace aggregate with ddply:
library(plyr)
ddply(df, "group", head, 1)
After reproducing the example data frame and testing, I found a way to get the desired result:
Order data by relevant columns (ID, Start)
ordered_data <- data[order(data$ID, data$Start),]
Find the first row for each new ID
final <- ordered_data[!duplicated(ordered_data$ID),]
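To make the two steps easier to try out, here is a small self-contained sketch. The column names ID and Start come from the code above; the End and Interval columns and all the values are made up for illustration, since the original data was not posted:

```r
# Hypothetical sample data: ID and Start as in the answer above;
# End, Interval and every value here are assumptions for illustration only.
data <- data.frame(
  ID = c(2, 1, 1, 2, 3),
  Start = as.POSIXct(c("2020-01-02 10:00:00", "2020-01-01 09:00:00",
                       "2020-01-01 08:00:00", "2020-01-02 07:30:00",
                       "2020-01-03 12:00:00"), tz = "UTC"),
  End = as.POSIXct(c("2020-01-02 11:00:00", "2020-01-01 09:30:00",
                     "2020-01-01 08:45:00", "2020-01-02 08:00:00",
                     "2020-01-03 12:10:00"), tz = "UTC")
)
data$Interval <- difftime(data$End, data$Start, units = "mins")

# Step 1: sort by ID, then by Start, so the earliest row of each ID comes first
ordered_data <- data[order(data$ID, data$Start), ]

# Step 2: duplicated() flags every repeat of an ID after its first occurrence,
# so negating it keeps exactly one row per ID (the earliest Start)
final <- ordered_data[!duplicated(ordered_data$ID), ]
print(final)
```

This keeps all columns of the selected row, which is convenient when you need the matching end time and interval as well.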