How to select the row with the maximum value in each group

北荒 2020-11-21 04:18

In a dataset with multiple observations for each subject, I want to take a subset containing only the row with the maximum data value for each subject. For example, with the following dataset:

16 Answers
  • 2020-11-21 05:03

    In base R you can use ave to get the maximum of pt per group, compare it with pt, and use the resulting logical vector to subset the data.frame.

    group[group$pt == ave(group$pt, group$Subject, FUN=max),]
    #  Subject pt Event
    #3       1  5     2
    #7       2 17     2
    #9       3  5     2
    

    Or do the comparison inside the function passed to ave.

    group[as.logical(ave(group$pt, group$Subject, FUN=function(x) x==max(x))),]
    #group[ave(group$pt, group$Subject, FUN=function(x) x==max(x))==1,] #Variant
    #  Subject pt Event
    #3       1  5     2
    #7       2 17     2
    #9       3  5     2
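
    One thing the comparison against the group maximum implies is that tied rows are all kept. A minimal sketch of that behaviour, assuming a made-up data frame `tied` (hypothetical, not from the question) in which subject 1 reaches its maximum twice:

    ```r
    # Hypothetical data with a tie: subject 1 reaches its maximum pt twice.
    tied <- data.frame(
      Subject = c(1, 1, 1, 2, 2),
      pt      = c(5, 3, 5, 2, 8),
      Event   = c(1, 1, 2, 1, 2)
    )

    # ave() repeats the group maximum on every row of the group, so the
    # comparison is TRUE for each row that ties for the maximum.
    all_max <- tied[tied$pt == ave(tied$pt, tied$Subject, FUN = max), ]
    print(all_max)

    # To keep exactly one row per subject, drop later duplicates of Subject.
    first_max <- all_max[!duplicated(all_max$Subject), ]
    print(first_max)
    ```

    Whether ties should be kept or collapsed depends on the data; the `duplicated` step is only one possible way to pick a single row.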
    
  • 2020-11-21 05:05

    A dplyr solution:

    library(dplyr)
    ID <- c(1,1,1,2,2,2,2,3,3)
    Value <- c(2,3,5,2,5,8,17,3,5)
    Event <- c(1,1,2,1,2,1,2,2,2)
    group <- data.frame(Subject=ID, pt=Value, Event=Event)
    
    group %>%
        group_by(Subject) %>%
        summarize(max.pt = max(pt))
    

    This yields the following data frame:

      Subject max.pt
    1       1      5
    2       2     17
    3       3      5
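
    Note that summarize returns only the grouping column and the maximum, so Event is dropped. If the whole row is wanted, a possible variant (a sketch, not part of the answer above) is to filter within each group instead; `max_rows` is just an illustrative name:

    ```r
    library(dplyr)

    group <- data.frame(
      Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
      pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
      Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2)
    )

    # Keep every column of the rows whose pt equals the group maximum.
    # Rows that tie for the maximum are all kept.
    max_rows <- group %>%
      group_by(Subject) %>%
      filter(pt == max(pt)) %>%
      ungroup()

    print(max_rows)
    ```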
    
  • 2020-11-21 05:08

    I wasn't sure what you wanted to do about the Event column, but if you want to keep that as well, how about

    isIDmax <- with(group, ave(pt, Subject, FUN=function(x) seq_along(x)==which.max(x)))==1
    group[isIDmax, ]
    
    #   Subject pt Event
    # 3       1  5     2
    # 7       2 17     2
    # 9       3  5     2
    

    Here we use ave to look at the "pt" column for each "Subject". Within each group, which.max gives the position of the first maximum, and seq_along(x)==which.max(x) flags that one row; ave returns the flags as 0/1, so comparing with 1 gives a logical vector we can use to subset the original data.frame.
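
    To make the intermediate step visible, here is a small sketch that prints what ave returns before the comparison. It assumes the `group` data frame defined in the dplyr answer above (columns Subject, pt, Event); `flags` is an illustrative name:

    ```r
    group <- data.frame(
      Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
      pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
      Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2)
    )

    # ave() writes FUN's result back into a numeric vector, so the logical
    # flags come out as 0/1; hence the later comparison with 1.
    flags <- ave(group$pt, group$Subject,
                 FUN = function(x) seq_along(x) == which.max(x))
    print(flags)

    # Only the first maximum in each group is flagged, so ties give one row.
    print(group[flags == 1, ])
    ```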

  • 2020-11-21 05:09

    If you want the biggest pt value for a subject, you could simply use:

    pt_max <- aggregate(pt ~ Subject, data = group, FUN = max)
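
    aggregate returns only Subject and the maximum pt. If the other columns are needed too, one option (a sketch under that assumption, with `full_rows` as an illustrative name) is to merge the result back onto the original data:

    ```r
    group <- data.frame(
      Subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
      pt      = c(2, 3, 5, 2, 5, 8, 17, 3, 5),
      Event   = c(1, 1, 2, 1, 2, 1, 2, 2, 2)
    )

    # Maximum pt per Subject.
    pt_max <- aggregate(pt ~ Subject, data = group, FUN = max)

    # merge() joins on the shared columns (Subject and pt), so only the
    # rows holding each subject's maximum survive, with Event attached.
    full_rows <- merge(group, pt_max)
    print(full_rows)
    ```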
    