How to find mode across variables/vectors within a data row in R

谎友^ 2021-01-20 09:55

Does anyone know how to find the mode (the most frequent value) across variables for a single case in R?

For example, if I had data on favorite type of fruit (x), asked nine t

3 answers
  • 2021-01-20 10:14

    The modeest package implements a number of estimators of the mode for unimodal univariate data.

    It has a function mfv that returns the most frequent value, or (as ?mfv states) it is perhaps better to use `mlv(..., method = 'discrete')`.

    library(modeest)
    
    
    ## assuming your data is in the data.frame dd
    
    apply(dd[,2:6], 1,mfv)
    [1] 5 7 4 2
    ## or
    apply(dd[,2:6], 1,mlv, method = 'discrete')
    [[1]]
    Mode (most frequent value): 5 
    Bickel's modal skewness: -0.2 
    Call: mlv.integer(x = newX[, i], method = "discrete") 
    
    [[2]]
    Mode (most frequent value): 7 
    Bickel's modal skewness: -0.4 
    Call: mlv.integer(x = newX[, i], method = "discrete") 
    
    [[3]]
    Mode (most frequent value): 4 
    Bickel's modal skewness: -0.4 
    Call: mlv.integer(x = newX[, i], method = "discrete") 
    
    [[4]]
    Mode (most frequent value): 2 
    Bickel's modal skewness: 0.4 
    Call: mlv.integer(x = newX[, i], method = "discrete") 
    

    Now, if there are ties for the most frequent value, you need to decide what you want.
    Both mfv and mlv.integer return all the values that tie for the most frequent (although the print method shows only a single value).
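
    To make the tie behaviour visible without relying on a print method, here is a minimal base R sketch. The helper name all_modes is hypothetical (it is not part of modeest), and the rows are copied from the tie example in another answer on this page rather than from the question's original data.

    ```r
    # Hypothetical helper: return every value that ties for most frequent.
    all_modes <- function(vals) {
      counts <- table(vals)
      as.numeric(names(counts)[counts == max(counts)])
    }

    # Rows copied from the tie example shown in another answer on this page.
    dd <- data.frame(x  = 1:4,
                     v1 = c(3, 7, 3, 3),
                     v2 = c(4, 4, 4, 2),
                     v3 = c(5, 7, 4, 2),
                     v4 = c(4, 4, 4, 2),
                     v5 = c(5, 7, 3, 3))

    # lapply always returns a list, even when no row has a tie.
    modes <- lapply(seq_len(nrow(dd)), function(i) all_modes(unlist(dd[i, 2:6])))

    # Rows where more than one value ties for the mode.
    has_tie <- lengths(modes) > 1
    ```

    With this data, only the first row should be flagged in has_tie, since 4 and 5 each appear twice there.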

  • 2021-01-20 10:22

    A solution that chooses the lowest value for ties is given by:

    modeStat = function(vals) {
      return(as.numeric(names(which.max(table(vals)))))
    }
    modeStat(c(1,3,5,6,4,5))
    

    This returns:

    [1] 5
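
    As a sketch of how this could be applied to each row, the block below repeats modeStat and runs it with apply. The data frame x is copied from the tie example in another answer on this page. The variant modeLabel is a hypothetical addition for non-numeric values such as fruit names, where as.numeric would give NA; the fruit values are made up for illustration.

    ```r
    # Same function as above: which.max() picks the first maximum, and
    # table() sorts its names, so ties resolve to the lowest value.
    modeStat <- function(vals) {
      as.numeric(names(which.max(table(vals))))
    }

    # Rows copied from the tie example in another answer on this page.
    x <- data.frame(x  = 1:4,
                    v1 = c(3, 7, 3, 3),
                    v2 = c(4, 4, 4, 2),
                    v3 = c(5, 7, 4, 2),
                    v4 = c(4, 4, 4, 2),
                    v5 = c(5, 7, 3, 3))

    # One mode per row; row 1 has a 4/5 tie, so the lower value is kept.
    row_modes <- apply(x[, 2:6], 1, modeStat)

    # Hypothetical variant for character data: skip the numeric conversion.
    modeLabel <- function(vals) {
      names(which.max(table(vals)))
    }
    fruit_mode <- modeLabel(c("apple", "banana", "apple"))
    ```

    For character data, ties are broken by the sort order of the labels rather than by numeric size.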
    
  • 2021-01-20 10:24

    Using mean on ties, and returning a vector:

    > x[-7]
    ##   x v1 v2 v3 v4 v5
    ## 1 1  3  4  5  4  5
    ## 2 2  7  4  7  4  7
    ## 3 3  3  4  4  4  3
    ## 4 4  3  2  2  2  3
    

    This is not quite the same data as in your question. The first row has been altered to introduce a tie.

    require(functional)
    apply(x[2:6], 1, Compose(table,
                             function(i) i==max(i),
                             which,
                             names,
                             as.numeric,
                             mean))
    
    ## [1] 4.5 7.0 4.0 2.0
    

    Replace mean with whatever tie-breaking function you need.
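
    If you would rather not depend on the functional package, the same idea can be sketched in base R with the tie-breaker passed as an argument. The name row_mode and its tie parameter are hypothetical, chosen only for this illustration.

    ```r
    # Hypothetical base R equivalent: find the tied modes, then reduce them
    # with a caller-supplied tie-breaking function (mean by default).
    row_mode <- function(vals, tie = mean) {
      counts <- table(vals)
      tied <- as.numeric(names(counts)[counts == max(counts)])
      tie(tied)
    }

    # Same data as printed above (the first row contains a 4/5 tie).
    x <- data.frame(x  = 1:4,
                    v1 = c(3, 7, 3, 3),
                    v2 = c(4, 4, 4, 2),
                    v3 = c(5, 7, 4, 2),
                    v4 = c(4, 4, 4, 2),
                    v5 = c(5, 7, 3, 3))

    by_mean <- apply(x[2:6], 1, row_mode)             # average of tied values
    by_min  <- apply(x[2:6], 1, row_mode, tie = min)  # lowest tied value
    by_max  <- apply(x[2:6], 1, row_mode, tie = max)  # highest tied value
    ```

    Only the first row differs between the three results, because it is the only row with a tie.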
