How to find the statistical mode?

时光取名叫无心 2020-11-21 07:00

In R, mean() and median() are standard functions which do what you'd expect. mode() tells you the internal storage mode of the objec

30 answers
  •  梦如初夏
    2020-11-21 07:58

    This is based on @Chris's function for calculating the mode or related metrics, but it uses Ken Williams's method to calculate frequencies. It also fixes the case of no modes at all (all elements equally frequent) and uses more readable method names.

    Mode <- function(x, method = "one", na.rm = FALSE) {
      x <- unlist(x)
      if (na.rm) {
        x <- x[!is.na(x)]
      }
    
      # Get unique values
      ux <- unique(x)
      n <- length(ux)
    
      # Get frequencies of all unique values
      frequencies <- tabulate(match(x, ux))
      modes <- frequencies == max(frequencies)
    
      # Determine number of modes
      nmodes <- sum(modes)
      nmodes <- ifelse(nmodes==n, 0L, nmodes)
    
      if (method %in% c("one", "mode", "") || is.na(method)) {
        # Return NA if not exactly one mode, else return the mode
        if (nmodes != 1) {
          return(NA)
        } else {
          return(ux[which(modes)])
        }
      } else if (method %in% c("n", "nmodes")) {
        # Return the number of modes
        return(nmodes)
      } else if (method %in% c("all", "modes")) {
        # Return NA if no modes exist, else return all modes
        if (nmodes > 0) {
          return(ux[which(modes)])
        } else {
          return(NA)
        }
      }
      warning("Method not recognised. Valid methods are 'one'/'mode' [default], 'n'/'nmodes' and 'all'/'modes'.")
    }
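
    To see how the counting step and the no-mode rule behave, here is a minimal sketch that isolates them. The helper names count_values and count_modes are illustrative and not part of the function above. Note that under this rule a constant vector is also reported as having no mode, because its single unique value makes nmodes equal to n.

    ```r
    # Count occurrences of each unique value, in order of first appearance.
    # (count_values is a hypothetical helper used only for illustration.)
    count_values <- function(x) {
      ux <- unique(x)
      list(values = ux, counts = tabulate(match(x, ux)))
    }

    # Number of modes, treating "every unique value is tied for the maximum"
    # as zero modes (mirrors the nmodes == n check in the function above).
    count_modes <- function(x) {
      cv <- count_values(x)
      nmodes <- sum(cv$counts == max(cv$counts))
      if (nmodes == length(cv$values)) 0L else nmodes
    }

    count_modes(c(1, 2, 2, 3))     # 1 (the mode is 2)
    count_modes(c(1, 1, 2, 2, 3))  # 2 (the modes are 1 and 2)
    count_modes(c(1, 2, 3))        # 0 (all values equally frequent)
    count_modes(c(5, 5, 5))        # 0 (constant vector: one unique value, so nmodes == n)
    ```

    If a constant vector should report its value as the mode, the tie check would need an extra condition such as requiring more than one unique value; that is a design choice rather than something the function above does.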
    

    Because it uses Ken's method to calculate frequencies, performance is also optimised. Using AkselA's post, I benchmarked some of the previous answers to show that my function comes close to Ken's in performance, with the conditionals for the various output options causing only minor overhead.
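
    As a rough way to check this kind of performance claim yourself, the sketch below times the tabulate(match()) counting approach against a table()-based alternative using only base R. The helper names mode_tabulate and mode_table, the sample data, and the repetition count are assumptions made for illustration, not the benchmark setup referred to above, and timings will vary by machine.

    ```r
    # Two ways to find the most frequent value; both helper names are hypothetical.
    mode_tabulate <- function(x) {
      ux <- unique(x)
      ux[which.max(tabulate(match(x, ux)))]
    }

    mode_table <- function(x) {
      tab <- table(x)
      names(tab)[which.max(tab)]  # table() keeps the values as character names
    }

    # Sample data with a guaranteed unique mode: 42 appears at least 10,000 times,
    # while no other value can appear more than a few hundred times.
    set.seed(1)
    x <- sample(c(sample(1:1000, 1e5, replace = TRUE), rep(42L, 1e4)))

    # Both approaches should agree on the mode before timing them.
    stopifnot(mode_tabulate(x) == 42L, mode_table(x) == "42")

    # Time 20 repetitions of each approach and print the elapsed times.
    print(system.time(for (i in 1:20) mode_tabulate(x)))
    print(system.time(for (i in 1:20) mode_table(x)))
    ```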
