How to find the statistical mode?

前端未结

关注

 30  1766

In R, mean() and median() are standard functions which do what you\'d expect. mode() tells you the internal storage mode of the objec

相关标签:

30条回答

轮回少年

2020-11-21 07:39
There are multiple solutions provided for this one. I checked the first one and after that wrote my own. Posting it here if it helps anyone:
```
Mode <- function(x){
  y <- data.frame(table(x))
  y[y$Freq == max(y$Freq),1]
}
```
Lets test it with a few example. I am taking the iris data set. Lets test with numeric data
```
> Mode(iris$Sepal.Length)
[1] 5
```
which you can verify is correct.

Now the only non numeric field in the iris dataset(Species) does not have a mode. Let's test with our own example
```
> test <- c("red","red","green","blue","red")
> Mode(test)
[1] red
```
EDIT

As mentioned in the comments, user might want to preserve the input type. In which case the mode function can be modified to:
```
Mode <- function(x){
  y <- data.frame(table(x))
  z <- y[y$Freq == max(y$Freq),1]
  as(as.character(z),class(x))
}
```
The last line of the function simply coerces the final mode value to the type of the original input.
0 讨论(0)
发布评论:

提交评论
- 加载中...

半阙折子戏

2020-11-21 07:41

I've written the following code in order to generate the mode.

MODE <- function(dataframe){
    DF <- as.data.frame(dataframe)

    MODE2 <- function(x){      
        if (is.numeric(x) == FALSE){
            df <- as.data.frame(table(x))  
            df <- df[order(df$Freq), ]         
            m <- max(df$Freq)        
            MODE1 <- as.vector(as.character(subset(df, Freq == m)[, 1]))

            if (sum(df$Freq)/length(df$Freq)==1){
                warning("No Mode: Frequency of all values is 1", call. = FALSE)
            }else{
                return(MODE1)
            }

        }else{ 
            df <- as.data.frame(table(x))  
            df <- df[order(df$Freq), ]         
            m <- max(df$Freq)        
            MODE1 <- as.vector(as.numeric(as.character(subset(df, Freq == m)[, 1])))

            if (sum(df$Freq)/length(df$Freq)==1){
                warning("No Mode: Frequency of all values is 1", call. = FALSE)
            }else{
                return(MODE1)
            }
        }
    }

    return(as.vector(lapply(DF, MODE2)))
}

Let's try it:

MODE(mtcars)
MODE(CO2)
MODE(ToothGrowth)
MODE(InsectSprays)

0 讨论(0)

南旧

2020-11-21 07:41

I case your observations are classes from Real numbers and you expect that the mode to be 2.5 when your observations are 2, 2, 3, and 3 then you could estimate the mode with mode = l1 + i * (f1-f0) / (2f1 - f0 - f2) where l1..lower limit of most frequent class, f1..frequency of most frequent class, f0..frequency of classes before most frequent class, f2..frequency of classes after most frequent class and i..Class interval as given e.g. in 1, 2, 3:

#Small Example
x <- c(2,2,3,3) #Observations
i <- 1          #Class interval

z <- hist(x, breaks = seq(min(x)-1.5*i, max(x)+1.5*i, i), plot=F) #Calculate frequency of classes
mf <- which.max(z$counts)   #index of most frequent class
zc <- z$counts
z$breaks[mf] + i * (zc[mf] - zc[mf-1]) / (2*zc[mf] - zc[mf-1] - zc[mf+1])  #gives you the mode of 2.5


#Larger Example
set.seed(0)
i <- 5          #Class interval
x <- round(rnorm(100,mean=100,sd=10)/i)*i #Observations

z <- hist(x, breaks = seq(min(x)-1.5*i, max(x)+1.5*i, i), plot=F)
mf <- which.max(z$counts)
zc <- z$counts
z$breaks[mf] + i * (zc[mf] - zc[mf-1]) / (2*zc[mf] - zc[mf-1] - zc[mf+1])  #gives you the mode of 99.5

In case you want the most frequent level and you have more than one most frequent level you can get all of them e.g. with:

x <- c(2,2,3,5,5)
names(which(max(table(x))==table(x)))
#"2" "5"

0 讨论(0)

醉话见心

2020-11-21 07:41
Calculating Mode is mostly in case of factor variable then we can use
```
labels(table(HouseVotes84$V1)[as.numeric(labels(max(table(HouseVotes84$V1))))])
```
HouseVotes84 is dataset available in 'mlbench' package.

it will give max label value. it is easier to use by inbuilt functions itself without writing function.
0 讨论(0)
发布评论:

提交评论
- 加载中...

南方客

2020-11-21 07:43

Mode can't be useful in every situations. So the function should address this situation. Try the following function.

Mode <- function(v) {
  # checking unique numbers in the input
  uniqv <- unique(v)
  # frquency of most occured value in the input data
  m1 <- max(tabulate(match(v, uniqv)))
  n <- length(tabulate(match(v, uniqv)))
  # if all elements are same
  same_val_check <- all(diff(v) == 0)
  if(same_val_check == F){
    # frquency of second most occured value in the input data
    m2 <- sort(tabulate(match(v, uniqv)),partial=n-1)[n-1]
    if (m1 != m2) {
      # Returning the most repeated value
      mode <- uniqv[which.max(tabulate(match(v, uniqv)))]
    } else{
      mode <- "Two or more values have same frequency. So mode can't be calculated."
    }
  } else {
    # if all elements are same
    mode <- unique(v)
  }
  return(mode)
}

Output,

x1 <- c(1,2,3,3,3,4,5)
Mode(x1)
# [1] 3

x2 <- c(1,2,3,4,5)
Mode(x2)
# [1] "Two or more varibles have same frequency. So mode can't be calculated."

x3 <- c(1,1,2,3,3,4,5)
Mode(x3)
# [1] "Two or more values have same frequency. So mode can't be calculated."

0 讨论(0)

没有蜡笔的小新

2020-11-21 07:44
There is package modeest which provide estimators of the mode of univariate unimodal (and sometimes multimodal) data and values of the modes of usual probability distributions.
```
mySamples <- c(19, 4, 5, 7, 29, 19, 29, 13, 25, 19)

library(modeest)
mlv(mySamples, method = "mfv")

Mode (most likely value): 19 
Bickel's modal skewness: -0.1 
Call: mlv.default(x = mySamples, method = "mfv")
```
For more information see this page
0 讨论(0)
发布评论:

提交评论
- 加载中...

1 2 3 4 5 下一页

How to find the statistical mode?

EDIT