Does anyone know how to find the mode (most frequent value) across variables for a single case in R?
For example, if I had data on favorite type of fruit (x), asked nine t
The modeest package implements a number of estimators of the mode for unimodal univariate data.
This has a function `mfv` to return the most frequent value, or (as `?mfv` states) it is perhaps better to use `mlv(..., method = 'discrete')`.
library(modeest)
## assuming your data is in the data.frame dd
apply(dd[,2:6], 1,mfv)
[1] 5 7 4 2
## or
apply(dd[,2:6], 1,mlv, method = 'discrete')
[[1]]
Mode (most frequent value): 5
Bickel's modal skewness: -0.2
Call: mlv.integer(x = newX[, i], method = "discrete")
[[2]]
Mode (most frequent value): 7
Bickel's modal skewness: -0.4
Call: mlv.integer(x = newX[, i], method = "discrete")
[[3]]
Mode (most frequent value): 4
Bickel's modal skewness: -0.4
Call: mlv.integer(x = newX[, i], method = "discrete")
[[4]]
Mode (most frequent value): 2
Bickel's modal skewness: 0.4
Call: mlv.integer(x = newX[, i], method = "discrete")
Now, if you have ties for the most frequent value, then you need to think about what you want. Both `mfv` and `mlv.integer` will return all the values that tie for the most frequent (although the print method only shows a single value).
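To make the tie case concrete, here is a minimal base-R sketch that returns every value tied for most frequent. The helper name `all_modes` is hypothetical (it is not part of modeest), and it assumes the input values are numeric.

```r
# Hypothetical helper, not from modeest: return all values tied for most frequent.
all_modes <- function(vals) {
  counts <- table(vals)                                 # frequency of each distinct value
  tied <- names(counts)[as.vector(counts) == max(counts)]  # labels with the top count
  as.numeric(tied)                                      # assumes the values were numeric
}

all_modes(c(3, 4, 5, 4, 5))  # 4 and 5 each occur twice, so both are returned
all_modes(c(7, 4, 7, 4, 7))  # 7 occurs three times, so only 7 is returned
```

With the full set of tied values in hand, you can then decide how to break the tie (lowest, highest, mean, random, and so on).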
A solution that chooses the lowest value for ties is given by:
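modeStat = function(vals) {
  # table() orders values ascending and which.max() returns the first maximum,
  # so the lowest value wins a tie
  return(as.numeric(names(which.max(table(vals)))))
}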
modeStat(c(1,3,5,6,4,5))
This returns:
[1] 5
Using `mean` on ties, and returning a vector:
> x[-7]
## x v1 v2 v3 v4 v5
## 1 1 3 4 5 4 5
## 2 2 7 4 7 4 7
## 3 3 3 4 4 4 3
## 4 4 3 2 2 2 3
This is not quite the same data as in your question. The first row has been altered to introduce a tie.
require(functional)
apply(x[2:6], 1, Compose(table,
                         function(i) i==max(i),
                         which,
                         names,
                         as.numeric,
                         mean))
## [1] 4.5 7.0 4.0 2.0
Replace `mean` with whatever tie-breaking function you need.
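If you would rather avoid the extra package, the same idea can be sketched in base R with the tie-breaker passed as an argument. The names `row_mode` and `tie_fun` are hypothetical, and the data frame below simply retypes the v1 to v5 columns shown above.

```r
# Hypothetical helper: row-wise mode with a pluggable tie-breaking function.
row_mode <- function(df, tie_fun = mean) {
  apply(df, 1, function(row) {
    counts <- table(row)                                  # frequency of each value in the row
    tied <- as.numeric(names(counts)[as.vector(counts) == max(counts)])
    tie_fun(tied)                                         # collapse ties to a single value
  })
}

# Same values as columns v1..v5 of the example data above
dd <- data.frame(v1 = c(3, 7, 3, 3),
                 v2 = c(4, 4, 4, 2),
                 v3 = c(5, 7, 4, 2),
                 v4 = c(4, 4, 4, 2),
                 v5 = c(5, 7, 3, 3))

row_mode(dd)                 # 4.5 7.0 4.0 2.0 (mean of the tied values 4 and 5 in row 1)
row_mode(dd, tie_fun = min)  # 4 7 4 2 (lowest tied value wins)
```

Any function that reduces a numeric vector to one value (`min`, `max`, `median`, or a random pick) can be supplied as `tie_fun`.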