Here is a base R method that uses table to calculate a cross tab, max.col to find the mode per group, and rep together with rle to fill in the mode across groups.
# calculate a cross tab, frequencies by group
myTab <- table(df$a, df$b)
# repeat the mode for each group, as calculated by colnames(myTab)[max.col(myTab)]
# repeating by the number of times the group ID is observed
df$c <- rep(colnames(myTab)[max.col(myTab)], rle(df$a)$lengths)
df
  a b c
1 1 2 2
2 1 2 2
3 1 1 2
4 1 2 2
5 2 3 3
6 2 3 3
7 2 1 3
8 3 1 2
9 3 2 2
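Note that this assumes the data has been sorted by group. Also, the default of max.col is to break ties (multiple modes) at random. In the output above, group 3 is such a tie (1 and 2 each occur once), so c could just as well come out as 1 for that group. If you want the first or last value to be the mode, you can set this using the ties.method argument.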
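As a rough sketch of both caveats, the variant below sets ties.method = "first" so ties resolve deterministically, and looks the mode up by group name instead of relying on rep and rle, so the rows need not be sorted. The data frame is reconstructed from the printed output above, and the helper name modes is only illustrative; neither comes from the original question.

```r
# Example data reconstructed from the printed output (an assumption)
df <- data.frame(a = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
                 b = c(2, 2, 1, 2, 3, 3, 1, 1, 2))

# cross tab: one row per group in a, one column per value of b
myTab <- table(df$a, df$b)

# mode per group; ties.method = "first" picks the first column on a tie
modes <- colnames(myTab)[max.col(myTab, ties.method = "first")]

# name the modes by group, then look each row's group up by name,
# so the result does not depend on the row order of df
names(modes) <- rownames(myTab)
df$c <- unname(modes[as.character(df$a)])
```

As in the original approach, c is a character vector because it is taken from the column names of the table; wrap it in as.numeric if b is numeric and you need a numeric result.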