I would like to calculate the most frequent factor level by category with plyr using the code below. The data frame b
shows the requested result. Why does
You have pretty much exclusively used existing function names in your example: levels, cat, and mode. Generally, that doesn't create much of a problem; for example, calling a data.frame "df" doesn't break R's df()
function. But it almost always leads to more ambiguous or confusing code, and in this case, it made things "break". Arun's answer does a great job of showing why.
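To illustrate the df point, here is a small sketch in base R (the data frame contents are made up for illustration). When R evaluates a call like df(...), it looks for a function of that name and skips bindings that are not functions, so a data frame named df does not hide stats::df().

```r
# A data frame that happens to share its name with stats::df() (the F density).
df <- data.frame(x = 1:3)

# In a call, R looks for a *function* named df, so the data frame is skipped.
f_density <- df(1, df1 = 2, df2 = 3)

is.data.frame(df)      # the variable is still the data frame
is.numeric(f_density)  # and the call still reached stats::df()
```

With two functions of the same name, which one runs depends on where the call is evaluated, and that is the situation with mode here.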
You can easily fix your problem by renaming your "mode" function. In the example below, I've simplified it a little bit in addition to renaming it, and it works as you expected.
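library(plyr)

# Capitalised so it no longer shares a name with base::mode()
Mode <- function(x) names(which.max(table(x)))
# a is the data frame from the question
ddply(a, .(cat), summarise,
      mlevels = Mode(levels))
#   cat mlevels
# 1   1       6
# 2   2       5
# 3   3       9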
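As a quick check outside of ddply, here is the renamed function on made-up vectors (x and ties are hypothetical, not from the question). Note that it returns a character string, and that on ties which.max picks the first maximum, i.e. the first level in table's sorted order.

```r
Mode <- function(x) names(which.max(table(x)))

x <- c(3, 5, 5, 9, 5, 3)       # hypothetical data: 5 occurs most often
Mode(x)
# [1] "5"

ties <- c("b", "a", "b", "a")  # both levels occur twice
Mode(ties)
# [1] "a"
```

If you need the result as a number, wrapping the call in as.numeric() should do it for numeric inputs like x.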
Of course, there's a really cumbersome workaround: use get and specify where to search for the function.
> mode <- function(x) names(table(x))[which.max(table(x))]
> ddply(a, .(cat), summarise, mlevels = get("mode", ".GlobalEnv")(levels))
  cat mlevels
1   1       6
2   2       5
3   3       9
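For what it's worth, the same lookup can be tried without plyr. This is only a sketch: it assumes the code is run at top level, so that mode lands in the global environment, and x is made-up data.

```r
# User-defined function that shares its name with base::mode()
mode <- function(x) names(table(x))[which.max(table(x))]

x <- c(3, 5, 5, 9)  # hypothetical data

# The built-in, addressed explicitly: reports the storage mode of x
builtin <- base::mode(x)
builtin
# [1] "numeric"

# The user-defined version, fetched explicitly from the global environment
mine <- get("mode", envir = globalenv())(x)
mine
# [1] "5"
```

Either way you have to spell out which mode you mean at every call, which is why renaming the function is the less error-prone fix.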