Does anybody know how to aggregate by NA in R?
Take the example below:
a <- matrix(1,5,2)
a[1:2,2] <- NA
a[3:5,2] <- 2
aggregate(a[,1]
The addNA solution of Rich doesn't require any substantial change to the aggregate syntax, so I think it's the best solution. I'll point out that another option, which produces output similar to table (and thus can be coerced into a data.frame structure similar to that of aggregate), is xtabs.
xtabs(a[, 1] ~ a[, 2], addNA = TRUE)
Gives:
  Group.1 x
1       2 3
2    <NA> 2
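For completeness, here is a minimal sketch of the addNA approach referred to above, using the matrix a from the question. I'm assuming the grouping column is simply wrapped as addNA(a[, 2]) inside the by list; the name res is only for illustration.

```r
# Rebuild the example matrix from the question.
a <- matrix(1, 5, 2)
a[1:2, 2] <- NA
a[3:5, 2] <- 2

# addNA() converts the grouping column to a factor in which NA is an
# explicit level, so aggregate() keeps those rows as a group of their own
# instead of silently dropping them.
res <- aggregate(a[, 1], by = list(addNA(a[, 2])), FUN = sum)
print(res)
```

Note that Group.1 comes back as a factor here, not as a numeric column, so convert it if you need to do arithmetic on the group labels afterwards.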
Another "trick" is to assign a missing-value code to these data. We all like R's NA output, but assigning an explicit missing code to a grouping variable is a good coding exercise. Choose the code so that it has one more digit than the largest absolute value in the dataset and is of the form -999...99.
codemiss <- function(x) -(10^(floor(log10(max(abs(x), na.rm = TRUE))) + 2) - 1)
works in general.
Then recode the missing values:
a[, 2][is.na(a[, 2])] <- codemiss(a[, 2])
And:
aggregate(a[, 1], list(a[, 2]), sum)
Gives you:
  Group.1 x
1     -99 2
2       2 3
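If you would rather not overwrite the original data, the same idea can be sketched on a copy of the grouping column, with the code mapped back to NA after aggregating. The names miss_code, grp, code and res are hypothetical, and the helper assumes the column has at least one non-missing, nonzero value.

```r
# Rebuild the example matrix from the question.
a <- matrix(1, 5, 2)
a[1:2, 2] <- NA
a[3:5, 2] <- 2

# Hypothetical helper: returns -(10^k - 1), where k is one more than the
# number of integer digits in max(|x|). Assumes max(|x|) is nonzero.
miss_code <- function(x) {
  digits <- floor(log10(max(abs(x), na.rm = TRUE))) + 1
  -(10^(digits + 1) - 1)
}

grp <- a[, 2]                    # work on a copy, leave `a` untouched
code <- miss_code(grp)
stopifnot(!(code %in% grp))      # the code must not collide with real data
grp[is.na(grp)] <- code

res <- aggregate(a[, 1], by = list(grp), FUN = sum)

# Turn the sentinel back into NA so the result reads like the addNA output.
res$Group.1[res$Group.1 == code] <- NA
print(res)
```

The collision check is cheap insurance: if the data ever contained the code value itself, two different groups would be merged without warning.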