I have the following dataframe:
one <- c('one',NA,NA,NA,NA,'two',NA,NA)
group1 <- c('A','A','A','A','B','B','B','B')
group2 <- c(\'C\
Let's not forget that a lot of things can be done in base R, although sometimes not as efficiently as with data.table or dplyr:
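# flag the non-missing values first, so this works whether one is a factor or a character column
df$count <- ave(as.integer(!is.na(df$one)), df$group1, df$group2, FUN = sum)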
#   one group1 group2 count
#1  one      A      C     1
#2 <NA>      A      C     1
#3 <NA>      A      C     1
#4 <NA>      A      D     0
#5 <NA>      B      E     1
#6  two      B      E     1
#7 <NA>      B      F     0
#8 <NA>      B      F     0
library(dplyr)
df %>% group_by(group1, group2) %>% mutate(count = sum(!is.na(one)))
Source: local data frame [8 x 4]
Groups: group1, group2 [4]

     one group1 group2 count
  <fctr> <fctr> <fctr> <int>
1    one      A      C     1
2     NA      A      C     1
3     NA      A      C     1
4     NA      A      D     0
5     NA      B      E     1
6    two      B      E     1
7     NA      B      F     0
8     NA      B      F     0
With data.table:
library(data.table)
setDT(df)  # convert df to a data.table by reference
df[, count_B := sum(!is.na(one)), by = c("group1", "group2")]
gives:
   one group1 group2 count_B
1: one      A      C       1
2:  NA      A      C       1
3:  NA      A      C       1
4:  NA      A      D       0
5:  NA      B      E       1
6: two      B      E       1
7:  NA      B      F       0
8:  NA      B      F       0
The idea is to sum the TRUE values (each counts as 1 once converted to integer) where one is not NA, while grouping by group1 and group2.
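To make the "sum of TRUE values" idea concrete, here is a minimal base R sketch. The data frame is an assumption: it is rebuilt from the printed output, since the original definition is cut off, and the names not_missing and total are only illustrative.

```r
# Example data reconstructed from the printed output (an assumption,
# because the original definition of group2 is truncated).
df <- data.frame(
  one    = c("one", NA, NA, NA, NA, "two", NA, NA),
  group1 = c("A", "A", "A", "A", "B", "B", "B", "B"),
  group2 = c("C", "C", "C", "D", "E", "E", "F", "F"),
  stringsAsFactors = FALSE
)

# !is.na() returns a logical vector; sum() treats TRUE as 1 and FALSE as 0,
# so the sum is simply the number of non-missing values.
not_missing <- !is.na(df$one)
total <- sum(not_missing)  # 2 non-missing values overall

# Same idea per (group1, group2) combination: ave() applies sum() within
# each group and returns one value per original row.
df$count <- ave(as.integer(not_missing), df$group1, df$group2, FUN = sum)
df$count  # 1 1 1 0 1 1 0 0
```

The dplyr and data.table versions should do the same thing: the grouping is handled by group_by() or by=, and sum(!is.na(one)) is evaluated once per group.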