I have this dataset -
print(df)
object group
1 apple A
1 banana B
1 pear A
1 robot C
print(df2)
object group
If you need a tidyverse
option, we can use map_dbl
purrr::map_dbl(df$object, ~ length(df2[df2$object == .,]$object))
#[1] 3 1 0 3
which can be also calculated with sum
purrr::map_dbl(df$object, ~ sum(df2$object == .))
So in mutate
we can add
df %>%
mutate(reference = map_dbl(object, ~ sum(df2$object == .)))
# object group reference
#1 apple A 3
#2 banana B 1
#3 pear A 0
#4 robot C 3
The similar base R option is sapply
sapply(df$object, function(x) sum(df2$object == x))
# apple banana pear robot
# 3 1 0 3
We can do this in data.table
library(data.table)
reference <- setDT(df2)[df, .N, on = .(object), by = .EACHI]$N
df$reference <- reference
df
# object group reference
#1: apple A 3
#2: banana B 1
#3: pear A 0
#4: robot C 3
From my comment: dplyr functions work on the whole column taken as a vector. Try
df %>%
rowwise() %>%
mutate(reference = length(df2[df2$object == object,]$object))%>%
ungroup()
As you said, ungroup
will be needed, unless you plan on doing further row-wise operations.