“longer object length is not a multiple of shorter object length”

梦谈多话 2021-01-21 16:05

I have this dataset:

print(df)

  object    group
1 apple      A
2 banana     B
3 pear       A
4 robot      C

print(df2)

  object    group   
         


        
3 answers
  • 2021-01-21 16:42

    If you need a tidyverse option, we can use map_dbl

    purrr::map_dbl(df$object, ~ length(df2[df2$object == .,]$object))
    #[1] 3 1 0 3
    

    which can also be calculated with sum

    purrr::map_dbl(df$object, ~ sum(df2$object == .))
    

    So inside mutate we can write

    library(dplyr)
    library(purrr)

    df %>%
      mutate(reference = map_dbl(object, ~ sum(df2$object == .)))
    
    #  object group reference
    #1  apple     A         3
    #2 banana     B         1
    #3   pear     A         0
    #4  robot     C         3
    

    A similar base R option is sapply

    sapply(df$object, function(x) sum(df2$object == x))
    
    # apple banana   pear  robot 
    #     3      1      0      3 
    
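For anyone who wants to run the counting approach above end to end, here is a minimal base R sketch. The contents of df2 are not shown in full in the question, so the df2 rows below are hypothetical, invented only so that the counts come out as 3, 1, 0, 3 as in the printed results. vapply is swapped in for sapply purely to fix the return type; it is not what the answer itself uses.

```r
# df as printed in the question.
df <- data.frame(
  object = c("apple", "banana", "pear", "robot"),
  group  = c("A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

# HYPOTHETICAL df2: the real one is truncated in the question.
df2 <- data.frame(
  object = c("apple", "apple", "apple", "banana", "robot", "robot", "robot"),
  stringsAsFactors = FALSE
)

# For each object in df, count the rows of df2 with the same object.
# Each call compares df2$object against a single string, so no recycling
# between vectors of unequal length occurs.
df$reference <- vapply(
  df$object,
  function(x) sum(df2$object == x),
  integer(1),
  USE.NAMES = FALSE
)

print(df)
```

With this made-up df2, the reference column holds 3, 1, 0, 3.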
  • 2021-01-21 16:46

    We can do this in data.table

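    library(data.table)
    # setDT() converts df2 to a data.table by reference; joining on `object`
    # with by = .EACHI returns one count (.N) per row of df.
    reference <- setDT(df2)[df, .N, on = .(object), by = .EACHI]$N
    df$reference <- reference
    df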
    #   object group reference
    #1:  apple     A         3
    #2: banana     B         1
    #3:   pear     A         0
    #4:  robot     C         3
    
  • 2021-01-21 16:53

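    From my comment: dplyr functions operate on the whole column as a vector. Inside mutate, object is therefore the entire df$object column, so df2$object == object compares two vectors of different lengths, which is what triggers the warning. Try evaluating it row by row:

    library(dplyr)

    df %>%
      rowwise() %>%
      mutate(reference = length(df2[df2$object == object, ]$object)) %>%
      ungroup()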
    

    As you said, ungroup will be needed, unless you plan on doing further row-wise operations.
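The warning in the title can be reproduced without dplyr, which may make the cause easier to see. The sketch below uses two hypothetical character vectors (short and long are illustrative names, not objects from the question) with lengths 4 and 7. The warning text is captured with tryCatch; its wording can differ in a non-English R locale.

```r
# Hypothetical vectors standing in for df$object and df2$object.
short <- c("apple", "banana", "pear", "robot")              # length 4
long  <- c("apple", "apple", "apple", "banana",
           "robot", "robot", "robot")                       # length 7

# Whole-vector comparison: R recycles `short` to length 7 and warns,
# because 7 is not a multiple of 4. Capture the warning message.
res <- tryCatch(long == short, warning = function(w) conditionMessage(w))
print(res)

# One-element-at-a-time comparison: each call compares `long` with a
# single string, so lengths are compatible and no warning is raised.
counts <- vapply(short, function(x) sum(long == x), integer(1),
                 USE.NAMES = FALSE)
print(counts)
```

This is the same situation rowwise() avoids: with row-wise evaluation, object is a single value in each call rather than the full column.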
