Faster %in% operator

太阳男子 2021-02-07 00:55

The fastmatch package implements a much faster version of match for repeated matches (e.g. in a loop):

set.seed(1)
library(fastmatch)
table <- 1L         
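
As a rough sketch of the repeated-match pattern described above (the vectors `lookup` and `x` are made-up illustrations, not the data from the question), `fmatch` and `%fin%` from fastmatch can stand in for `match` and `%in%`:

```r
library(fastmatch)

set.seed(1)
# Hypothetical data, chosen only for illustration.
lookup <- sample(100000L)                     # the vector we search in
x <- sample(200000L, 10, replace = TRUE)      # the values to look up

# The first call builds a hash table and caches it on `lookup`;
# later calls against the same object can reuse it.
pos <- fmatch(x, lookup)    # positions, NA where there is no match
hit <- x %fin% lookup       # logical membership, like %in%

# Both results agree with their base R counterparts.
stopifnot(identical(pos, match(x, lookup)))
stopifnot(identical(hit, x %in% lookup))
```

The speed-up mainly shows when the same lookup vector is matched against many times, since the hash only has to be built once.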


        
2 answers
  •  悲&欢浪女
    2021-02-07 01:35

    Matching is almost always better done by putting both vectors into data frames and joining them (see the various join functions in dplyr).

    For example, something like this would give you all the info you need:

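    library(dplyr)
    
    # tibble() replaces the deprecated data_frame(); it only recycles
    # length-1 vectors, so the extra column is expanded explicitly
    data <- tibble(data.ID = 1L:100000L,
                   data.extra = rep(1:2, length.out = 100000L))
    
    # draw 10000 rows with replacement and tag each draw
    sample <- 
      data %>% 
      slice_sample(n = 10000, replace = TRUE) %>%
      mutate(sample.ID = row_number(),
             sample.extra = rep(3:4, length.out = n()))
    
    # join table not strictly necessary in this case
    # but necessary in many-to-many matches
    data__sample <- inner_join(data, sample, by = c("data.ID", "data.extra"))
    
    # check whether a data.ID made it into sample (zero rows means it did not)
    data__sample %>% filter(data.ID == 1)
    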

    Alternatively, use left_join, right_join, full_join, semi_join, or anti_join, depending on which information is most useful to you.
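
    For a plain membership test, which is closest to what %in% does, the filtering joins may be enough. A minimal sketch, using small made-up tables (`data` and `picked` here are hypothetical, not the objects built above):

    ```r
    library(dplyr)

    # Hypothetical toy tables; `picked` contains a repeated ID on purpose.
    data   <- tibble(data.ID = 1:5)
    picked <- tibble(data.ID = c(2L, 4L, 4L))

    # Rows of `data` that have at least one match in `picked`.
    # Unlike inner_join, semi_join never duplicates rows of `data`.
    in_sample <- semi_join(data, picked, by = "data.ID")

    # Rows of `data` with no match in `picked`.
    not_in_sample <- anti_join(data, picked, by = "data.ID")
    ```

    Here in_sample keeps IDs 2 and 4 once each, and not_in_sample keeps IDs 1, 3 and 5.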
