Extract elements common in all column groups

I have an R dataset x as below:

  ID Month
1   1   Jan
2   3   Jan
3   4   Jan
4   6   Jan
5   6   Jan
6   9   Jan
7   2   Feb
8   4   Feb
9   6   Feb
10  8


                      
2 answers

北恋, 2020-11-28 16:41

First, split df$ID by Month to get one vector of IDs per month, then fold intersect over that list with Reduce to find the IDs common to every sub-group.

Reduce(intersect, split(df$ID, df$Month))
#[1] 4 6


If you want to subset the corresponding data.frame, do

df[df$ID %in% Reduce(intersect, split(df$ID, df$Month)),]
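
To make the intermediate steps visible, here is a minimal sketch on a small made-up data frame. The values below are hypothetical (the question's full dataset is not reproduced here), and the names ids_by_month, common_ids and common_rows are only illustrative.

```r
# Hypothetical toy data, not the dataset from the question
df <- data.frame(
  ID    = c(1, 4, 6, 6, 2, 4, 6, 4, 6, 9),
  Month = c("Jan", "Jan", "Jan", "Jan", "Feb", "Feb", "Feb", "Mar", "Mar", "Mar"),
  stringsAsFactors = FALSE
)

# One vector of IDs per month (a named list with elements Feb, Jan, Mar)
ids_by_month <- split(df$ID, df$Month)

# Fold intersect() pairwise over the list: first two months, then the third
common_ids <- Reduce(intersect, ids_by_month)

# Keep only the rows whose ID appears in every month
common_rows <- df[df$ID %in% common_ids, ]

print(common_ids)
print(common_rows)
```

In this toy data only IDs 4 and 6 occur in all three months, so common_ids holds those two values. Duplicated IDs within a month (ID 6 in Jan here) do not matter, because intersect drops duplicates.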

                                                                        
                                                        
            
            
              
                
野趣味, 2020-11-28 16:57

We can use data.table. Convert the data.frame to a data.table with setDT(df1), group by 'ID', and get the row index (.I) of the groups where the number of unique 'Month' values equals the number of unique 'Month' values in the whole dataset; then subset the data with that index.

library(data.table)
setDT(df1)[df1[, .I[uniqueN(Month) == uniqueN(df1$Month)], ID]$V1]
#    ID Month
# 1:  4   Jan
# 2:  4   Feb
# 3:  4   Mar
# 4:  4   Apr
# 5:  4   May
# 6:  4   Jun
# 7:  6   Jan
# 8:  6   Jan
# 9:  6   Feb
#10:  6   Mar
#11:  6   Apr
#12:  6   May
#13:  6   Jun


To extract only the IDs:

setDT(df1)[, ID[uniqueN(Month) == uniqueN(df1$Month)], ID]$V1
#[1] 4 6




Or with base R 

1) Using table with rowSums

v1 <- rowSums(table(df1) > 0)
names(v1)[v1==max(v1)]
#[1] "4" "6"


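These names can then be used to subset the data. Note that max(v1) equals the number of months only if at least one ID appears in every month; otherwise this returns the IDs that cover the most months.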

subset(df1, ID %in% names(v1)[v1 == max(v1)])
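
As a hedged variant of the table approach, the sketch below compares each ID's month count against the number of month columns instead of max(v1), so the result is empty when no ID covers every month. The data is a hypothetical toy example, and tbl, months_per_id and ids_all are illustrative names, not part of the answer above.

```r
# Hypothetical toy data, not the dataset from the question
df1 <- data.frame(
  ID    = c(1, 4, 6, 6, 2, 4, 6, 4, 6, 9),
  Month = c("Jan", "Jan", "Jan", "Jan", "Feb", "Feb", "Feb", "Mar", "Mar", "Mar"),
  stringsAsFactors = FALSE
)

# Contingency table: rows are IDs, columns are months, cells are counts
tbl <- table(df1$ID, df1$Month)

# Number of distinct months in which each ID appears at least once
months_per_id <- rowSums(tbl > 0)

# Keep the IDs whose count equals the total number of months
ids_all <- names(months_per_id)[months_per_id == ncol(tbl)]

# Subset the rows; ids_all is character, so compare as character
rows_all <- df1[as.character(df1$ID) %in% ids_all, ]

print(ids_all)
```

The IDs come back as character strings because they are taken from the table's row names; convert them with as.numeric if the original type is needed.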


2) Using tapply

lst <- with(df1, tapply(Month, ID, FUN = unique))
names(which(lengths(lst) == length(unique(df1$Month))))
#[1] "4" "6"




Or using dplyr

library(dplyr)
df1 %>%
     group_by(ID) %>%
     filter(n_distinct(Month)== n_distinct(df1$Month)) %>%
     .$ID %>%
     unique
#[1] 4 6


Or, if we need the rows:

df1 %>%
     group_by(ID) %>%
     filter(n_distinct(Month)== n_distinct(df1$Month))
# A tibble: 13 x 2
# Groups:   ID [2]
#      ID Month
#   <int> <chr>
# 1     4   Jan
# 2     6   Jan
# 3     6   Jan
# 4     4   Feb
# 5     6   Feb
# 6     4   Mar
# 7     6   Mar
# 8     4   Apr
# 9     6   Apr
#10     4   May
#11     6   May
#12     4   Jun
#13     6   Jun

                                                                        
                                                        
            
            
              
                