Unique combination of all elements from two (or more) vectors

后端未结

关注

 5  1683

I am trying to create a unique combination of all elements from two vectors of different size in R.

For example, the first vector is

a <- c(\"ABC\


                      
              相关标签:


      
      
        
          5条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  情书的邮戳        
                
              
                            
                2020-11-22 03:18
              
            
            
                                                                       
you can use order function for sorting any number of columns. for your example 

df <- expand.grid(a,b)
> df
   Var1       Var2
1   ABC 2012-05-01
2   DEF 2012-05-01
3   GHI 2012-05-01
4   ABC 2012-05-02
5   DEF 2012-05-02
6   GHI 2012-05-02
7   ABC 2012-05-03
8   DEF 2012-05-03
9   GHI 2012-05-03
10  ABC 2012-05-04
11  DEF 2012-05-04
12  GHI 2012-05-04
13  ABC 2012-05-05
14  DEF 2012-05-05
15  GHI 2012-05-05

> df[order( df[,1], df[,2] ),] 
   Var1       Var2
1   ABC 2012-05-01
4   ABC 2012-05-02
7   ABC 2012-05-03
10  ABC 2012-05-04
13  ABC 2012-05-05
2   DEF 2012-05-01
5   DEF 2012-05-02
8   DEF 2012-05-03
11  DEF 2012-05-04
14  DEF 2012-05-05
3   GHI 2012-05-01
6   GHI 2012-05-02
9   GHI 2012-05-03
12  GHI 2012-05-04
15  GHI 2012-05-05`

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  渐次进展        
                
              
                            
                2020-11-22 03:22
              
            
            
                                                                       
The tidyr package provides the nice alternative crossing, which works better than the classic expand.grid function because (1) strings are not converted into factors and (2) the sorting is more intuitive:

library(tidyr)

a <- c("ABC", "DEF", "GHI")
b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")

crossing(a, b)

# A tibble: 15 x 2
       a          b
   <chr>      <chr>
 1   ABC 2012-05-01
 2   ABC 2012-05-02
 3   ABC 2012-05-03
 4   ABC 2012-05-04
 5   ABC 2012-05-05
 6   DEF 2012-05-01
 7   DEF 2012-05-02
 8   DEF 2012-05-03
 9   DEF 2012-05-04
10   DEF 2012-05-05
11   GHI 2012-05-01
12   GHI 2012-05-02
13   GHI 2012-05-03
14   GHI 2012-05-04
15   GHI 2012-05-05

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  时光说笑        
                
              
                            
                2020-11-22 03:27
              
            
            
                                                                       
this maybe what you are after

> expand.grid(a,b)
   Var1       Var2
1   ABC 2012-05-01
2   DEF 2012-05-01
3   GHI 2012-05-01
4   ABC 2012-05-02
5   DEF 2012-05-02
6   GHI 2012-05-02
7   ABC 2012-05-03
8   DEF 2012-05-03
9   GHI 2012-05-03
10  ABC 2012-05-04
11  DEF 2012-05-04
12  GHI 2012-05-04
13  ABC 2012-05-05
14  DEF 2012-05-05
15  GHI 2012-05-05


If the resulting order isn't what you want, you can sort afterwards. If you name the arguments to expand.grid, they will become column names:

df = expand.grid(a = a, b = b)
df[order(df$a), ]


And expand.grid generalizes to any number of input columns.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  清歌不尽        
                
              
                            
                2020-11-22 03:31
              
            
            
                                                                       
Missing in this r-faq overview is the CJ-function from the data.table-package. Using:

library(data.table)
CJ(a, b, unique = TRUE)


gives:


      a          b
 1: ABC 2012-05-01
 2: ABC 2012-05-02
 3: ABC 2012-05-03
 4: ABC 2012-05-04
 5: ABC 2012-05-05
 6: DEF 2012-05-01
 7: DEF 2012-05-02
 8: DEF 2012-05-03
 9: DEF 2012-05-04
10: DEF 2012-05-05
11: GHI 2012-05-01
12: GHI 2012-05-02
13: GHI 2012-05-03
14: GHI 2012-05-04
15: GHI 2012-05-05





_{NOTE: since version 1.12.2 CJ autonames the resulting columns (see also here and here).}
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  故里飘歌        
                
              
                            
                2020-11-22 03:31
              
            
            
                                                                       
Since version 1.0.0, tidyr offers its own version of expand.grid(). It completes the existing family of expand(), nesting(), and crossing() with a low-level function that works with vectors. 

When compared to base::expand.grid():


  Varies the first element fastest. Never converts strings to factors.
  Does not add any additional attributes. Returns a tibble, not a data
  frame. Can expand any generalised vector, including data frames.


a <- c("ABC", "DEF", "GHI")
b <- c("2012-05-01", "2012-05-02", "2012-05-03", "2012-05-04", "2012-05-05")

tidyr::expand_grid(a, b)

   a     b         
   <chr> <chr>     
 1 ABC   2012-05-01
 2 ABC   2012-05-02
 3 ABC   2012-05-03
 4 ABC   2012-05-04
 5 ABC   2012-05-05
 6 DEF   2012-05-01
 7 DEF   2012-05-02
 8 DEF   2012-05-03
 9 DEF   2012-05-04
10 DEF   2012-05-05
11 GHI   2012-05-01
12 GHI   2012-05-02
13 GHI   2012-05-03
14 GHI   2012-05-04
15 GHI   2012-05-05

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复