Count features for different ids in columns in R in a faster way

时光取名叫无心 2021-01-27 13:56

I am trying to process a 20 GB data file in R. I have 16 GB of RAM and an i7 processor. I am reading the data using:

y<-read.table(file="sample.csv", header =

4 answers

小鲜肉 (original poster) 2021-01-27 14:32

How about table()? 

> set.seed(5)
> ids <- sample(1:3, 12, TRUE)
> features <- sample(1:4, 12, TRUE)
> cbind(ids, features)
      ids features
 [1,]   1        2
 [2,]   3        3
 [3,]   3        2
 [4,]   1        1
 [5,]   1        2
 [6,]   3        4
 [7,]   2        3
 [8,]   3        4
 [9,]   3        4
[10,]   1        3
[11,]   1        1
[12,]   2        1

> table(ids, features)
   features
ids 1 2 3 4
  1 2 2 1 0
  2 1 0 1 0
  3 0 1 1 3


So, for example, feature 4 appears 3 times for id 3.
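
To pull a single count out of the table rather than reading it off the printout, something like the following should work. This is only a sketch: the two vectors are typed in literally from the cbind() printout above so the result does not depend on the random number generator, and the name counts is my own, not part of the answer.

```r
# Values copied from the cbind() printout above, typed in literally.
ids      <- c(1, 3, 3, 1, 1, 3, 2, 3, 3, 1, 1, 2)
features <- c(2, 3, 2, 1, 2, 4, 3, 4, 4, 3, 1, 1)

# Cross-tabulate: rows are ids, columns are features.
counts <- table(ids, features)

# Index by dimension names (character strings), not by position.
counts["3", "4"]   # how often feature 4 occurs for id 3

# All feature counts for one id, as a named vector.
counts["1", ]
```

Indexing by name is the safer habit here, since the row and column positions only match the id and feature values by coincidence when they happen to be 1, 2, 3, ...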

EDIT: You can use as.data.frame() to "flatten" the table and get:

> as.data.frame(table(ids, features))
   ids features Freq
1    1        1    2
2    2        1    1
3    3        1    0
4    1        2    2
5    2        2    0
6    3        2    1
7    1        3    1
8    2        3    1
9    3        3    1
10   1        4    0
11   2        4    0
12   3        4    3
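
The flattened form also lists combinations that never occur (Freq of 0). If only the observed pairs are wanted, one possible follow-up is to filter them out. Again a sketch with the same literal vectors; flat and nonzero are names I made up for illustration.

```r
# Same illustrative vectors as in the printout above, typed in literally.
ids      <- c(1, 3, 3, 1, 1, 3, 2, 3, 3, 1, 1, 2)
features <- c(2, 3, 2, 1, 2, 4, 3, 4, 4, 3, 1, 1)

# One row per (id, feature) combination, with its count in Freq.
flat <- as.data.frame(table(ids, features))

# Keep only the (id, feature) pairs that actually occur.
nonzero <- flat[flat$Freq > 0, ]

# Sort by id, then by feature, for easier reading.
nonzero <- nonzero[order(nonzero$ids, nonzero$features), ]
nonzero
```

Note that as.data.frame() turns ids and features into factors by default, so convert them back (for example with as.integer(as.character(...))) if numeric values are needed later.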

    
             
                                                        
            
            
              
                