Drop data frame columns by name

后端未结

关注

 20  2621

I have a number of columns that I would like to remove from a data frame. I know that we can delete them individually using something like:

df$x <- NULL
<


                      
              相关标签:


      
      
        
          20条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  一生所求        
                
              
                            
                2020-11-22 01:42
              
            
            
                                                                       
within(df, rm(x))


is probably easiest, or for multiple variables:

within(df, rm(x, y))


Or if you're dealing with data.tables (per How do you delete a column by name in data.table?):

dt[, x := NULL]   # Deletes column x by reference instantly.

dt[, !"x"]   # Selects all but x into a new data.table.


or for multiple variables

dt[, c("x","y") := NULL]

dt[, !c("x", "y")]

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  遥遥无期        
                
              
                            
                2020-11-22 01:42
              
            
            
                                                                       
You could use %in% like this:

df[, !(colnames(df) %in% c("x","bar","foo"))]

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  别跟我提以往        
                
              
                            
                2020-11-22 01:46
              
            
            
                                                                       
There's a function called dropNamed() in Bernd Bischl's BBmisc package that does exactly this.

BBmisc::dropNamed(df, "x")


The advantage is that it avoids repeating the data frame argument and thus is suitable for piping in magrittr (just like the dplyr approaches):

df %>% BBmisc::dropNamed("x")

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  無奈伤痛        
                
              
                            
                2020-11-22 01:48
              
            
            
                                                                       
Here is a dplyr way to go about it:

#df[ -c(1,3:6, 12) ]  # original
df.cut <- df %>% select(-col.to.drop.1, -col.to.drop.2, ..., -col.to.drop.6)  # with dplyr::select()


I like this because it's intuitive to read & understand without annotation and robust to columns changing position within the data frame. It also follows the vectorized idiom using - to remove elements.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一整个雨季        
                
              
                            
                2020-11-22 01:50
              
            
            
                                                                       
DF <- data.frame(
  x=1:10,
  y=10:1,
  z=rep(5,10),
  a=11:20
)
DF


Output: 

    x  y z  a
1   1 10 5 11
2   2  9 5 12
3   3  8 5 13
4   4  7 5 14
5   5  6 5 15
6   6  5 5 16
7   7  4 5 17
8   8  3 5 18
9   9  2 5 19
10 10  1 5 20




DF[c("a","x")] <- list(NULL)


Output:

        y z
    1  10 5
    2   9 5
    3   8 5
    4   7 5
    5   6 5
    6   5 5
    7   4 5
    8   3 5    
    9   2 5
    10  1 5

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  失恋的感觉        
                
              
                            
                2020-11-22 01:51
              
            
            
                                                                       
You can use a simple list of names :

DF <- data.frame(
  x=1:10,
  y=10:1,
  z=rep(5,10),
  a=11:20
)
drops <- c("x","z")
DF[ , !(names(DF) %in% drops)]


Or, alternatively, you can make a list of those to keep and refer to them by name :

keeps <- c("y", "a")
DF[keeps]


EDIT :
For those still not acquainted with the drop argument of the indexing function, if you want to keep one column as a data frame, you do:

keeps <- "y"
DF[ , keeps, drop = FALSE]


drop=TRUE (or not mentioning it) will drop unnecessary dimensions, and hence return a vector with the values of column y.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
3
4
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复