Filter pandas dataframe with specific column names in python


I have a pandas dataframe and a list as follows:

mylist = ['nnn', 'mmm', 'yyy']
mydata =
   xxx  yyy  zzz  nnn  ffffd  mmm
0    0   10    5    5      5    5
1    1    9


                      
2 Answers
醉梦人生, 2021-02-13 03:21

Just pass a list of column names to index df:

df[['nnn', 'mmm', 'yyy']]

   nnn  mmm  yyy
0    5    5   10
1    3    4    9
2    7    0    8




If you need to handle non-existent column names in your list, try filtering with df.columns.isin:

df.loc[:, df.columns.isin(['nnn', 'mmm', 'yyy', 'zzzzzz'])]

   yyy  nnn  mmm
0   10    5    5
1    9    3    4
2    8    7    0
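
Note that the isin mask returns the columns in the frame's own order (yyy, nnn, mmm), not in the order of the list. If the list order matters, one option is to filter the list against df.columns before indexing. The following is only a minimal sketch: the frame is a made-up stand-in for the question's data (values outside the nnn, mmm and yyy columns are assumptions), and the names wanted, by_mask, present and by_list are illustrative.

```python
import pandas as pd

# Illustrative frame; values in xxx and zzz are invented for the example.
df = pd.DataFrame({
    "xxx": [0, 1, 2],
    "yyy": [10, 9, 8],
    "zzz": [5, 6, 7],
    "nnn": [5, 3, 7],
    "mmm": [5, 4, 0],
})
wanted = ["nnn", "mmm", "yyy", "zzzzzz"]  # 'zzzzzz' is not a column of df

# Boolean mask over df.columns: the result follows the frame's column order.
by_mask = df.loc[:, df.columns.isin(wanted)]

# Filter the list instead: the result follows the list's order.
present = [c for c in wanted if c in df.columns]
by_list = df[present]

print(list(by_mask.columns))
print(list(by_list.columns))
```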

                                                                        
                                                        
            
            
              
                
不知归路, 2021-02-13 03:31

You can just put mylist inside [] and pandas will select it for you. 

mydata_new = mydata[mylist]


Not sure whether your yyy is a typo. 

Your loop goes wrong because it reassigns mydata_new to a new Series on every iteration.

for item in mylist:
    mydata_new = mydata[item]  # <- overwritten with a single column each time


So you end up with a Series (only the last column in the list) rather than the DataFrame you want.
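
To make the difference concrete, here is a small sketch comparing the two. The frame is a made-up stand-in for mydata using the three columns from the list, and the names last and selected are only illustrative.

```python
import pandas as pd

# Stand-in for the question's data, restricted to the columns in mylist.
mydata = pd.DataFrame({"yyy": [10, 9, 8], "nnn": [5, 3, 7], "mmm": [5, 4, 0]})
mylist = ["nnn", "mmm", "yyy"]

# Looping and reassigning keeps only the column from the final iteration.
for item in mylist:
    last = mydata[item]  # a Series, replaced on every pass

# Indexing with the whole list returns all listed columns at once.
selected = mydata[mylist]

print(type(last).__name__, last.name)
print(type(selected).__name__, list(selected.columns))
```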



If some names in the list are not in your data frame, you can check for them with:

len(set(mylist) - set(mydata.columns)) > 0


and print the missing names:

print(set(mylist) - set(mydata.columns))


Then check whether they are typos or point to some other unintended difference.
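
If you want that check to run before every selection, the set difference can be wrapped in a small helper. This is only a sketch under assumptions: require_columns is a hypothetical name, not a pandas function, and the frame and the misspelled 'yyyy' are invented for the example.

```python
import pandas as pd

def require_columns(df, names):
    """Return df restricted to `names`, or raise ValueError listing missing ones."""
    missing = sorted(set(names) - set(df.columns))
    if missing:
        raise ValueError(f"columns not found: {missing}")
    return df[list(names)]

# Stand-in data; 'yyyy' below is a deliberate typo for 'yyy'.
mydata = pd.DataFrame({"yyy": [10, 9, 8], "nnn": [5, 3, 7], "mmm": [5, 4, 0]})

try:
    require_columns(mydata, ["nnn", "mmm", "yyyy"])
except ValueError as err:
    message = str(err)
    print(message)
```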
                                                                        
                                                        
            
            
              
                