Selecting multiple columns in a pandas dataframe

后端未结

关注

 19  1766

I have data in different columns but I don\'t know how to extract it to save it in another variable.

index  a   b   c
1      2   3   4
2      3   4   5


                      
              相关标签:


      
      
        
          19条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  执笔经年        
                
              
                            
                2020-11-22 00:08
              
            
            
                                                                       
The different approaches discussed in above responses are based on the assumption that either the user knows column indices to drop or subset on, or the user wishes to subset a dataframe using a range of columns (for instance between 'C' : 'E'). pandas.DataFrame.drop() is certainly an option to subset data based on a list of columns defined by user (though you have to be cautious that you always use copy of dataframe and inplace parameters should not be set to True!!) 

Another option is to use pandas.columns.difference(), which does a set difference on column names, and returns an index type of array containing desired columns. Following is the solution:

df = pd.DataFrame([[2,3,4],[3,4,5]],columns=['a','b','c'],index=[1,2])
columns_for_differencing = ['a']
df1 = df.copy()[df.columns.difference(columns_for_differencing)]
print(df1)


The output would be:
    b   c
1   3   4
2   4   5

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野趣味        
                
              
                            
                2020-11-22 00:10
              
            
            
                                                                       
I realize this question is quite old, but in the latest version of pandas there is an easy way to do exactly this. Column names (which are strings) can be sliced in whatever manner you like.

columns = ['b', 'c']
df1 = pd.DataFrame(df, columns=columns)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  [愿得一人]        
                
              
                            
                2020-11-22 00:11
              
            
            
                                                                       
With pandas, 

wit column names 

dataframe[['column1','column2']]


to select by iloc and specific columns with index number:

dataframe.iloc[:,[1,2]]


with loc column names can be used like

dataframe.loc[:,['column1','column2']]

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  囚心锁ツ        
                
              
                            
                2020-11-22 00:12
              
            
            
                                                                       
I found this method to be very useful:

# iloc[row slicing, column slicing]
surveys_df.iloc [0:3, 1:4]


More details can be found here
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  误落风尘        
                
              
                            
                2020-11-22 00:13
              
            
            
                                                                       
You can use pandas.
I create the DataFrame:

    import pandas as pd
    df = pd.DataFrame([[1, 2,5], [5,4, 5], [7,7, 8], [7,6,9]], 
                      index=['Jane', 'Peter','Alex','Ann'],
                      columns=['Test_1', 'Test_2', 'Test_3'])


The DataFrame:

           Test_1  Test_2  Test_3
    Jane        1       2       5
    Peter       5       4       5
    Alex        7       7       8
    Ann         7       6       9


To select 1 or more columns by name:

    df[['Test_1','Test_3']]

           Test_1  Test_3
    Jane        1       5
    Peter       5       5
    Alex        7       8
    Ann         7       9


You can also use:

    df.Test_2


And yo get column Test_2

    Jane     2
    Peter    4
    Alex     7
    Ann      6


You can also select columns and rows from these rows using .loc(). This is called "slicing". Notice that I take from column Test_1to Test_3

    df.loc[:,'Test_1':'Test_3']


The "Slice" is:

            Test_1  Test_2  Test_3
     Jane        1       2       5
     Peter       5       4       5
     Alex        7       7       8
     Ann         7       6       9


And if you just want Peter and Ann from columns Test_1 and Test_3:

    df.loc[['Peter', 'Ann'],['Test_1','Test_3']]


You get:

           Test_1  Test_3
    Peter       5       5
    Ann         7       9

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  鱼传尺愫        
                
              
                            
                2020-11-22 00:15
              
            
            
                                                                       
In [39]: df
Out[39]: 
   index  a  b  c
0      1  2  3  4
1      2  3  4  5

In [40]: df1 = df[['b', 'c']]

In [41]: df1
Out[41]: 
   b  c
0  3  4
1  4  5

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
3
4
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复