How to change the order of DataFrame columns?

南旧 2020-11-22 01:24

I have the following DataFrame (df):

import numpy as np
import pandas as pd

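df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(1)

How can I move the mean column to the front, leaving the order of the other columns untouched?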
30 Answers
  • 2020-11-22 02:26

    From August 2018:

    If your column names are too long to type, you can specify the new order as a list of integer positions:

    Data:

              0         1         2         3         4      mean
    0  0.397312  0.361846  0.719802  0.575223  0.449205  0.500678
    1  0.287256  0.522337  0.992154  0.584221  0.042739  0.485741
    2  0.884812  0.464172  0.149296  0.167698  0.793634  0.491923
    3  0.656891  0.500179  0.046006  0.862769  0.651065  0.543382
    4  0.673702  0.223489  0.438760  0.468954  0.308509  0.422683
    5  0.764020  0.093050  0.100932  0.572475  0.416471  0.389390
    6  0.259181  0.248186  0.626101  0.556980  0.559413  0.449972
    7  0.400591  0.075461  0.096072  0.308755  0.157078  0.207592
    8  0.639745  0.368987  0.340573  0.997547  0.011892  0.471749
    9  0.050582  0.714160  0.168839  0.899230  0.359690  0.438500
    

    Generic example:

    new_order = [3,2,1,4,5,0]
    print(df[df.columns[new_order]])  
    
              3         2         1         4      mean         0
    0  0.575223  0.719802  0.361846  0.449205  0.500678  0.397312
    1  0.584221  0.992154  0.522337  0.042739  0.485741  0.287256
    2  0.167698  0.149296  0.464172  0.793634  0.491923  0.884812
    3  0.862769  0.046006  0.500179  0.651065  0.543382  0.656891
    4  0.468954  0.438760  0.223489  0.308509  0.422683  0.673702
    5  0.572475  0.100932  0.093050  0.416471  0.389390  0.764020
    6  0.556980  0.626101  0.248186  0.559413  0.449972  0.259181
    7  0.308755  0.096072  0.075461  0.157078  0.207592  0.400591
    8  0.997547  0.340573  0.368987  0.011892  0.471749  0.639745
    9  0.899230  0.168839  0.714160  0.359690  0.438500  0.050582
    
          
    

    And for the specific case of OP's question:

    new_order = [-1,0,1,2,3,4]
    df = df[df.columns[new_order]]
    print(df)
    
           mean         0         1         2         3         4
    0  0.500678  0.397312  0.361846  0.719802  0.575223  0.449205
    1  0.485741  0.287256  0.522337  0.992154  0.584221  0.042739
    2  0.491923  0.884812  0.464172  0.149296  0.167698  0.793634
    3  0.543382  0.656891  0.500179  0.046006  0.862769  0.651065
    4  0.422683  0.673702  0.223489  0.438760  0.468954  0.308509
    5  0.389390  0.764020  0.093050  0.100932  0.572475  0.416471
    6  0.449972  0.259181  0.248186  0.626101  0.556980  0.559413
    7  0.207592  0.400591  0.075461  0.096072  0.308755  0.157078
    8  0.471749  0.639745  0.368987  0.340573  0.997547  0.011892
    9  0.438500  0.050582  0.714160  0.168839  0.899230  0.359690
    

    The main caveat is that the positions refer to the current column order, so running the same code again reorders the already reordered frame and gives a different result each time. One needs to be careful :)
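The non-repeatable behaviour is easy to reproduce. Below is a minimal sketch on made-up data (the column names `a`, `b`, `c` are assumptions for illustration). It uses `iloc`, which selects the same columns by position as `df[df.columns[new_order]]` when the column names are unique:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
df["mean"] = df.mean(axis=1)

new_order = [-1, 0, 1, 2]  # last column first, the rest in their current order

once = df.iloc[:, new_order]
twice = once.iloc[:, new_order]  # same positions, applied to the already reordered frame

print(list(once.columns))   # ['mean', 'a', 'b', 'c']
print(list(twice.columns))  # ['c', 'mean', 'a', 'b']
```

Assigning the result to a new name, rather than back to `df`, is one way to keep a re-run from rotating the columns again.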

  • 2020-11-22 02:27

    Just select the columns by name in the order you want them:

    In [39]: df
    Out[39]: 
              0         1         2         3         4  mean
    0  0.172742  0.915661  0.043387  0.712833  0.190717     1
    1  0.128186  0.424771  0.590779  0.771080  0.617472     1
    2  0.125709  0.085894  0.989798  0.829491  0.155563     1
    3  0.742578  0.104061  0.299708  0.616751  0.951802     1
    4  0.721118  0.528156  0.421360  0.105886  0.322311     1
    5  0.900878  0.082047  0.224656  0.195162  0.736652     1
    6  0.897832  0.558108  0.318016  0.586563  0.507564     1
    7  0.027178  0.375183  0.930248  0.921786  0.337060     1
    8  0.763028  0.182905  0.931756  0.110675  0.423398     1
    9  0.848996  0.310562  0.140873  0.304561  0.417808     1
    
    In [40]: df = df[['mean', 4,3,2,1]]
    

    Now the 'mean' column comes first. Note that column 0 is gone, because it was left out of the list:

    In [41]: df
    Out[41]: 
       mean         4         3         2         1
    0     1  0.190717  0.712833  0.043387  0.915661
    1     1  0.617472  0.771080  0.590779  0.424771
    2     1  0.155563  0.829491  0.989798  0.085894
    3     1  0.951802  0.616751  0.299708  0.104061
    4     1  0.322311  0.105886  0.421360  0.528156
    5     1  0.736652  0.195162  0.224656  0.082047
    6     1  0.507564  0.586563  0.318016  0.558108
    7     1  0.337060  0.921786  0.930248  0.375183
    8     1  0.423398  0.110675  0.931756  0.182905
    9     1  0.417808  0.304561  0.140873  0.310562
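Because selecting with a list keeps only the listed columns, it may be safer to build the list from `df.columns` than to type it out. A small sketch on made-up data (the column names `x`, `y`, `z` are assumptions) that moves the last column to the front without dropping anything:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"x": [0.1, 0.2], "y": [0.3, 0.4], "z": [0.5, 0.6]})
df["mean"] = df.mean(axis=1)

cols = df.columns.tolist()      # ['x', 'y', 'z', 'mean']
cols = cols[-1:] + cols[:-1]    # rotate the last name to the front
reordered = df[cols]

# Sanity check: same set of columns, nothing dropped
assert set(reordered.columns) == set(df.columns)
print(reordered.columns.tolist())  # ['mean', 'x', 'y', 'z']
```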
    
  • 2020-11-22 02:28

    I think this is a slightly neater solution:

    df.insert(0,'mean', df.pop("mean"))
    

    This is somewhat similar to @JoeHeffer's solution, but it is a one-liner.

    Here we remove the column "mean" from the DataFrame with pop and insert it back at position 0 under the same name.
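One thing to watch for: both `pop` and `insert` modify the DataFrame in place, and `insert` returns `None`, so its result should not be assigned back to `df`. A short sketch on made-up data (column names `a` and `b` are assumptions):

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
df["mean"] = df.mean(axis=1)

col = df.pop("mean")                # removes the column from df and returns it as a Series
result = df.insert(0, "mean", col)  # inserts in place at position 0

print(result)               # None
print(df.columns.tolist())  # ['mean', 'a', 'b']
```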

  • 2020-11-22 02:28

    @clocker: Your solution was very helpful. I wanted to bring two columns to the front of a DataFrame whose full set of column names I don't know in advance, because they are generated by a pivot. If you are in the same situation and want to put the columns you know by name first, followed by "all the other columns", here is a general solution:

    # reindex_axis was removed in pandas 1.0; reindex(columns=...) is the replacement
    df = df.reindex(columns=['Col1','Col2'] + list(df.columns.drop(['Col1','Col2'])))
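The same idea can be wrapped in a small helper. This is only a sketch: `move_to_front` is a hypothetical name, not a pandas function, and the column names in the usage lines are made up. It relies on `Index.drop`, which keeps the remaining labels in their existing order:

```python
import pandas as pd

def move_to_front(df, front):
    """Return a new DataFrame with the columns in `front` first, the rest in their existing order."""
    rest = df.columns.drop(front)  # raises KeyError if a name in `front` is missing
    return df.reindex(columns=list(front) + list(rest))

# Hypothetical data, for illustration only
df = pd.DataFrame([[1, 2, 3, 4]], columns=["A", "Col1", "B", "Col2"])
out = move_to_front(df, ["Col1", "Col2"])
print(out.columns.tolist())  # ['Col1', 'Col2', 'A', 'B']
```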
    
  • 2020-11-22 02:29

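    You can use a set difference to collect the other columns. Be aware that a set is an unordered collection of unique elements, so keeping the "order of the other columns untouched" is not guaranteed; the outputs below happen to preserve it, but don't rely on that:

    other_columns = list(set(df.columns).difference(["mean"]))  # e.g. [0, 1, 2, 3, 4]; order not guaranteed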
    

    Then you can use a lambda to move a specific column to the front:

    In [1]: import numpy as np

    In [2]: import pandas as pd

    In [3]: df = pd.DataFrame(np.random.rand(10, 5))

    In [4]: df["mean"] = df.mean(1)

    In [5]: move_col_to_front = lambda df, col: df[[col]+list(set(df.columns).difference([col]))]

    In [6]: move_col_to_front(df, "mean")
    Out[6]: 
           mean         0         1         2         3         4
    0  0.697253  0.600377  0.464852  0.938360  0.945293  0.537384
    1  0.609213  0.703387  0.096176  0.971407  0.955666  0.319429
    2  0.561261  0.791842  0.302573  0.662365  0.728368  0.321158
    3  0.518720  0.710443  0.504060  0.663423  0.208756  0.506916
    4  0.616316  0.665932  0.794385  0.163000  0.664265  0.793995
    5  0.519757  0.585462  0.653995  0.338893  0.714782  0.305654
    6  0.532584  0.434472  0.283501  0.633156  0.317520  0.994271
    7  0.640571  0.732680  0.187151  0.937983  0.921097  0.423945
    8  0.562447  0.790987  0.200080  0.317812  0.641340  0.862018
    9  0.563092  0.811533  0.662709  0.396048  0.596528  0.348642
    
    In [7]: move_col_to_front(df, 2)                                                                         
    Out[7]: 
              2         0         1         3         4      mean
    0  0.938360  0.600377  0.464852  0.945293  0.537384  0.697253
    1  0.971407  0.703387  0.096176  0.955666  0.319429  0.609213
    2  0.662365  0.791842  0.302573  0.728368  0.321158  0.561261
    3  0.663423  0.710443  0.504060  0.208756  0.506916  0.518720
    4  0.163000  0.665932  0.794385  0.664265  0.793995  0.616316
    5  0.338893  0.585462  0.653995  0.714782  0.305654  0.519757
    6  0.633156  0.434472  0.283501  0.317520  0.994271  0.532584
    7  0.937983  0.732680  0.187151  0.921097  0.423945  0.640571
    8  0.317812  0.790987  0.200080  0.641340  0.862018  0.562447
    9  0.396048  0.811533  0.662709  0.596528  0.348642  0.563092
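Since a set does not promise any iteration order, an order-preserving variant may be safer. Here is a sketch of one that swaps the set difference for a list comprehension (the function name `move_col_to_front_ordered` is made up for this example):

```python
import numpy as np
import pandas as pd

def move_col_to_front_ordered(df, col):
    # Keep every other column in its original relative position
    others = [c for c in df.columns if c != col]
    return df[[col] + others]

df = pd.DataFrame(np.random.rand(3, 5))
df["mean"] = df.mean(axis=1)

print(move_col_to_front_ordered(df, "mean").columns.tolist())  # ['mean', 0, 1, 2, 3, 4]
print(move_col_to_front_ordered(df, 2).columns.tolist())       # [2, 0, 1, 3, 4, 'mean']
```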
    
  • 2020-11-22 02:30

    Simply do the following (this assumes 'mean' is currently the last column):

    df = df[['mean'] + df.columns[:-1].tolist()]
    