How to change the order of DataFrame columns?

南旧 2020-11-22 01:24

I have the following DataFrame (df):

import numpy as np
import pandas as pd

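df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(1)

How can I move the mean column to the front, leaving the order of the other columns unchanged?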
30 Answers
  • 2020-11-22 02:20

    This function saves you from having to list every column in your dataset just to reorder a few of them.

    def order(frame, var):
        # Accept either a single column name or a list of names.
        if isinstance(var, str):
            var = [var]
        # The remaining columns keep their original relative order.
        varlist = [w for w in frame.columns if w not in var]
        return frame[var + varlist]
    

    It takes two arguments: the first is the DataFrame, the second is the column (or list of columns) that you want to bring to the front.

    So in my case I have a DataFrame called frame with columns A1, A2, B1, B2, Total and Date. If I want to bring Total to the front then all I have to do is:

    frame = order(frame,['Total'])
    

    If I want to bring Total and Date to the front then I do:

    frame = order(frame,['Total','Date'])
    

    EDIT:

    Another useful way to use this: if you have an unfamiliar table and you're looking for columns with a particular term in their names, like VAR1, VAR2, ..., you can run something like:

    frame = order(frame,[v for v in frame.columns if "VAR" in v])
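
As a quick check of the helper above, here is a minimal self-contained sketch. The one-row table and its values are made up for illustration; only the column names come from the answer, and the helper is restated so the snippet runs on its own.

```python
import pandas as pd


def order(frame, var):
    """Return frame with the columns named in var moved to the front."""
    if isinstance(var, str):
        var = [var]  # accept a single name as well as a list
    rest = [c for c in frame.columns if c not in var]
    return frame[list(var) + rest]


# Hypothetical one-row table using the column names from the answer.
frame = pd.DataFrame(
    [[1, 2, 3, 4, 10, "2020-11-22"]],
    columns=["A1", "A2", "B1", "B2", "Total", "Date"],
)

front_one = order(frame, "Total")
front_two = order(frame, ["Total", "Date"])

print(list(front_one.columns))  # ['Total', 'A1', 'A2', 'B1', 'B2', 'Date']
print(list(front_two.columns))  # ['Total', 'Date', 'A1', 'A2', 'B1', 'B2']
```

The columns that are not named keep their original relative order, and the input frame itself is left unchanged.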
    
  • 2020-11-22 02:20

    Hackiest method in the book

    df.insert(0,"test",df["mean"])
    df=df.drop(columns=["mean"]).rename(columns={"test":"mean"})
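
A possibly tidier variant of the same idea, not taken from the answer itself, is to combine DataFrame.pop with DataFrame.insert so that no temporary column name is needed. The two-row frame below is a made-up stand-in for the question's data.

```python
import pandas as pd

# Small stand-in for the question's frame (values are illustrative only).
df = pd.DataFrame({0: [1.0, 2.0], 1: [3.0, 4.0]})
df["mean"] = df.mean(axis=1)

# pop() removes the column and returns it as a Series;
# insert() then puts it back at position 0, modifying df in place.
df.insert(0, "mean", df.pop("mean"))

print(list(df.columns))  # ['mean', 0, 1]
```

Because pop() runs before insert() is called, the column never exists twice, so insert() does not complain about a duplicate name.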
    
  • 2020-11-22 02:22

    One easy way would be to reassign the dataframe with a list of the columns, rearranged as needed.

    This is what you have now:

    In [6]: df
    Out[6]:
              0         1         2         3         4      mean
    0  0.445598  0.173835  0.343415  0.682252  0.582616  0.445543
    1  0.881592  0.696942  0.702232  0.696724  0.373551  0.670208
    2  0.662527  0.955193  0.131016  0.609548  0.804694  0.632596
    3  0.260919  0.783467  0.593433  0.033426  0.512019  0.436653
    4  0.131842  0.799367  0.182828  0.683330  0.019485  0.363371
    5  0.498784  0.873495  0.383811  0.699289  0.480447  0.587165
    6  0.388771  0.395757  0.745237  0.628406  0.784473  0.588529
    7  0.147986  0.459451  0.310961  0.706435  0.100914  0.345149
    8  0.394947  0.863494  0.585030  0.565944  0.356561  0.553195
    9  0.689260  0.865243  0.136481  0.386582  0.730399  0.561593
    
    In [7]: cols = df.columns.tolist()
    
    In [8]: cols
    Out[8]: [0, 1, 2, 3, 4, 'mean']
    

    Rearrange cols in any way you want. This is how I moved the last element to the first position:

    In [12]: cols = cols[-1:] + cols[:-1]
    
    In [13]: cols
    Out[13]: ['mean', 0, 1, 2, 3, 4]
    

    Then reorder the dataframe like this:

    In [16]: df = df[cols]  # or: df = df.loc[:, cols]
    
    In [17]: df
    Out[17]:
           mean         0         1         2         3         4
    0  0.445543  0.445598  0.173835  0.343415  0.682252  0.582616
    1  0.670208  0.881592  0.696942  0.702232  0.696724  0.373551
    2  0.632596  0.662527  0.955193  0.131016  0.609548  0.804694
    3  0.436653  0.260919  0.783467  0.593433  0.033426  0.512019
    4  0.363371  0.131842  0.799367  0.182828  0.683330  0.019485
    5  0.587165  0.498784  0.873495  0.383811  0.699289  0.480447
    6  0.588529  0.388771  0.395757  0.745237  0.628406  0.784473
    7  0.345149  0.147986  0.459451  0.310961  0.706435  0.100914
    8  0.553195  0.394947  0.863494  0.585030  0.565944  0.356561
    9  0.561593  0.689260  0.865243  0.136481  0.386582  0.730399
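
The session above can be condensed into a short script. This is a sketch for a current Python 3 and pandas setup; the seed is an arbitrary choice for repeatability, so the random values will differ from the ones printed above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
df = pd.DataFrame(rng.random((10, 5)))
df["mean"] = df.mean(axis=1)

cols = df.columns.tolist()      # [0, 1, 2, 3, 4, 'mean']
cols = cols[-1:] + cols[:-1]    # move the last label to the front
reordered = df[cols]            # equivalently: df.loc[:, cols]

print(reordered.columns.tolist())  # ['mean', 0, 1, 2, 3, 4]
```

Any other list manipulation works for the middle step, as long as the final list contains exactly the labels you want, in the order you want.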
    
  • 2020-11-22 02:24

    You need to create a new list of your columns in the desired order, then use df = df[cols] to rearrange the columns in this new order.

    cols = ['mean']  + [col for col in df if col != 'mean']
    df = df[cols]
    

    You can also use a more general approach. In this example, the last column (indicated by -1) is inserted as the first column.

    cols = [df.columns[-1]] + [col for col in df if col != df.columns[-1]]
    df = df[cols]
    

    The same approach can move a given set of columns to the front in a chosen order, skipping any that are not present in the DataFrame.

    inserted_cols = ['a', 'b', 'c']
    cols = ([col for col in inserted_cols if col in df] 
            + [col for col in df if col not in inserted_cols])
    df = df[cols]
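
To illustrate how the last snippet treats labels that are missing from the DataFrame, here is a small sketch with made-up column names; "b" is deliberately left out of the frame.

```python
import pandas as pd

# Hypothetical frame: it has "a" and "c" but no "b".
df = pd.DataFrame({"x": [1], "c": [2], "y": [3], "a": [4]})

inserted_cols = ["a", "b", "c"]
cols = ([col for col in inserted_cols if col in df]           # present ones, in the requested order
        + [col for col in df if col not in inserted_cols])    # everything else, original order
df = df[cols]

print(list(df.columns))  # ['a', 'c', 'x', 'y']
```

The missing "b" is silently skipped rather than raising a KeyError, which is what selecting df[inserted_cols] directly would do.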
    
  • 2020-11-22 02:24

    Here is a very simple answer to this (only one line).

    You can do it after you have added the 'mean' column to your df, as follows.

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame(np.random.rand(10, 5))
    df['mean'] = df.mean(1)
    df
               0           1           2           3           4        mean
    0   0.929616    0.316376    0.183919    0.204560    0.567725    0.440439
    1   0.595545    0.964515    0.653177    0.748907    0.653570    0.723143
    2   0.747715    0.961307    0.008388    0.106444    0.298704    0.424512
    3   0.656411    0.809813    0.872176    0.964648    0.723685    0.805347
    4   0.642475    0.717454    0.467599    0.325585    0.439645    0.518551
    5   0.729689    0.994015    0.676874    0.790823    0.170914    0.672463
    6   0.026849    0.800370    0.903723    0.024676    0.491747    0.449473
    7   0.526255    0.596366    0.051958    0.895090    0.728266    0.559587
    8   0.818350    0.500223    0.810189    0.095969    0.218950    0.488736
    9   0.258719    0.468106    0.459373    0.709510    0.178053    0.414752
    
    
    # Add the line below to reorder the columns.
    # list() takes a single iterable, hence the inner parentheses around the column names;
    # without them it raises an error. df[['mean', 0, 1, 2, 3, 4]] is equivalent.
    
    df = df[list(('mean', 0, 1, 2, 3, 4))]
    df
    
            mean           0           1           2           3           4
    0   0.440439    0.929616    0.316376    0.183919    0.204560    0.567725
    1   0.723143    0.595545    0.964515    0.653177    0.748907    0.653570
    2   0.424512    0.747715    0.961307    0.008388    0.106444    0.298704
    3   0.805347    0.656411    0.809813    0.872176    0.964648    0.723685
    4   0.518551    0.642475    0.717454    0.467599    0.325585    0.439645
    5   0.672463    0.729689    0.994015    0.676874    0.790823    0.170914
    6   0.449473    0.026849    0.800370    0.903723    0.024676    0.491747
    7   0.559587    0.526255    0.596366    0.051958    0.895090    0.728266
    8   0.488736    0.818350    0.500223    0.810189    0.095969    0.218950
    9   0.414752    0.258719    0.468106    0.459373    0.709510    0.178053
    
    
  • 2020-11-22 02:25

    I ran into a similar question myself, and just wanted to add what I settled on. I liked the reindex_axis() method for changing column order. This worked:

    df = df.reindex_axis(['mean'] + list(df.columns[:-1]), axis=1)
    

    An alternate method based on the comment from @Jorge:

    df = df.reindex(columns=['mean'] + list(df.columns[:-1]))
    

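    Although reindex_axis seemed to be slightly faster than reindex in micro-benchmarks, I prefer the latter for its directness. Note that reindex_axis was deprecated in pandas 0.21 and removed in pandas 1.0, so only the reindex form works on current versions.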
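
One difference between reindex and plain df[cols] is worth keeping in mind. The sketch below uses a made-up two-row frame, and the label "typo" is a hypothetical misspelled column name.

```python
import pandas as pd

# Small stand-in for the question's frame (values are illustrative only).
df = pd.DataFrame({0: [1.0, 2.0], 1: [3.0, 4.0]})
df["mean"] = df.mean(axis=1)

# Same reordering as in the answer: 'mean' first, then every column but the last.
reordered = df.reindex(columns=["mean"] + list(df.columns[:-1]))
print(list(reordered.columns))  # ['mean', 0, 1]

# reindex does not raise on an unknown label; it adds that column filled with NaN.
with_missing = df.reindex(columns=["mean", 0, "typo"])
print(with_missing["typo"].isna().all())  # True
```

So a misspelled label passes silently with reindex, whereas df[cols] raises a KeyError. Which behaviour is preferable depends on whether missing columns are expected.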
