Renaming columns in pandas

野性不改 2020-11-21 07:05

I have a pandas DataFrame whose column labels I need to replace.

I'd like to change the column names in a DataFrame.

27 answers
  • 2020-11-21 07:37

    Pandas 0.21+ Answer

    There have been some significant updates to column renaming in version 0.21.

    • The rename method gained an axis parameter, which may be set to 'columns' or 1. This makes rename consistent with the rest of the pandas API. The index and columns parameters still exist, but you are no longer forced to use them.
    • The set_axis method, called with inplace=False, lets you replace all the index or column labels with a list.

    Examples for Pandas 0.21+

    Construct sample DataFrame:

    import pandas as pd

    df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4],
                       '$c': [5, 6], '$d': [7, 8],
                       '$e': [9, 10]})
    
       $a  $b  $c  $d  $e
    0   1   3   5   7   9
    1   2   4   6   8  10
    

    Using rename with axis='columns' or axis=1

    df.rename({'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'}, axis='columns')
    

    or

    df.rename({'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'}, axis=1)
    

    Both result in the following:

       a  b  c  d   e
    0  1  3  5  7   9
    1  2  4  6  8  10
    

    It is still possible to use the old method signature:

    df.rename(columns={'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'})
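
As a quick check that the three spellings above are interchangeable, the sketch below rebuilds the sample frame and compares the results. The variable names (mapping, by_axis_name, by_axis_number, by_keyword) are illustrative only and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4], '$c': [5, 6],
                   '$d': [7, 8], '$e': [9, 10]})
mapping = {'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}

by_axis_name = df.rename(mapping, axis='columns')
by_axis_number = df.rename(mapping, axis=1)
by_keyword = df.rename(columns=mapping)

# All three calls produce identical labels and data.
assert by_axis_name.equals(by_axis_number)
assert by_axis_name.equals(by_keyword)

# rename returns a new DataFrame by default, so df keeps its old labels.
print(list(df.columns))          # ['$a', '$b', '$c', '$d', '$e']
print(list(by_keyword.columns))  # ['a', 'b', 'c', 'd', 'e']
```

To change df itself, assign the result back (df = df.rename(...)).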
    

    The rename method also accepts a function, which is applied to each column name.

    df.rename(lambda x: x[1:], axis='columns')
    

    or

    df.rename(lambda x: x[1:], axis=1)
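
One caveat worth illustrating: x[1:] drops the first character of every label, whether or not it is a dollar sign. The sketch below contrasts it with str.lstrip('$'), a variation that is not in the original answer. The extra 'id' column is a hypothetical addition to show the difference.

```python
import pandas as pd

# 'id' is a hypothetical column without the '$' prefix.
df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4], 'id': [5, 6]})

# Slicing removes the first character unconditionally.
blind = df.rename(lambda x: x[1:], axis='columns')

# lstrip('$') removes leading dollar signs only, so 'id' survives intact.
guarded = df.rename(lambda x: x.lstrip('$'), axis='columns')

print(list(blind.columns))    # ['a', 'b', 'd']
print(list(guarded.columns))  # ['a', 'b', 'id']
```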
    

    Using set_axis with a list and inplace=False

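    You can pass set_axis a list whose length equals the number of columns (or rows, for the index). In pandas 0.21, inplace defaults to True, so pass inplace=False to get a new DataFrame back. Later releases changed the default to False, and pandas 2.0 removed the inplace parameter altogether, so on current versions leave it out.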

    df.set_axis(['a', 'b', 'c', 'd', 'e'], axis='columns', inplace=False)
    

    or

    df.set_axis(['a', 'b', 'c', 'd', 'e'], axis=1, inplace=False)
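
For readers on a recent pandas, here is a minimal sketch of the same call without inplace. It assumes pandas 1.0 or later, where set_axis returns a new object by default. The two-column frame is a shortened version of the sample data.

```python
import pandas as pd

df = pd.DataFrame({'$a': [1, 2], '$b': [3, 4]})

# Assumption: pandas 1.0+, so no inplace argument is needed or passed.
renamed = df.set_axis(['a', 'b'], axis='columns')

print(list(renamed.columns))  # ['a', 'b']
print(list(df.columns))       # ['$a', '$b']

# A list of the wrong length raises instead of renaming only some labels.
try:
    df.set_axis(['a'], axis='columns')
except ValueError as exc:
    print('rejected:', exc)
```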
    

    Why not use df.columns = ['a', 'b', 'c', 'd', 'e']?

    There is nothing wrong with assigning columns directly like this. It is a perfectly good solution.

    The advantage of set_axis is that it can be used in a method chain and returns a new DataFrame. Without it, you would have to store the intermediate result of the chain in another variable before reassigning the columns.

    # new for pandas 0.21+ (on pandas 2.0+, omit inplace=False)
    (df.some_method1()
       .some_method2()
       .set_axis(columns, axis='columns', inplace=False)
       .some_method3())

    # old way
    df1 = (df.some_method1()
             .some_method2())
    df1.columns = columns
    df1.some_method3()
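
The placeholders some_method1, some_method2 and some_method3 above are not real pandas methods. As a runnable sketch of the same chaining idea, the example below substitutes dropna and assign. Those choices, the sample data and the name raw are assumptions for illustration, and the code assumes pandas 1.0 or later.

```python
import pandas as pd

raw = pd.DataFrame({'$a': [1, 2, None], '$b': [3, 4, 5]})

# Assumption: pandas 1.0+, where set_axis returns a new DataFrame.
result = (
    raw.dropna()                                 # drop the incomplete row
       .set_axis(['a', 'b'], axis='columns')     # rename mid-chain
       .assign(total=lambda d: d['a'] + d['b'])  # use the new names right away
)

print(result)
```

No intermediate variable is needed, and raw keeps its original labels.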
    
  • 2020-11-21 07:37

    If you have to deal with many columns named by a source system that is out of your control, the following approach combines a general rule with specific replacements in one go.

    First, build a dictionary from the DataFrame's column names, using a regular expression to strip certain suffixes from them. Then add specific replacements to the dictionary so that the core columns get the names the receiving database expects.

    This is then applied to the dataframe in one go.

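    # General rule: strip the type suffixes (avoid naming the variable dict, which shadows the builtin)
    rename_map = dict(zip(df.columns, df.columns.str.replace(r'(:S$|:C1$|:L$|:D$|\.Serial:L$)', '', regex=True)))
    # Specific replacements for the core columns
    rename_map['brand_timeseries:C1'] = 'BTS'
    rename_map['respid:L'] = 'RespID'
    rename_map['country:C1'] = 'CountryID'
    rename_map['pim1:D'] = 'pim_actual'
    df.rename(columns=rename_map, inplace=True)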
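
To make the two-step idea concrete, here is a self-contained sketch on made-up data. The column names 'age:S' and 'device.Serial:L' are hypothetical. The others are taken from the snippet above.

```python
import pandas as pd

# Hypothetical export with type suffixes appended by the source system.
df = pd.DataFrame(columns=['respid:L', 'country:C1', 'age:S',
                           'device.Serial:L', 'pim1:D'])

suffixes = r'(:S$|:C1$|:L$|:D$|\.Serial:L$)'

# General rule: map every label to itself minus a trailing suffix.
rename_map = dict(zip(df.columns,
                      df.columns.str.replace(suffixes, '', regex=True)))

# Specific overrides for columns that need exact target names.
rename_map.update({'respid:L': 'RespID',
                   'country:C1': 'CountryID',
                   'pim1:D': 'pim_actual'})

df = df.rename(columns=rename_map)
print(list(df.columns))
# ['RespID', 'CountryID', 'age', 'device', 'pim_actual']
```

Columns without an override fall back to the suffix-stripped name, so only the exceptions have to be listed by hand.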
    
  • 2020-11-21 07:37

    I needed to rename features for XGBoost, which didn't accept any of the following characters in column names:

    # Replace each run of punctuation or spaces with a single underscore
    regex = r"[!\"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~ ]+"
    X_trn.columns = X_trn.columns.str.replace(regex, '_', regex=True)
    X_tst.columns = X_tst.columns.str.replace(regex, '_', regex=True)
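
A side effect to watch for is that different original names can collapse to the same cleaned label. The sketch below applies the same regex to hypothetical feature names and checks for such clashes before assigning. The frame X and its columns are made up for illustration.

```python
import pandas as pd

regex = r"[!\"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~ ]+"

# Hypothetical feature names; the last two differ only in punctuation.
X = pd.DataFrame(columns=['price ($)', 'size[m2]', 'a<b', 'a>b'])

cleaned = X.columns.str.replace(regex, '_', regex=True)
print(list(cleaned))  # ['price_', 'size_m2_', 'a_b', 'a_b']

# Distinct originals can end up identical, so check before assigning.
clashes = cleaned[cleaned.duplicated()].unique().tolist()
print(clashes)  # ['a_b']
```

If clashes is empty, X.columns = cleaned is safe to run.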
    