Renaming columns in pandas

野性不改 2020-11-21 07:05

I have a pandas DataFrame whose original column labels I need to replace with new ones.

I'd like to change the column names in a DataFrame

27 answers
  • 2020-11-21 07:23

    You could use str.slice for that:

    df.columns = df.columns.str.slice(1)
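
One thing to keep in mind with the str.slice approach above: it drops the first character of every label, whether or not that character is a "$". A minimal sketch of the difference, assuming a hypothetical frame in which one column has no prefix:

```python
import pandas as pd

# Hypothetical frame: two prefixed columns and one without a prefix.
df = pd.DataFrame({"$a": [1, 2], "$b": [3, 4], "c": [5, 6]})

# str.slice(1) removes the first character of every label, prefixed or not.
sliced = df.columns.str.slice(1)

# Dropping the "$" only where it is present leaves "c" untouched.
guarded = [col[1:] if col.startswith("$") else col for col in df.columns]

print(list(sliced))
print(guarded)
```

If every label is known to start with the same one-character prefix, the one-liner is enough; the guarded version is only needed for mixed labels.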
    
  • 2020-11-21 07:23

    Another option is to rename using a regular expression:

    import pandas as pd
    import re
    
    df = pd.DataFrame({'$a':[1,2], '$b':[3,4], '$c':[5,6]})
    
    # A raw string keeps '\$' from being read as an invalid escape sequence.
    df = df.rename(columns=lambda x: re.sub(r'\$', '', x))
    >>> df
       a  b  c
    0  1  3  5
    1  2  4  6
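
The pattern in the answer above removes every "$" in a label. If only a leading "$" should go, the pattern can be anchored. This is a sketch under that assumption; the column name "b$" is made up to show the difference:

```python
import re

import pandas as pd

# Hypothetical frame: one label starts with "$", the other ends with it.
df = pd.DataFrame({"$a": [1, 2], "b$": [3, 4]})

# "^" anchors the match to the start of the label, so a trailing "$" survives.
leading_dollar = re.compile(r"^\$")
renamed = df.rename(columns=lambda name: leading_dollar.sub("", name))

print(list(renamed.columns))
```

Compiling the pattern once is optional here; it mainly keeps the lambda short.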
    
  • 2020-11-21 07:24
    old_names = ['$a', '$b', '$c', '$d', '$e'] 
    new_names = ['a', 'b', 'c', 'd', 'e']
    df.rename(columns=dict(zip(old_names, new_names)), inplace=True)
    

    This way you can manually edit new_names as you wish. It works well when you need to rename only a few columns, for example to correct misspellings, fix accents, or remove special characters.
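
When only a few columns are renamed this way, a typo in old_names is easy to miss, because rename ignores keys that match no column. The sketch below assumes a pandas version that supports the errors parameter of rename (0.25 or later); the label "$z" is a deliberate, made-up typo:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1], "$b": [2], "$c": [3]})

# Rename only two of the three columns; "$c" is left as it is.
mapping = dict(zip(["$a", "$b"], ["a", "b"]))
partial = df.rename(columns=mapping)

# A key that matches no column is silently ignored by default.
ignored = df.rename(columns={"$z": "z"})

# errors="raise" turns the same typo into a KeyError.
try:
    df.rename(columns={"$z": "z"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True

print(list(partial.columns), list(ignored.columns), typo_detected)
```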

  • 2020-11-21 07:26
    # rename returns a new DataFrame, so assign the result back.
    df = df.rename(index=str, columns={'A': 'a', 'B': 'b'})
    

    https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rename.html
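
A short sketch of what the call in the answer above does, using a made-up two-column frame: rename leaves the original frame alone and returns a renamed copy, and index=str converts the row labels to strings.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Without inplace=True the original frame is not modified.
renamed = df.rename(index=str, columns={"A": "a", "B": "b"})

print(list(df.columns), list(df.index))
print(list(renamed.columns), list(renamed.index))
```

If the row labels should stay as they are, the index=str argument can simply be left out.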

  • 2020-11-21 07:28

    Just assign it to the .columns attribute:

    >>> df = pd.DataFrame({'$a':[1,2], '$b': [10,20]})
    >>> df.columns = ['a', 'b']
    >>> df
       a   b
    0  1  10
    1  2  20
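
Direct assignment as shown above replaces all labels at once, so the new list has to contain exactly one name per column. A minimal sketch of what happens otherwise; the too-short list is there only to trigger the error:

```python
import pandas as pd

df = pd.DataFrame({"$a": [1, 2], "$b": [10, 20]})

# One label for two columns: pandas rejects the assignment with a ValueError.
try:
    df.columns = ["a"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# A list with one name per column, in column order, is accepted.
df.columns = ["a", "b"]

print(type(mismatch_error).__name__, list(df.columns))
```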
    
  • 2020-11-21 07:28

    If you've got the DataFrame, df.columns gives you the column labels (a pandas Index), which you can turn into a list, manipulate, and then reassign to the DataFrame as the column names:

    old_columns = df.columns
    # Build the cleaned names, then map each old name to its new name.
    new_columns = [col.replace("$", "") for col in old_columns]
    df.rename(columns=dict(zip(old_columns, new_columns)), inplace=True)
    df.head()  # to validate the output
    

    Best way? IDK. A way - yes.

    A better way to evaluate the main techniques put forward in the answers to this question is below, using cProfile to gauge execution time (cProfile reports call counts and timings, not memory use). @kadee, @kaitlyn, and @eumiro had the functions with the fastest execution times, though these functions are so fast that we're comparing the rounding of .000 and .001 seconds for all the answers. Moral: my answer above likely isn't the 'best' way.

    import pandas as pd
    import cProfile, pstats, re
    
    old_names = ['$a', '$b', '$c', '$d', '$e']
    new_names = ['a', 'b', 'c', 'd', 'e']
    col_dict = {'$a': 'a', '$b': 'b','$c':'c','$d':'d','$e':'e'}
    
    df = pd.DataFrame({'$a':[1,2], '$b': [10,20],'$c':['bleep','blorp'],'$d':[1,2],'$e':['texa$','']})
    
    df.head()
    
    def eumiro(df,nn):
        df.columns = nn
        #This direct renaming approach is duplicated in methodology in several other answers: 
        return df
    
    def lexual1(df):
        return df.rename(columns=col_dict)
    
    def lexual2(df,col_dict):
        return df.rename(columns=col_dict, inplace=True)
    
    def Panda_Master_Hayden(df):
        return df.rename(columns=lambda x: x[1:], inplace=True)
    
    def paulo1(df):
        return df.rename(columns=lambda x: x.replace('$', ''))
    
    def paulo2(df):
        return df.rename(columns=lambda x: x.replace('$', ''), inplace=True)
    
    def migloo(df,on,nn):
        return df.rename(columns=dict(zip(on, nn)), inplace=True)
    
    def kadee(df):
        # regex=False treats '$' as a literal character, not as the end-of-string anchor.
        return df.columns.str.replace('$', '', regex=False)
    
    def awo(df):
        old_columns = df.columns
        new_columns = [col.replace("$", "") for col in old_columns]
        return df.rename(columns=dict(zip(old_columns, new_columns)), inplace=True)
    
    def kaitlyn(df):
        df.columns = [col.strip('$') for col in df.columns]
        return df
    
    # print() calls so the script runs on Python 3.
    print('eumiro')
    cProfile.run('eumiro(df,new_names)')
    print('lexual1')
    cProfile.run('lexual1(df)')
    print('lexual2')
    cProfile.run('lexual2(df,col_dict)')
    print('andy hayden')
    cProfile.run('Panda_Master_Hayden(df)')
    print('paulo1')
    cProfile.run('paulo1(df)')
    print('paulo2')
    cProfile.run('paulo2(df)')
    print('migloo')
    cProfile.run('migloo(df,old_names,new_names)')
    print('kadee')
    cProfile.run('kadee(df)')
    print('awo')
    cProfile.run('awo(df)')
    print('kaitlyn')
    cProfile.run('kaitlyn(df)')
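
Since several of the functions above modify df in place, the later ones are profiled on a frame that has already been renamed, and a single call is too short to measure reliably. One possible alternative, sketched here with timeit rather than cProfile, gives each call a fresh copy and repeats it many times. The function names and the repeat count of 200 are illustrative choices, not part of the original benchmark:

```python
import timeit

import pandas as pd

base = pd.DataFrame({"$a": [1, 2], "$b": [10, 20], "$c": ["bleep", "blorp"]})
new_names = ["a", "b", "c"]


def assign_columns(df):
    # Direct assignment to the .columns attribute.
    df.columns = new_names
    return df


def rename_with_dict(df):
    # rename with an explicit old-to-new mapping.
    return df.rename(columns=dict(zip(df.columns, new_names)))


def strip_with_comprehension(df):
    # List comprehension that removes "$" from every label.
    df.columns = [col.replace("$", "") for col in df.columns]
    return df


timings = {}
for func in (assign_columns, rename_with_dict, strip_with_comprehension):
    # base.copy() runs inside the timed call, so every run starts from the
    # original labels; the copy cost is included equally for each function.
    timings[func.__name__] = timeit.timeit(lambda: func(base.copy()), number=200)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 200 runs")
```

On a frame this small the differences are likely to be dominated by fixed overhead, which fits the observation above that all the approaches are very fast.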
    