Renaming columns in pandas

前端未结

关注

 27  2405

野性不改

I have a DataFrame using pandas and column labels that I need to edit to replace the original column labels.

I\'d like to change the column names in a DataFrame

相关标签:

27条回答

死守一世寂寞

2020-11-21 07:23
You could use str.slice for that:
```
df.columns = df.columns.str.slice(1)
```
0 讨论(0)
发布评论:

提交评论
- 加载中...

迷失自我

2020-11-21 07:23

Another option is to rename using a regular expression:

import pandas as pd
import re

df = pd.DataFrame({'$a':[1,2], '$b':[3,4], '$c':[5,6]})

df = df.rename(columns=lambda x: re.sub('\$','',x))
>>> df
   a  b  c
0  1  3  5
1  2  4  6

0 讨论(0)

伪装坚强ぢ

2020-11-21 07:24
```
old_names = ['$a', '$b', '$c', '$d', '$e'] 
new_names = ['a', 'b', 'c', 'd', 'e']
df.rename(columns=dict(zip(old_names, new_names)), inplace=True)
```
This way you can manually edit the new_names as you wish. Works great when you need to rename only a few columns to correct mispellings, accents, remove special characters etc.
0 讨论(0)
发布评论:

提交评论
- 加载中...
暖寄归人

2020-11-21 07:26
```
df.rename(index=str,columns={'A':'a','B':'b'})
```
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rename.html
0 讨论(0)
发布评论:

提交评论
- 加载中...
隐瞒了意图╮

2020-11-21 07:28
Just assign it to the .columns attribute:
```
>>> df = pd.DataFrame({'$a':[1,2], '$b': [10,20]})
>>> df.columns = ['a', 'b']
>>> df
   a   b
0  1  10
1  2  20
```
0 讨论(0)
发布评论:

提交评论
- 加载中...

失恋的感觉

2020-11-21 07:28

If you've got the dataframe, df.columns dumps everything into a list you can manipulate and then reassign into your dataframe as the names of columns...

columns = df.columns
columns = [row.replace("$","") for row in columns]
df.rename(columns=dict(zip(columns, things)), inplace=True)
df.head() #to validate the output

Best way? IDK. A way - yes.

A better way of evaluating all the main techniques put forward in the answers to the question is below using cProfile to gage memory & execution time. @kadee, @kaitlyn, & @eumiro had the functions with the fastest execution times - though these functions are so fast we're comparing the rounding of .000 and .001 seconds for all the answers. Moral: my answer above likely isn't the 'Best' way.

import pandas as pd
import cProfile, pstats, re

old_names = ['$a', '$b', '$c', '$d', '$e']
new_names = ['a', 'b', 'c', 'd', 'e']
col_dict = {'$a': 'a', '$b': 'b','$c':'c','$d':'d','$e':'e'}

df = pd.DataFrame({'$a':[1,2], '$b': [10,20],'$c':['bleep','blorp'],'$d':[1,2],'$e':['texa$','']})

df.head()

def eumiro(df,nn):
    df.columns = nn
    #This direct renaming approach is duplicated in methodology in several other answers: 
    return df

def lexual1(df):
    return df.rename(columns=col_dict)

def lexual2(df,col_dict):
    return df.rename(columns=col_dict, inplace=True)

def Panda_Master_Hayden(df):
    return df.rename(columns=lambda x: x[1:], inplace=True)

def paulo1(df):
    return df.rename(columns=lambda x: x.replace('$', ''))

def paulo2(df):
    return df.rename(columns=lambda x: x.replace('$', ''), inplace=True)

def migloo(df,on,nn):
    return df.rename(columns=dict(zip(on, nn)), inplace=True)

def kadee(df):
    return df.columns.str.replace('$','')

def awo(df):
    columns = df.columns
    columns = [row.replace("$","") for row in columns]
    return df.rename(columns=dict(zip(columns, '')), inplace=True)

def kaitlyn(df):
    df.columns = [col.strip('$') for col in df.columns]
    return df

print 'eumiro'
cProfile.run('eumiro(df,new_names)')
print 'lexual1'
cProfile.run('lexual1(df)')
print 'lexual2'
cProfile.run('lexual2(df,col_dict)')
print 'andy hayden'
cProfile.run('Panda_Master_Hayden(df)')
print 'paulo1'
cProfile.run('paulo1(df)')
print 'paulo2'
cProfile.run('paulo2(df)')
print 'migloo'
cProfile.run('migloo(df,old_names,new_names)')
print 'kadee'
cProfile.run('kadee(df)')
print 'awo'
cProfile.run('awo(df)')
print 'kaitlyn'
cProfile.run('kaitlyn(df)')

0 讨论(0)