I have a DataFrame using pandas and column labels that I need to edit to replace the original column labels.
I\'d like to change the column names in a DataFrame
Since you only want to remove the $ sign in all column names, you could just do:
df = df.rename(columns=lambda x: x.replace('$', ''))
OR
df.rename(columns=lambda x: x.replace('$', ''), inplace=True)
Assuming you can use regular expression. This solution removes the need of manual encoding using regex
import pandas as pd
import re
srch=re.compile(r"\w+")
data=pd.read_csv("CSV_FILE.csv")
cols=data.columns
new_cols=list(map(lambda v:v.group(),(list(map(srch.search,cols)))))
data.columns=new_cols
Use the df.rename() function and refer the columns to be renamed. Not all the columns have to be renamed:
df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
# Or rename the existing DataFrame (rather than creating a copy)
df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'}, inplace=True)
Minimal Code Example
df = pd.DataFrame('x', index=range(3), columns=list('abcde'))
df
a b c d e
0 x x x x x
1 x x x x x
2 x x x x x
The following methods all work and produce the same output:
df2 = df.rename({'a': 'X', 'b': 'Y'}, axis=1) # new method
df2 = df.rename({'a': 'X', 'b': 'Y'}, axis='columns')
df2 = df.rename(columns={'a': 'X', 'b': 'Y'}) # old method
df2
X Y c d e
0 x x x x x
1 x x x x x
2 x x x x x
Remember to assign the result back, as the modification is not-inplace. Alternatively, specify inplace=True
:
df.rename({'a': 'X', 'b': 'Y'}, axis=1, inplace=True)
df
X Y c d e
0 x x x x x
1 x x x x x
2 x x x x x
From v0.25, you can also specify errors='raise'
to raise errors if an invalid column-to-rename is specified. See v0.25 rename() docs.
Use df.set_axis() with axis=1
and inplace=False
(to return a copy).
df2 = df.set_axis(['V', 'W', 'X', 'Y', 'Z'], axis=1, inplace=False)
df2
V W X Y Z
0 x x x x x
1 x x x x x
2 x x x x x
This returns a copy, but you can modify the DataFrame in-place by setting inplace=True
(this is the default behaviour for versions <=0.24 but is likely to change in the future).
You can also assign headers directly:
df.columns = ['V', 'W', 'X', 'Y', 'Z']
df
V W X Y Z
0 x x x x x
1 x x x x x
2 x x x x x
Let's say this is your dataframe.
You can rename the columns using two methods.
Using dataframe.columns=[#list]
df.columns=['a','b','c','d','e']
The limitation of this method is that if one column has to be changed, full column list has to be passed. Also, this method is not applicable on index labels. For example, if you passed this:
df.columns = ['a','b','c','d']
This will throw an error. Length mismatch: Expected axis has 5 elements, new values have 4 elements.
Another method is the Pandas rename()
method which is used to rename any index, column or row
df = df.rename(columns={'$a':'a'})
Similarly, you can change any rows or columns.
The rename method can take a function, for example:
In [11]: df.columns
Out[11]: Index([u'$a', u'$b', u'$c', u'$d', u'$e'], dtype=object)
In [12]: df.rename(columns=lambda x: x[1:], inplace=True)
In [13]: df.columns
Out[13]: Index([u'a', u'b', u'c', u'd', u'e'], dtype=object)
As documented in Working with text data:
df.columns = df.columns.str.replace('$','')