Renaming columns in pandas

野性不改 2020-11-21 07:05

I have a pandas DataFrame whose column labels I need to replace with new ones.

I'd like to change the column names in a DataFrame.

27 answers

闹比i (original poster)

2020-11-21 07:33
Column names vs Names of Series

I would like to explain a bit what happens behind the scenes.

A DataFrame is a collection of Series.

A Series, in turn, wraps a numpy array and adds an index and a .name attribute (plain numpy arrays have no .name).

This is the name of the Series. Pandas seldom relies on this attribute, but it lingers in places and can be used to hack some pandas behaviors.
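To make this concrete, here is a minimal sketch of where the .name attribute lives. The sample values and variable names are made up for illustration; as far as I can tell, the name sits on the Series and not on the underlying numpy array.

```python
import numpy as np
import pandas as pd

# Illustrative data; the labels and values are assumptions, not from the question.
df = pd.DataFrame({'one': [4, 5, 6], 'two': [1, 2, 3]})

col = df['one']                            # selecting a column gives a Series
col_name = col.name                        # the Series carries the column label as its name
values = col.to_numpy()                    # the underlying numpy array
array_has_name = hasattr(values, 'name')   # plain numpy arrays have no .name

print(type(col).__name__, col_name, array_has_name)
```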

Naming the list of columns

A lot of answers here talk about the df.columns attribute being a list when in fact it is an Index. This means it has a .name attribute.

This is what happens if you decide to fill in the name of the columns Index:
```
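import pandas as pd

# Sample data chosen to match the output shown below
df = pd.DataFrame([[4, 1], [5, 2], [6, 3]])
df.columns = ['column_one', 'column_two']
df.columns.names = ['name of the list of columns']
df.index.names = ['name of the index']
print(df)

# name of the list of columns     column_one  column_two
# name of the index
# 0                                    4           1
# 1                                    5           2
# 2                                    6           3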
```
Note that the name of the index always comes one row lower.
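If you would rather not assign to the attributes directly, one alternative I would reach for is DataFrame.rename_axis, which returns a new frame with the axis names changed and the labels untouched. This is only a sketch: the frame construction and the short names 'rows' and 'cols' are my own choices for illustration.

```python
import pandas as pd

# Same values as in the output above; the construction itself is an assumption.
df = pd.DataFrame([[4, 1], [5, 2], [6, 3]],
                  columns=['column_one', 'column_two'])

columns_type = type(df.columns)   # an Index, not a list

# rename_axis names the axes themselves; the column labels stay as they are
named = df.rename_axis(index='rows', columns='cols')

print(columns_type.__name__, named.columns.name, named.index.name)
print(list(named.columns))
```

Note the difference from df.rename(columns=...), which changes the labels rather than the name of the axis.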

Artifacts that linger

The .name attribute sometimes lingers. If you set df.columns = ['one', 'two'], then df.one.name will be 'one'.

If you then set df.one.name = 'three', df.columns will still give you ['one', 'two'], while df.one.name will give you 'three'.

BUT

pd.DataFrame(df.one) will return
```
    three
0       1
1       2
2       3
```
This is because pandas reuses the .name of the already defined Series.
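The reuse of the Series name is easy to check in isolation. Here is a small sketch with a standalone Series, where the values and the names 'three' and 'four' are just for illustration:

```python
import pandas as pd

s = pd.Series([1, 2, 3], name='three')   # a Series that already has a name

frame = pd.DataFrame(s)                  # the Series name becomes the column label
same = s.to_frame()                      # to_frame() picks up the name the same way
other = s.rename('four').to_frame()      # Series.rename with a scalar changes the name

print(list(frame.columns), list(same.columns), list(other.columns))
```

So if a stray name bothers you, renaming the Series before building the frame is one way to control the resulting column label.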

Multi level column names

Pandas supports multi-level column names. There is not much magic involved, but I wanted to cover this in my answer too, since I don't see anyone else picking up on it here.
```
    |one            |
    |one      |two  |
0   |  4      |  1  |
1   |  5      |  2  |
2   |  6      |  3  |
```
This is easily achievable by setting columns to lists, like this:
```
df.columns = [['one', 'one'], ['one', 'two']]
```
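For a slightly more explicit version, here is a sketch that builds the two-level header with pd.MultiIndex.from_arrays and then renames a label on one level only. The sample values and the replacement label 'second' are assumptions for illustration.

```python
import pandas as pd

# Illustrative data; the values are assumptions matching the table above.
df = pd.DataFrame([[4, 1], [5, 2], [6, 3]])

# Build the two-level header explicitly from one list per level
df.columns = pd.MultiIndex.from_arrays([['one', 'one'], ['one', 'two']])

labels = list(df.columns)   # each column label is now a tuple
inner = df['one']           # selecting the outer level gives a sub-frame

# rename with level= targets a single level of the header
renamed = df.rename(columns={'two': 'second'}, level=1)

print(labels)
print(list(inner.columns))
print(list(renamed.columns))
```

Passing level=1 keeps the outer 'one' intact, which matters here because the same label appears on both levels.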