Remove last two characters from column names of all the columns in Dataframe - Pandas

I am joining the two dataframes (a,b) with identical columns / column names using the user ID key and while joining, I had to give suffix characters, in order for it to get

2 Answers

攒了一身酷

2021-01-19 02:17
This snippet should get the job done:
```
df.columns = [str(col)[:-2] for col in df.columns]
```
Edit: this is a better way to do it. Note that rename returns a new DataFrame, so assign the result back:
```
df = df.rename(columns=lambda x: str(x)[:-2])
```
In both cases, all we're doing is iterating through the column names and applying a function to each one. Here, the function converts the name to a string and drops its last two characters.

I'm sure there are a few other ways you could do this.
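
One of those other ways, as a minimal sketch: slicing through the `.str` accessor on the column index. The frame and its column names below are hypothetical stand-ins for the result of a suffixed join, not the asker's actual data.

```python
import pandas as pd

# Hypothetical frame standing in for a join result with two-character suffixes.
df = pd.DataFrame({"id_x": [1, 2], "name_x": ["a", "b"], "age_y": [30, 40]})

# rename returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns=lambda c: str(c)[:-2])

# Vectorised equivalent: slice every column name through the .str accessor.
sliced = df.columns.str[:-2]

print(list(renamed.columns))
print(list(sliced))
```

Either form drops the last two characters unconditionally, so it only makes sense when every column really carries a two-character suffix.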

野性不改

2021-01-19 02:17

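You could use str.rstrip like so (the session assumes numpy is imported as np and pandas as pd). Be aware that rstrip removes any trailing characters drawn from the given set rather than a literal suffix, so it is only safe when the base names do not themselves end in "_" or "1".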

In [214]: import functools as ft

In [215]: f = ft.partial(np.random.choice, *[5, 3])

In [225]: df = pd.DataFrame({'a': f(), 'b': f(), 'c': f(), 'a_1': f(), 'b_1': f(), 'c_1': f()})

In [226]: df
Out[226]:
   a  b  c  a_1  b_1  c_1
0  4  2  0    2    3    2
1  0  0  3    2    1    1
2  4  0  4    4    4    3

In [227]: df.columns = df.columns.str.rstrip('_1')

In [228]: df
Out[228]:
   a  b  c  a  b  c
0  4  2  0  2  3  2
1  0  0  3  2  1  1
2  4  0  4  4  4  3
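
To illustrate the caveat with rstrip, here is a small sketch comparing it with an anchored regex replacement. The column names are hypothetical and chosen to expose the difference; `str.replace` with `regex=True` is swapped in here as an alternative, not something the answer itself used.

```python
import pandas as pd

# Hypothetical column names whose base part ends in characters from the set "_1".
cols = pd.Index(["a_1", "total_11", "x1_1"])

# rstrip treats "_1" as a set of characters and strips all trailing matches.
stripped = cols.str.rstrip("_1")

# An anchored regex removes only a literal "_1" at the very end of the name.
anchored = cols.str.replace(r"_1$", "", regex=True)

print(list(stripped))
print(list(anchored))
```

With rstrip, "total_11" and "x1_1" lose more than the suffix, while the anchored pattern leaves "total_11" alone and turns "x1_1" into "x1".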

However, if you need something more flexible (albeit probably a bit slower), you can use str.extract, which, with the power of regexes, lets you select which part of the column name you would like to keep.

In [216]: df = pd.DataFrame({f'{c}_{i}': f() for i in range(3) for c in 'abc'})

In [217]: df
Out[217]:
   a_0  b_0  c_0  a_1  b_1  c_1  a_2  b_2  c_2
0    0    1    0    2    2    4    0    0    3
1    0    0    3    1    4    2    4    3    2
2    2    0    1    0    0    2    2    2    1

In [223]: df.columns = df.columns.str.extract(r'(.*)_\d+')[0]

In [224]: df
Out[224]:
0  a  b  c  a  b  c  a  b  c
0  1  1  0  0  0  2  1  1  2
1  1  0  1  0  1  2  0  4  1
2  1  3  1  3  4  2  0  1  1
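
One thing to watch with str.extract is that names which do not match the pattern come back as NaN. The sketch below shows this on hypothetical column names and contrasts it with a regex `str.replace`, which leaves non-matching names untouched; the `expand=False` argument is used so that an Index is returned directly instead of a DataFrame.

```python
import pandas as pd

# Hypothetical names: two carry a numeric suffix, one does not.
cols = pd.Index(["a_0", "b_12", "user_id"])

# Keep whatever precedes a trailing "_<digits>"; non-matching names become NaN.
extracted = cols.str.extract(r"(.*)_\d+$", expand=False)

# Remove a trailing "_<digits>" if present; other names pass through unchanged.
replaced = cols.str.replace(r"_\d+$", "", regex=True)

print(list(extracted))
print(list(replaced))
```

If every column is known to have the suffix, both give the same result; the replace form is the more forgiving choice when some columns, such as a join key, were left unsuffixed.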

The idea to use df.columns.str came from this answer