Pandas/Python: Replace multiple values in multiple columns

青春壹個敷衍的年華 submitted on 2019-12-13 02:34:51

Question


All, I have an analytical csv file with 190 columns and 902 rows. I need to recode values in several columns (18 to be exact) from their current 1-5 Likert scaling to 0-4 Likert scaling.

I've tried using replace:

df.replace({'Job_Performance1': {1:0, 2:1, 3:2, 4:3, 5:4}}, inplace=True)

But that throws a ValueError: "Replacement not allowed with overlapping keys and values"

I can use map:

df['job_perf1'] = df.Job_Performance1.map({1:0, 2:1, 3:2, 4:3, 5:4})

But I know there has to be a more efficient way to accomplish this, since this use case is standard in statistical analysis and statistical software, e.g. SPSS.

I've reviewed multiple questions on Stack Overflow, but none of them quite fit my use case, e.g. "Pandas - replacing column values", "pandas replace multiple values one column", and "Python pandas: replace values multiple columns matching multiple columns from another dataframe".

Suggestions?


Answer 1:


You can simply subtract a scalar value from your column, which is in effect what you're doing here:

df['job_perf1'] = df['Job_Performance1'] - 1

Also, since you need to do this on 18 columns, I'd construct a list of the 18 column names and subtract 1 from all of them at once:

df[col_list] = df[col_list] - 1
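
As a minimal sketch of how col_list might be built and used: the sample data below is made up, and the assumption that all Likert columns share the "Job_Performance" prefix is only illustrative (the question does not say how the 18 columns are named).

```python
import pandas as pd

# Hypothetical sample data; the real file has 190 columns and 902 rows.
df = pd.DataFrame({
    "Job_Performance1": [1, 3, 5],
    "Job_Performance2": [2, 4, 5],
    "Age": [31, 45, 28],
})

# Assumption: the Likert columns can be picked out by a shared prefix.
# Otherwise, list the 18 column names by hand.
col_list = [c for c in df.columns if c.startswith("Job_Performance")]

# Shift every selected column from the 1-5 scale to the 0-4 scale at once.
df[col_list] = df[col_list] - 1
```

Columns that are not in col_list, such as Age here, are left untouched.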



Answer 2:


No need for a mapping. Since what you're effectively doing is subtracting 1 from each value, this can be done as a vector subtraction:

import numpy
df['job_perf1'] = df['Job_Performance1'] - numpy.ones(len(df['Job_Performance1']))

Or, without numpy:

df['job_perf1'] = df['Job_Performance1'] - [1] * len(df['Job_Performance1'])
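
One practical difference between these spellings is the resulting dtype. The sketch below uses a made-up five-row Series standing in for the real column and compares them with plain scalar subtraction; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real column.
s = pd.Series([1, 2, 3, 4, 5], name="Job_Performance1")

by_scalar = s - 1               # scalar is broadcast; integers stay integers
by_ones = s - np.ones(len(s))   # np.ones returns floats, so the result is float
by_list = s - [1] * len(s)      # list of ints, same length as the Series

print(by_scalar.dtype, by_ones.dtype, by_list.dtype)
```

All three hold the same values, but the numpy.ones version turns the column into floats, which may matter if the recoded scores are expected to stay integers.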


Source: https://stackoverflow.com/questions/34426321/pandas-python-replace-multiple-values-in-multiple-columns
