Change values in a column from a list

Submitted by 柔情痞子 on 2021-01-30 09:08:50

Question


I've got a dataframe with a 'Country' column, and I want to rename several countries. I have the old/new values in a dictionary, as in the code below.

I also tried splitting the values into a "from" list and a "to" list, and that didn't work either. The code doesn't raise an error, but the values in my dataframe haven't changed.

`import pandas as pd
import numpy as np

energy = (pd.read_excel('Energy Indicators.xls', 
                        skiprows=17, 
                        skipfooter=38))  # named skip_footer in older pandas releases

energy = (energy.drop(energy.columns[[0, 1]], axis=1))
energy.columns = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']          
energy['Energy Supply'] = energy['Energy Supply'].apply(lambda x: x*1000000)

# This code isn't working properly
energy['Country'] = energy['Country'].replace({
    'China, Hong Kong Special Administrative Region': 'Hong Kong',
    'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
    'Republic of Korea': 'South Korea',
    'United States of America': 'United States',
    'Iran (Islamic Republic of)': 'Iran'})`

SOLVED: This was a problem with the data that I hadn't noticed.

energy['Country'] = (energy['Country'].str.replace(r'\s*\(.*?\)\s*', '', regex=True).str.replace(r'\d+', '', regex=True))

This line sat below the 'problem' line, when it actually needed to run first to clean up the names before the replace could work. For example, the Excel file actually contained United States of America20, so replace skipped right over it.

Thanks for your help!!


Answer 1:


You need to remove the superscript digits (footnote markers) with str.replace before calling replace:

d = {'China, Hong Kong Special Administrative Region':'Hong Kong', 
     'United Kingdom of Great Britain and Northern Ireland':'United Kingdom', 
     'Republic of Korea':'South Korea', 'United States of America':'United States', 
     'Iran (Islamic Republic of)':'Iran'}

energy['Country'] = energy['Country'].str.replace(r'\d+', '', regex=True).replace(d)
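
To see why the order matters, here is a minimal sketch on a small made-up Series (the sample values are assumptions standing in for the spreadsheet contents, not the real file). With a dict, Series.replace matches whole cell values, so a name with trailing footnote digits is left alone until the digits are stripped:

```python
import pandas as pd

# Hypothetical sample: one name carries a footnote number, as in the Excel file
country = pd.Series(['United States of America20',
                     'Republic of Korea',
                     'Iran (Islamic Republic of)'])

d = {'United States of America': 'United States',
     'Republic of Korea': 'South Korea',
     'Iran (Islamic Republic of)': 'Iran'}

# replace() with a dict compares entire values, so the first entry does not match
before = country.replace(d)

# Strip the digits first, then map the cleaned names
after = country.str.replace(r'\d+', '', regex=True).replace(d)

print(before.tolist())
print(after.tolist())
```

Passing regex=True explicitly keeps the digit pattern working on pandas versions where str.replace treats the pattern as a literal string by default.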

You can also improve your solution: use the usecols parameter to select columns and the names parameter to set the new column names:

names = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']

energy = pd.read_excel('Energy Indicators.xls', 
                        skiprows=17, 
                        skipfooter=38,  # named skip_footer in older pandas releases
                        usecols=range(2,6), 
                        names=names)


d = {'China, Hong Kong Special Administrative Region':'Hong Kong', 
     'United Kingdom of Great Britain and Northern Ireland':'United Kingdom', 
     'Republic of Korea':'South Korea', 'United States of America':'United States', 
     'Iran (Islamic Republic of)':'Iran'}

# for multiplication, the vectorized * operator is faster than apply
energy['Energy Supply'] = energy['Energy Supply'] * 1000000
energy['Country'] = energy['Country'].str.replace(r'\d+', '', regex=True).replace(d)
#print (energy)
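
When a replace silently does nothing, one way to find the cause is to check which dictionary keys never appear in the column. The sketch below assumes a small made-up frame in place of the real data; the names missing and still_missing are illustrative only:

```python
import pandas as pd

# Hypothetical stand-in for the 'Country' column read from the spreadsheet
energy = pd.DataFrame({'Country': ['United States of America20',
                                   'Republic of Korea',
                                   'Japan']})

d = {'United States of America': 'United States',
     'Republic of Korea': 'South Korea'}

# Keys with no exact match in the raw column will never be replaced
missing = sorted(set(d) - set(energy['Country']))

# Repeat the check after stripping digits to confirm the cleanup is sufficient
cleaned = energy['Country'].str.replace(r'\d+', '', regex=True)
still_missing = sorted(set(d) - set(cleaned))

print(missing)
print(still_missing)
```

Printing the raw values with repr() can also expose trailing digits or whitespace that are easy to miss in a normal dataframe display.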


Source: https://stackoverflow.com/questions/43310228/change-values-in-a-column-from-a-list
