Question
I've got a dataframe with a 'Country' column that I use as my index, and I want to change the names of several countries. I have the old/new values in a dictionary, as in the code below.
I also tried splitting the values into a "from" list and a "to" list, and that didn't work either. The code doesn't raise an error, but the values in my dataframe haven't changed.
`import pandas as pd
import numpy as np
# skipfooter is the current name of the old skip_footer argument
energy = pd.read_excel('Energy Indicators.xls',
                       skiprows=17,
                       skipfooter=38)
energy = energy.drop(energy.columns[[0, 1]], axis=1)
energy.columns = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']
energy['Energy Supply'] = energy['Energy Supply'].apply(lambda x: x*1000000)
# This code isn't working properly
energy['Country'] = energy['Country'].replace({
    'China, Hong Kong Special Administrative Region': 'Hong Kong',
    'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
    'Republic of Korea': 'South Korea',
    'United States of America': 'United States',
    'Iran (Islamic Republic of)': 'Iran'})`
SOLVED: This was a problem with the data that I hadn't noticed.
energy['Country'] = energy['Country'].str.replace(r'\s*\(.*?\)\s*', '', regex=True).str.replace(r'\d+', '', regex=True)
This line sat below the 'problem' line, when it actually had to run first to clean up the names before the replace could work. E.g. the Excel file actually contained "United States of America20", so replace skipped right over it.
Thanks for your help!!
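The mismatch described above can be reproduced on a small Series. This is only a sketch on made-up data (the `country` values and the `missing_keys` check are illustrative, not taken from the Excel file); it shows that a dict passed to `Series.replace` only swaps values that match a key exactly, and one way to spot keys that never matched.

```python
import pandas as pd

# Toy data (hypothetical): one name carries a trailing footnote digit.
country = pd.Series(['United States of America20', 'Republic of Korea', 'Japan'])

mapping = {'United States of America': 'United States',
           'Republic of Korea': 'South Korea'}

# Series.replace with a dict only swaps values equal to a key,
# so the entry with the trailing "20" is left untouched.
replaced = country.replace(mapping)

# Keys that never occur in the column hint at dirty data.
present = set(country)
missing_keys = [key for key in mapping if key not in present]

print(replaced.tolist())
print(missing_keys)
```

A non-empty `missing_keys` list is a cheap signal that the column needs cleaning before the replace.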
Answer 1:
You need to remove the superscript digits with str.replace before calling replace:
d = {'China, Hong Kong Special Administrative Region': 'Hong Kong',
     'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
     'Republic of Korea': 'South Korea',
     'United States of America': 'United States',
     'Iran (Islamic Republic of)': 'Iran'}
# regex=True is needed in recent pandas, where str.replace treats the pattern literally by default
energy['Country'] = energy['Country'].str.replace(r'\d+', '', regex=True).replace(d)
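A minimal sketch of the same two-step cleanup on toy data, so it can be run without the Excel file. The sample values are made up for illustration. It assumes a pandas version where `str.replace` accepts the `regex` argument; in recent releases the pattern is treated as a literal string unless `regex=True` is passed.

```python
import pandas as pd

# Toy data (hypothetical): footnote digits are glued to some names.
country = pd.Series(['United States of America20',
                     'Iran (Islamic Republic of)',
                     'Japan10'])

d = {'United States of America': 'United States',
     'Iran (Islamic Republic of)': 'Iran'}

# Step 1: strip the digits with a regular expression.
cleaned = country.str.replace(r'\d+', '', regex=True)

# Step 2: exact-match replacement now finds the dictionary keys.
result = cleaned.replace(d)

print(result.tolist())
```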
You can also improve your solution: use the usecols parameter to filter columns and the names parameter to set the new column names:
names = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']
# skipfooter is the current name of the old skip_footer argument
energy = pd.read_excel('Energy Indicators.xls',
                       skiprows=17,
                       skipfooter=38,
                       usecols=range(2, 6),
                       names=names)
d = {'China, Hong Kong Special Administrative Region': 'Hong Kong',
     'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
     'Republic of Korea': 'South Korea',
     'United States of America': 'United States',
     'Iran (Islamic Republic of)': 'Iran'}
# multiplying the whole column with * is faster than apply
energy['Energy Supply'] = energy['Energy Supply'] * 1000000
# regex=True is needed in recent pandas, where str.replace treats the pattern literally by default
energy['Country'] = energy['Country'].str.replace(r'\d+', '', regex=True).replace(d)
#print (energy)
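As a quick check of the multiplication tip, the sketch below compares the `apply` version from the question with the vectorized `*` on a few made-up numbers (the `supply` values are hypothetical). Both give the same result, including for missing values; the vectorized form simply avoids calling a Python function per row.

```python
import pandas as pd

# Hypothetical supply values; None becomes NaN in a float Series.
supply = pd.Series([321.0, 102.0, None])

# Row-by-row version from the question.
via_apply = supply.apply(lambda x: x * 1000000)

# Vectorized version from the answer.
via_vector = supply * 1000000

# Series.equals treats NaN values in the same position as equal.
print(via_apply.equals(via_vector))
```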
Source: https://stackoverflow.com/questions/43310228/change-values-in-a-column-from-a-list