I have a dictionary which looks like this: di = {1: \"A\", 2: \"B\"}
I would like to apply it to the \"col1\" column of a dataframe similar to:
Adding to this question if you ever have more than one columns to remap in a data dataframe:
def remap(data,dict_labels):
"""
This function take in a dictionnary of labels : dict_labels
and replace the values (previously labelencode) into the string.
ex: dict_labels = {{'col1':{1:'A',2:'B'}}
"""
for field,values in dict_labels.items():
print("I am remapping %s"%field)
data.replace({field:values},inplace=True)
print("DONE")
return data
Hope it can be useful to someone.
Cheers
A nice complete solution that keeps a map of your class labels:
labels = features['col1'].unique()
labels_dict = dict(zip(labels, range(len(labels))))
features = features.replace({"col1": labels_dict})
This way, you can at any point refer to the original class label from labels_dict.
Or do apply
:
df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))
Demo:
>>> df['col1']=df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))
>>> df
col1 col2
0 w a
1 1 2
2 2 NaN
>>>
A more native pandas approach is to apply a replace function as below:
def multiple_replace(dict, text):
# Create a regular expression from the dictionary keys
regex = re.compile("(%s)" % "|".join(map(re.escape, dict.keys())))
# For each match, look-up corresponding value in dictionary
return regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], text)
Once you defined the function, you can apply it to your dataframe.
di = {1: "A", 2: "B"}
df['col1'] = df.apply(lambda row: multiple_replace(di, row['col1']), axis=1)