Remap values in pandas column with a dict

囚心锁ツ 2020-11-21 05:14

I have a dictionary which looks like this: di = {1: "A", 2: "B"}

I would like to apply it to the "col1" column of a dataframe similar to:

10 Answers
  • 2020-11-21 05:47

    Adding to this question, in case you ever have more than one column to remap in a dataframe:

    def remap(data, dict_labels):
        """
        Replace the (previously label-encoded) values in each listed column
        with their string labels. The dataframe is modified in place.

        dict_labels maps a column name to {old value: new value},
        e.g. dict_labels = {'col1': {1: 'A', 2: 'B'}}
        """
        for field, values in dict_labels.items():
            print("I am remapping %s" % field)
            # Restrict each replacement to its own column
            data.replace({field: values}, inplace=True)
        print("DONE")

        return data
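
    As a usage sketch for the multi-column case, the nested-dict form of DataFrame.replace can be tried on a small made-up dataframe. The data, the column names, and the helper name remap_columns are assumptions for illustration only; unlike the function above, this variant returns a new dataframe instead of modifying its argument.

```python
import pandas as pd


def remap_columns(data, dict_labels):
    """Return a copy of `data` with each listed column remapped by its own dict."""
    # DataFrame.replace accepts a nested dict: {column: {old_value: new_value}}
    return data.replace(dict_labels)


# Hypothetical example data (not from the original question)
df = pd.DataFrame({"col1": [1, 2, 1], "col2": [10, 20, 10]})
dict_labels = {"col1": {1: "A", 2: "B"}, "col2": {10: "x", 20: "y"}}

remapped = remap_columns(df, dict_labels)
print(remapped)
```

    Returning a copy keeps the label-encoded dataframe available, which may be handy if the encoded values are still needed afterwards.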
    

    Hope it can be useful to someone.

    Cheers

  • 2020-11-21 05:47

    A nice complete solution that keeps a map of your class labels:

    labels = features['col1'].unique()
    labels_dict = dict(zip(labels, range(len(labels))))
    features = features.replace({"col1": labels_dict})
    

    This way, labels_dict records which integer code each original class label was mapped to, so you can look the mapping up at any point.
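
    To go back from the integer codes to the original labels, the dictionary has to be inverted first. A minimal sketch, assuming a small made-up features dataframe; inverse_dict is a hypothetical name:

```python
import pandas as pd

# Hypothetical example data standing in for `features`
features = pd.DataFrame({"col1": ["cat", "dog", "cat", "bird"]})

labels = features["col1"].unique()
labels_dict = dict(zip(labels, range(len(labels))))  # label -> integer code

# Invert the mapping so codes can be turned back into labels
inverse_dict = {code: label for label, code in labels_dict.items()}  # code -> label

encoded = features["col1"].map(labels_dict)
decoded = encoded.map(inverse_dict)
print(decoded.tolist())
```

    The inversion is only safe because every label gets a distinct code here; with a hand-written mapping that sends two labels to the same code, the round trip would lose information.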

  • 2020-11-21 05:51

    Or use apply, falling back to the original value when it is not a key of the dict:

    df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))
    

    Demo:

    >>> df['col1']=df['col1'].apply(lambda x: {1: "A", 2: "B"}.get(x,x))
    >>> df
      col1 col2
    0    w    a
    1    A    2
    2    B  NaN
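
    A self-contained version of the demo, assuming col1 holds the mixed values 'w', 1 and 2 (the dataframe here is a guess at the one in the question, not taken from it). It also shows an equivalent that uses Series.map followed by fillna:

```python
import pandas as pd

# Assumed reconstruction of the dataframe: col1 holds mixed values
df = pd.DataFrame({"col1": ["w", 1, 2], "col2": ["a", 2, float("nan")]})
di = {1: "A", 2: "B"}

# dict.get(x, x) falls back to the original value when x is not a key
via_apply = df["col1"].apply(lambda x: di.get(x, x))

# Series.map alone yields NaN for unmapped values, so fill them back in
via_map = df["col1"].map(di).fillna(df["col1"])

print(via_apply.tolist())
print(via_map.tolist())
```

    Note that the keys have to match the stored values exactly: if col1 held the strings "1" and "2" rather than integers, neither variant would change anything with this dict.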
    
  • 2020-11-21 05:51

    Another approach is to apply a regex-based replace function, as below:

    import re

    def multiple_replace(mapping, text):
        # Regular expressions work on strings, so compare keys and text as strings
        lookup = {str(key): value for key, value in mapping.items()}

        # Create a regular expression from the dictionary keys
        regex = re.compile("(%s)" % "|".join(map(re.escape, lookup)))

        # For each match, look up the corresponding value in the dictionary
        return regex.sub(lambda mo: lookup[mo.group(0)], str(text))
    

    Once you have defined the function, you can apply it to your dataframe.

    di = {1: "A", 2: "B"}
    df['col1'] = df['col1'].apply(lambda value: multiple_replace(di, value))
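
    One thing to keep in mind with the regex route is that it substitutes keys wherever they occur inside a value, not only when the whole value equals a key. A small sketch of the difference, with hypothetical helper names replace_anywhere and replace_whole_value:

```python
import re

di = {1: "A", 2: "B"}

# Regular expressions need string keys, so build a string-keyed lookup
lookup = {str(key): value for key, value in di.items()}
pattern = re.compile("|".join(map(re.escape, lookup)))


def replace_anywhere(text):
    # Substitutes every occurrence of a key inside the text
    return pattern.sub(lambda mo: lookup[mo.group(0)], str(text))


def replace_whole_value(text):
    # Replaces only when the entire value equals a key
    return lookup.get(str(text), text)


print(replace_anywhere("12"))     # both digits are replaced
print(replace_whole_value("12"))  # left unchanged
```

    Which behaviour is wanted depends on the data; for remapping whole cell values, a plain dict lookup is usually the simpler choice.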
    