Is it possible to do a fuzzy match merge with Python pandas?

[愿得一人] 2020-11-22 01:17

I have two DataFrames which I want to merge based on a column. However, due to alternate spellings, different number of spaces, absence/presence of diacritical marks, I woul

11 answers
  •  伪装坚强ぢ
    2020-11-22 02:03

    As a heads-up, this basically works, except when no match is found or when either column contains NaNs. Instead of applying get_close_matches directly, I found it easier to apply the following function. The choice of NaN replacements will depend a lot on your dataset.

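    import difflib
    import numpy as np
    import pandas as pd

    def fuzzy_match(a, b):
        # a is one key value; b is the Series of candidate keys
        left = '1' if pd.isnull(a) else a  # placeholder for a missing key
        right = b.fillna('2')  # different placeholder, so NaN never matches NaN
        out = difflib.get_close_matches(left, right)
        return out[0] if out else np.nan  # NaN when nothing is close enough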
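
    For illustration, here is a rough sketch of how a function like this might be wired into an actual merge. The frames df1 and df2, the column names name, left_val and right_val, and the helper column match are all made up for the example, and the default difflib cutoff of 0.6 is assumed. One thing to watch: pandas matches NaN keys against each other in a merge, so the sketch drops right-hand rows that have no name before merging.

```python
import difflib

import numpy as np
import pandas as pd


def fuzzy_match(a, b):
    # Same idea as the function above: placeholders keep NaNs from matching.
    left = '1' if pd.isnull(a) else a
    right = b.fillna('2')
    out = difflib.get_close_matches(left, right)
    return out[0] if out else np.nan


# Hypothetical example frames; the column names are illustrative only.
df1 = pd.DataFrame({'name': ['Jon Smith', 'Maria Garcia', None, 'Zzyzx'],
                    'left_val': [1, 2, 3, 4]})
df2 = pd.DataFrame({'name': ['John Smith', 'María García', np.nan],
                    'right_val': [10, 20, 30]})

# Map each left-hand name to its closest right-hand name (NaN if none).
df1['match'] = df1['name'].apply(lambda x: fuzzy_match(x, df2['name']))

# pandas treats NaN keys as equal to each other when merging, so drop
# right-hand rows without a name to avoid pairing unmatched rows with them.
right = df2.dropna(subset=['name'])

# Left merge on the matched key; unmatched rows keep NaN in right_val.
merged = df1.merge(right, left_on='match', right_on='name',
                   how='left', suffixes=('', '_right'))
```

    Keeping the matched key in its own column makes it easy to eyeball which pairings were made before trusting the merged result.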
    
