Matching rows between two DataFrames in pandas (Python)

栀梦 2020-12-06 23:36

I have two dataframes,

df1,

 Names
 one two three
 Sri is a good player
 Ravi is a mentor
 Kumar is a cricketer

df2,



        
2 answers
  • 2020-12-06 23:54

    Using sets

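    import numpy as np
    import pandas as pd

    # Turn each name into a set of lowercase words
    s1 = df1.Names.dropna()
    s1.loc[:] = [set(x.lower().split()) for x in s1.values.tolist()]
    a1 = s1.values

    # Turn each comma-separated keyword string into a set of lowercase keywords
    s2 = df2['values'].dropna()
    s2.loc[:] = [set(x.replace(' ', '').lower().split(',')) for x in s2.values.tolist()]
    a2 = s2.values

    # For each keyword set, find the first name whose word set is a superset of it;
    # the extra all-True column makes argmax fall back to the appended NaN when nothing matches
    i = np.column_stack([a1 >= a2[:, None], [True] * len(a2)]).argmax(1)

    df2.assign(Names=pd.Series(
        np.append(df1.Names.values, np.nan)[i], s2.index
    ))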
    
                values                 Names
    0              sri  Sri is a good player
    1              NaN                   NaN
    2          sri, is  Sri is a good player
    3  kumar,cricketer  Kumar is a cricketer
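
The set comparison in this answer can be hard to read in its vectorized form. As a rough illustration only, here is a plain-Python sketch of the same idea: a row matches when the set of words in the name is a superset of the set of keywords, and the first matching name wins. The sample lists are an assumption reconstructed from the outputs shown in the answers, and `first_match` is a hypothetical helper name, not part of pandas or of the original answer.

```python
# Sample data assumed from the outputs shown in the answers (not from the question itself).
names = [
    "one two three",
    "Sri is a good player",
    "Ravi is a mentor",
    "Kumar is a cricketer",
]
values = ["sri", None, "sri, is", "kumar,cricketer"]


def first_match(keywords, names):
    """Return the first name whose word set contains every keyword, else None."""
    if keywords is None:  # missing keyword row: nothing to match
        return None
    wanted = {k.strip().lower() for k in keywords.split(",")}
    for name in names:
        # set >= set is Python's superset test
        if set(name.lower().split()) >= wanted:
            return name
    return None


matches = [first_match(v, names) for v in values]
for value, match in zip(values, matches):
    print(value, "->", match)
```

Because whole words are compared, a keyword only matches if it appears as a complete whitespace-separated word in the name.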
    
  • 2020-12-06 23:54
    import pandas as pd
    names =  [
        'one two three',
        'Sri is a good player',
        'Ravi is a mentor',
        'Kumar is a cricketer'
    ]
    values = [
        'sri',
        'NaN',
        'sri, is',
        'kumar,cricketer',
    ]
    
    names = pd.Series(names)
    values = pd.DataFrame(values, columns=['values'])
    
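    def foo(words):
        # Start from all names and keep only those that contain every keyword
        names_copy = names.copy()
    
        for word in words.split(','):
            # Case-insensitive substring match (the keyword is treated as a regex)
            names_copy = names_copy[names_copy.str.contains(word, case=False)]
    
        return names_copy.values
    
    values['names'] = values['values'].map(foo)
    values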
    
    
                values                   names
    0              sri  [Sri is a good player]
    1              NaN                      []
    2          sri, is  [Sri is a good player]
    3  kumar,cricketer  [Kumar is a cricketer]
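
One difference between the two answers may be worth noting: `str.contains` matches substrings (and by default interprets the keyword as a regular expression), while the set-based approach matches whole words only. The sketch below imitates both behaviours with the standard library so the difference is visible. The function names are hypothetical, the sample names are assumed from the outputs above, and the `strip()` and `re.escape` calls are additions not present in the original answer.

```python
import re

# Sample data assumed from the outputs shown in the answers.
names = [
    "one two three",
    "Sri is a good player",
    "Ravi is a mentor",
    "Kumar is a cricketer",
]


def substring_matches(keywords, names):
    """Names containing every comma-separated keyword as a case-insensitive substring."""
    result = list(names)
    for word in keywords.split(","):
        # re.escape makes the keyword literal; this is an addition to the original answer
        pattern = re.escape(word.strip())
        result = [n for n in result if re.search(pattern, n, flags=re.IGNORECASE)]
    return result


def word_matches(keywords, names):
    """Names whose whitespace-separated words include every keyword."""
    wanted = {w.strip().lower() for w in keywords.split(",")}
    return [n for n in names if wanted <= set(n.lower().split())]


partial = substring_matches("men", names)  # "men" occurs inside "mentor"
whole = word_matches("men", names)         # but "men" is not a whole word
print(partial, whole)
```

Which behaviour is wanted depends on the data: substring matching is more forgiving of partial keywords, while whole-word matching avoids accidental hits inside longer words.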
    