Finding all regex matches from a pandas DataFrame column

囚心锁ツ 2021-01-06 11:01

I am trying to extract some data from a DataFrame; however, the following query only extracts the first match and ignores the rest of the matches. For example, if the entire data i

1 answer
  • 2021-01-06 11:22

    You can use the Series.str.extractall() method, which returns every match of the pattern in each row rather than only the first:

    In [57]: x
    Out[57]:
                                                        value
    0  123 blah blah blah 456 blah blah blah 129kfj blah blah
    1  237 blah blah blah 438 blah blah blah 365kfj blah blah
    
    In [58]: x['newCol'] = x['value'].str.extractall(r'(\d{3})').unstack().apply(','.join, 1)
    
    In [59]: x
    Out[59]:
                                                        value       newCol
    0  123 blah blah blah 456 blah blah blah 129kfj blah blah  123,456,129
    1  237 blah blah blah 438 blah blah blah 365kfj blah blah  237,438,365
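
To make the one-liner above easier to follow, here is a minimal standalone sketch of the same steps split apart. The DataFrame is rebuilt from the sample rows shown in the answer; the intermediate names `matches` and `wide` are only illustrative and do not appear in the original.

```python
import pandas as pd

# Sample data copied from the answer above.
x = pd.DataFrame({"value": [
    "123 blah blah blah 456 blah blah blah 129kfj blah blah",
    "237 blah blah blah 438 blah blah blah 365kfj blah blah",
]})

# extractall returns one row per match, indexed by
# (original row label, match number); the capture group becomes column 0.
matches = x["value"].str.extractall(r"(\d{3})")

# unstack pivots the match number into columns: one row per original row.
wide = matches.unstack()

# Join the matches of each row into one comma-separated string.
x["newCol"] = wide.apply(",".join, axis=1)

print(matches)
print(x["newCol"])
```

The second index level of `matches` is named `match`, which is what `unstack()` moves into the columns by default.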
    

    UPDATE:

    In [77]: x
    Out[77]:
                                                          value
    0  123 blah blah blah, 456 blah blah blah, 129kfj blah blah
    1  237 blah blah blah, 438 blah blah blah, 365kfj blah blah
    
    In [78]: x['value'].str.extractall(r'(\d{3})').unstack().apply(','.join, 1)
    Out[78]:
    0    123,456,129
    1    237,438,365
    dtype: object
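
One caveat, as far as I can tell: the `unstack()` approach assumes every row has the same number of matches. With uneven counts, `unstack()` fills the gaps with NaN and `','.join` would then fail on the non-string values. The sketch below shows two alternatives that should cope with that case; the sample strings and the names `s`, `joined` and `grouped` are made up for illustration.

```python
import pandas as pd

# Hypothetical data with a different number of matches per row.
s = pd.Series(["123 blah 456", "no digits here", "789 only"])

# Option 1: findall gives a list of matches per row, str.join concatenates it.
# Rows without any match end up as an empty string.
joined = s.str.findall(r"\d{3}").str.join(",")

# Option 2: keep extractall, but group by the original row label
# instead of unstacking. Rows without any match are dropped.
grouped = s.str.extractall(r"(\d{3})")[0].groupby(level=0).agg(",".join)

print(joined)
print(grouped)
```

Option 1 keeps the original index intact, so it can be assigned straight back as a new column; Option 2 only contains the rows that had at least one match.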
    