pandas: How do I split text in a column into multiple rows?

说谎 2020-11-22 09:47

I'm working with a large csv file and the next to last column has a string of text that I want to split by a specific delimiter. I was wondering if there is a simple way to

7 Answers
  • 2020-11-22 10:29

    This splits the Seatblocks column on spaces and gives each piece its own row.

    In [43]: df
    Out[43]: 
       CustNum     CustomerName  ItemQty Item                 Seatblocks  ItemExt
    0    32363  McCartney, Paul        3  F04               2:218:10:4,6       60
    1    31316     Lennon, John       25  F01  1:13:36:1,12 1:13:37:1,13      300
    
    In [44]: s = df['Seatblocks'].str.split(' ').apply(Series).stack()  # assumes: from pandas import Series
    
    In [45]: s.index = s.index.droplevel(-1) # to line up with df's index
    
    In [46]: s.name = 'Seatblocks' # needs a name to join
    
    In [47]: s
    Out[47]: 
    0    2:218:10:4,6
    1    1:13:36:1,12
    1    1:13:37:1,13
    Name: Seatblocks, dtype: object
    
    In [48]: del df['Seatblocks']
    
    In [49]: df.join(s)
    Out[49]: 
       CustNum     CustomerName  ItemQty Item  ItemExt    Seatblocks
    0    32363  McCartney, Paul        3  F04       60  2:218:10:4,6
    1    31316     Lennon, John       25  F01      300  1:13:36:1,12
    1    31316     Lennon, John       25  F01      300  1:13:37:1,13
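
On newer pandas (0.25 or later, which added `DataFrame.explode`), the same row split can probably be written more compactly. This is only a sketch, not the method used above; the frame is rebuilt by hand from the sample output, and the name `exploded` is just for illustration:

```python
import pandas as pd

# Sample data copied from the frame printed above.
df = pd.DataFrame({
    "CustNum": [32363, 31316],
    "CustomerName": ["McCartney, Paul", "Lennon, John"],
    "ItemQty": [3, 25],
    "Item": ["F04", "F01"],
    "Seatblocks": ["2:218:10:4,6", "1:13:36:1,12 1:13:37:1,13"],
    "ItemExt": [60, 300],
})

# Turn each string into a list of space-separated pieces,
# then give every list element its own row.
exploded = df.assign(Seatblocks=df["Seatblocks"].str.split(" ")).explode("Seatblocks")
print(exploded)
```

Unlike the join above, this keeps Seatblocks in its original column position. The index labels repeat for rows that came from the same original row, just as they do in the joined result.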
    

    Or, to give each colon-separated string its own column:

    In [50]: df.join(s.apply(lambda x: Series(x.split(':'))))
    Out[50]: 
       CustNum     CustomerName  ItemQty Item  ItemExt  0    1   2     3
    0    32363  McCartney, Paul        3  F04       60  2  218  10   4,6
    1    31316     Lennon, John       25  F01      300  1   13  36  1,12
    1    31316     Lennon, John       25  F01      300  1   13  37  1,13
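
The per-row lambda can likely be replaced by `str.split` with `expand=True`, which returns the pieces as a DataFrame directly. A minimal sketch, assuming a trimmed-down `df` and the Series `s` rebuilt by hand from the output above (the names `parts` and `result` are illustrative):

```python
import pandas as pd

# Trimmed-down copy of the frame, with Seatblocks already removed.
df = pd.DataFrame({
    "CustNum": [32363, 31316],
    "CustomerName": ["McCartney, Paul", "Lennon, John"],
})

# The already-split seat blocks, indexed to line up with df's rows.
s = pd.Series(
    ["2:218:10:4,6", "1:13:36:1,12", "1:13:37:1,13"],
    index=[0, 1, 1],
    name="Seatblocks",
)

# expand=True returns one column per colon-separated piece (labelled 0..3).
parts = s.str.split(":", expand=True)

# Joining on the index repeats the customer row once per seat block.
result = df.join(parts)
print(result)
```

The split pieces stay as strings, so a numeric cast would still be needed before doing arithmetic on them.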
    

    This is a little ugly, but maybe someone will chime in with a prettier solution.
