I have a pandas dataframe
in which one column of text strings contains comma-separated values. I want to split each CSV field and create a new row per entry (as
One-liner using split(___, expand=True) and the level and name arguments to reset_index():
>>> b = a.var1.str.split(',', expand=True).set_index(a.var2).stack().reset_index(level=0, name='var1')
>>> b
   var2 var1
0     1    a
1     1    b
2     1    c
0     2    d
1     2    e
2     2    f
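To make the chain easier to follow, here is a rough step-by-step sketch of the same one-liner. The sample frame `a` is an assumption (two rows, `'a,b,c'` and `'d,e,f'`), chosen to be consistent with the output shown above; the intermediate names `wide` and `stacked` are only for illustration.

```python
import pandas as pd

# Assumed sample data, chosen to match the output shown above.
a = pd.DataFrame({"var1": ["a,b,c", "d,e,f"], "var2": [1, 2]})

# 1. Split each string into one column per value (columns 0, 1, 2).
wide = a["var1"].str.split(",", expand=True)

# 2. Use the var2 values as the row index so they survive the reshape.
wide = wide.set_index(a["var2"])

# 3. Stack the columns into a Series indexed by (var2, column position).
stacked = wide.stack()

# 4. Move only the var2 level back into a column (level=0) and
#    name the values column var1 (name='var1').
b = stacked.reset_index(level=0, name="var1")
print(b)
```

The leftover index (0, 1, 2, 0, 1, 2) is the position of each value within its original string, which is why it repeats for every source row.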
If you need b to look exactly like in the question, you can additionally do:
>>> b = b.reset_index(drop=True)[['var1', 'var2']]
>>> b
  var1  var2
0    a     1
1    b     1
2    c     1
3    d     2
4    e     2
5    f     2
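If the strings do not all contain the same number of values, split(..., expand=True) pads the shorter rows with missing values. The following is a minimal sketch of a reusable wrapper for that case; `split_rows` is a hypothetical helper name, not a pandas function, and the sample frame is assumed. It calls dropna() explicitly rather than relying on whatever stack() does with missing values by default.

```python
import pandas as pd

def split_rows(df, text_col, key_col, sep=","):
    """Hypothetical helper: return one row per separated value in text_col."""
    stacked = (
        df[text_col]
        .str.split(sep, expand=True)  # one column per value, short rows padded
        .set_index(df[key_col])       # carry the key along as the index
        .stack()                      # one entry per (key, position)
        .dropna()                     # discard the padding explicitly
    )
    out = stacked.reset_index(level=0, name=text_col)
    # Fresh 0..n-1 index and the column order used in the question.
    return out.reset_index(drop=True)[[text_col, key_col]]

# Assumed sample data with an uneven number of values per row.
a = pd.DataFrame({"var1": ["a,b,c", "d"], "var2": [1, 2]})
result = split_rows(a, "var1", "var2")
print(result)
```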