I have a Python Pandas DataFrame like this (UCSC schema for NCBI RefSeq):
chrom exonStart exonEnds name
chr1 100,200,300 110,210,310 gen1
c
I come up with this , by usingunstack
and stack
df.set_index(['chrom','name']).apply(lambda x : x.str.split(','),1).\
stack().apply(pd.Series).stack().unstack(-2).\
reset_index().drop('level_2',1)
Out[1201]:
chrom name exonStart exonEnds
0 chr1 gen1 100 110
1 chr1 gen1 200 210
2 chr1 gen1 300 310
3 chr1 gen2 500 600
4 chr1 gen2 700 800
5 chr2 gen3 50 55
6 chr2 gen3 60 65
7 chr2 gen3 70 75
8 chr2 gen3 80 85