In the data I am working with, the index is compound, i.e. it has both an item name and a timestamp, e.g. name@domain.com|2013-05-07 05:52:51 +0200.
I want to
My preference would be to read this in as an ordinary column first (i.e. not as the index) and then use the str.split method:
import pandas as pd
from io import StringIO
csv = '\n'.join(['name@domain.com|2013-05-07 05:52:51 +0200, 42'] * 3)
df = pd.read_csv(StringIO(csv), header=None)
In [13]: df[0].str.split('|')
Out[13]:
0    [name@domain.com, 2013-05-07 05:52:51 +0200]
1    [name@domain.com, 2013-05-07 05:52:51 +0200]
2    [name@domain.com, 2013-05-07 05:52:51 +0200]
Name: 0, dtype: object
And then feed this into a MultiIndex (perhaps this can be done more cleanly?):
# list() materializes the zip iterator, so this also works on Python 3
m = pd.MultiIndex.from_arrays(list(zip(*df[0].str.split('|'))))
Delete the 0th column and set the index to the new MultiIndex:
del df[0]
df.index = m
In [17]: df
Out[17]:
                                            1
name@domain.com 2013-05-07 05:52:51 +0200  42
                2013-05-07 05:52:51 +0200  42
                2013-05-07 05:52:51 +0200  42
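On the "can this be done cleaner?" point, one possible variant is sketched below. It assumes a reasonably recent pandas (one that has MultiIndex.from_frame), and the names key, value, name and timestamp are made up for illustration; they do not come from the original data. It also parses the timestamp strings into real datetimes, which may or may not be what you want.

```python
from io import StringIO

import pandas as pd

raw = '\n'.join(['name@domain.com|2013-05-07 05:52:51 +0200, 42'] * 3)

# 'key' and 'value' are hypothetical column names chosen for this sketch.
df = pd.read_csv(StringIO(raw), header=None, names=['key', 'value'])

# expand=True returns a DataFrame with one column per piece
# instead of a Series of lists.
parts = df['key'].str.split('|', expand=True)
parts.columns = ['name', 'timestamp']  # hypothetical level names

# Turn the timestamp strings into timezone-aware datetimes.
parts['timestamp'] = pd.to_datetime(parts['timestamp'])

# Drop the compound column and use the two pieces as index levels.
result = df.drop(columns='key')
result.index = pd.MultiIndex.from_frame(parts)
```

With a datetime level in the index you can later select on it by time, rather than comparing strings.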