Converting Index into MultiIndex (hierarchical index) in Pandas

广开言路 2021-02-04 08:49

In the data I am working with, the index is compound, i.e. each label holds both an item name and a timestamp, e.g. name@domain.com|2013-05-07 05:52:51 +0200.

I want to

3 Answers
  •  爱一瞬间的悲伤
    2021-02-04 09:08

    My preference would be to read this in as a column first (i.e. not as an index); you can then use the str.split method:

    from io import StringIO
    import pandas as pd

    csv = '\n'.join(['name@domain.com|2013-05-07 05:52:51 +0200, 42'] * 3)
    df = pd.read_csv(StringIO(csv), header=None)
    
    In [13]: df[0].str.split('|')
    Out[13]:
    0    [name@domain.com, 2013-05-07 05:52:51 +0200]
    1    [name@domain.com, 2013-05-07 05:52:51 +0200]
    2    [name@domain.com, 2013-05-07 05:52:51 +0200]
    Name: 0, dtype: object
    

    Then feed this into a MultiIndex (perhaps this can be done more cleanly?):

    # Materialize the zip iterator (Python 3) before passing it to from_arrays
    m = pd.MultiIndex.from_arrays(list(zip(*df[0].str.split('|'))))
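
On the "can this be done more cleanly?" question, one possible route is to let str.split return a DataFrame directly (expand=True) and build the MultiIndex from that frame. This is only a sketch on the same sample data; the level names "name" and "timestamp" are assumptions added for illustration and do not appear in the original answer.

```python
from io import StringIO

import pandas as pd

csv = "\n".join(["name@domain.com|2013-05-07 05:52:51 +0200, 42"] * 3)
df = pd.read_csv(StringIO(csv), header=None)

# expand=True yields one column per piece instead of a Series of lists
parts = df[0].str.split("|", expand=True)
parts.columns = ["name", "timestamp"]  # hypothetical level names

# Each column of the frame becomes one level of the MultiIndex
m = pd.MultiIndex.from_frame(parts)
```

The resulting m can be assigned to df.index in the same way as the zip-based version.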
    

    Delete the 0th column and set the index to the new MultiIndex:

    del df[0]
    df.index = m
    
    In [17]: df
    Out[17]:
                                                1
    name@domain.com 2013-05-07 05:52:51 +0200  42
                    2013-05-07 05:52:51 +0200  42
                    2013-05-07 05:52:51 +0200  42
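
If the compound strings are already the index, as the question describes, a similar split can be applied to the index itself. The sketch below assumes a small made-up frame (the column name "value", the second timestamp, and the level names are all hypothetical) and additionally parses the timestamp level into real datetimes, which the answer above leaves as strings.

```python
import pandas as pd

# Hypothetical frame whose index already holds "name|timestamp" strings
df = pd.DataFrame(
    {"value": [42, 43]},
    index=[
        "name@domain.com|2013-05-07 05:52:51 +0200",
        "name@domain.com|2013-05-07 06:10:00 +0200",
    ],
)

# On an Index, str.split(..., expand=True) returns a MultiIndex of strings
mi = df.index.str.split("|", n=1, expand=True)

# Rebuild the index with the second level parsed as timezone-aware datetimes
df.index = pd.MultiIndex.from_arrays(
    [
        mi.get_level_values(0),
        pd.to_datetime(mi.get_level_values(1), format="%Y-%m-%d %H:%M:%S %z"),
    ],
    names=["name", "timestamp"],
)
```

With named levels, a selection such as df.xs("name@domain.com", level="name") then returns the rows for one item, indexed by timestamp.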
    
