Pandas: how to convert a cell with multiple values to multiple rows?

花落未央 2021-02-06 04:59

I have a DataFrame like this:

Name asn  count
Org1 asn1,asn2 1
org2 asn3      2
org3 asn4,asn5 5

I would like to convert my DataFrame to look l

2 Answers
  • 2021-02-06 05:24

    As an alternative:

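    import pandas as pd
    from io import StringIO  # Python 3; the standalone StringIO module was Python 2 only
    
    ctn = '''Name asn count
    Org1 asn1,asn2 1
    org2 asn3      2
    org3 asn4,asn5 5'''
    
    # Columns are separated by runs of whitespace
    df = pd.read_csv(StringIO(ctn), sep=r'\s+')
    # Split each cell into one column per value, then stack the columns into rows;
    # dropna() removes the NaN padding of shorter rows if stack() keeps it
    s = df['asn'].str.split(',').apply(pd.Series).stack().dropna()
    # Drop the inner index level so that s lines up with df's original index
    s.index = s.index.droplevel(-1)
    s.name = 'asn'
    del df['asn']
    df = df.join(s)
    
    print(df)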
    

    Result:

       Name  count   asn
    0  Org1      1  asn1
    0  Org1      1  asn2
    1  org2      2  asn3
    2  org3      5  asn4
    2  org3      5  asn5
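
For comparison, here is a minimal sketch of the same reshaping with `DataFrame.explode`. This is not what the answer above uses, and it assumes pandas 0.25 or later, where `explode` was introduced. The DataFrame is rebuilt from the question's sample data, and the name `exploded` is only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name": ["Org1", "org2", "org3"],
    "asn": ["asn1,asn2", "asn3", "asn4,asn5"],
    "count": [1, 2, 5],
})

# Turn each comma-separated cell into a list, then emit one row per list element.
# The other columns are repeated and the original row index is kept.
exploded = df.assign(asn=df["asn"].str.split(",")).explode("asn")

print(exploded)
```

Rows that came from the same original row share an index label, as in the result above; calling `reset_index(drop=True)` afterwards would give a fresh 0..n-1 index if that is preferred.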
    
  • 2021-02-06 05:47

    Carrying on from the same idea, you could set a MultiIndex for df2 and then stack. For example:

    >>> df2 = df.asn.str.split(',').apply(pd.Series)
    >>> df2.index = df.set_index(['Name', 'count']).index
    >>> df2.stack().reset_index(['Name', 'count'])
       Name  count     0
    0  Org1      1  asn1
    1  Org1      1  asn2
    0  org2      2  asn3
    0  org3      5  asn4
    1  org3      5  asn5
    

    You can then rename the column and set an index of your choosing.
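
The renaming and re-indexing step could look like the sketch below. It rebuilds the question's sample data so that it runs on its own; the names `stacked` and `result` are illustrative, and the added `dropna()` is a precaution for pandas versions where `stack()` keeps the NaN padding of shorter rows.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name": ["Org1", "org2", "org3"],
    "asn": ["asn1,asn2", "asn3", "asn4,asn5"],
    "count": [1, 2, 5],
})

# Same idea as above: one column per split value, indexed by (Name, count)
df2 = df["asn"].str.split(",").apply(pd.Series)
df2.index = df.set_index(["Name", "count"]).index
stacked = df2.stack().dropna()

result = (
    stacked.rename("asn")             # name the values, so the column is 'asn' instead of 0
    .reset_index(["Name", "count"])   # move Name and count back into columns
    .reset_index(drop=True)           # replace the leftover 0/1 level with a 0..n-1 index
)

print(result)
```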
