Group by consecutive index numbers

忘掉有多难 2021-01-03 19:00

I was wondering if there is a way to group by consecutive index numbers and move each group into its own column. Here is an example of the DataFrame I'm using:
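
The example table did not survive in this copy of the question. Below is a minimal reconstruction, assuming the values and index from the setup code in the last answer and a single column labelled '0'. The label is a guess: most answers use the string '0', one uses the integer 0.

```python
import pandas as pd

# Reconstructed sample frame: values and index are taken from the last answer's setup.
# The column label '0' (a string) is an assumption, not confirmed by the question.
df = pd.DataFrame(
    {'0': [19218.965703, 19247.621650, 19232.651322,
           19279.216956, 19330.087371, 19304.316973]},
    index=[0, 1, 2, 9, 10, 11],
)
print(df)
```

The goal is then to turn the two runs of consecutive labels (0 to 2 and 9 to 11) into two side-by-side columns.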



        
6 Answers
  • 2021-01-03 19:36

    Create a new pandas.Series with a new pandas.MultiIndex

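    # a: group code per row (label minus position is constant within a consecutive run)
    a = pd.factorize(df.index - np.arange(len(df)))[0]
    # b: position of each row inside its group
    b = df.groupby(a).cumcount()
    
    # index the values by (position, group), then move the group level into columns
    pd.Series(df['0'].to_numpy(), [b, a]).unstack()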
    
                  0             1
    0  19218.965703  19279.216956
    1  19247.621650  19330.087371
    2  19232.651322  19304.316973
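
To see why subtracting the row position from the index label identifies the runs, here is a small sketch on the index alone. It assumes a sorted integer index without duplicates; the variable names `offset` and `codes` are only illustrative.

```python
import numpy as np
import pandas as pd

index = pd.Index([0, 1, 2, 9, 10, 11])

# Within a run of consecutive integers, label minus position stays constant.
offset = index - np.arange(len(index))
print(offset.tolist())   # [0, 0, 0, 6, 6, 6]

# factorize maps each distinct offset to a 0-based group code.
codes, uniques = pd.factorize(offset)
print(codes.tolist())    # [0, 0, 0, 1, 1, 1]
```

Each gap in the index changes the offset, so every run gets its own code, and those codes become the column labels after `unstack`.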
    

    Similar, but leaning more on NumPy

    a = pd.factorize(df.index - np.arange(len(df)))[0]
    b = df.groupby(a).cumcount()
    
    # NaN-filled grid: one row per position within a group, one column per group
    c = np.full((b.max() + 1, a.max() + 1), np.nan)
    c[b, a] = np.ravel(df)  # assumes df has a single column
    pd.DataFrame(c)
    
                  0             1
    0  19218.965703  19279.216956
    1  19247.621650  19330.087371
    2  19232.651322  19304.316973
    
  • 2021-01-03 19:37

    This is a groupby + pivot_table


    m = df.index.to_series().diff().ne(1).cumsum()
    
    (df.assign(key=df.groupby(m).cumcount())
        .pivot_table(index='key', columns=m, values=0))
    

                    1             2
    key
    0    19218.965703  19279.216956
    1    19247.621650  19330.087371
    2    19232.651322  19304.316973
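
The grouping key `m` can be followed step by step. A minimal sketch on the index alone, assuming a sorted integer index (the intermediate names `step` and `starts` are illustrative):

```python
import pandas as pd

idx = pd.Index([0, 1, 2, 9, 10, 11]).to_series()

step = idx.diff()      # NaN, 1, 1, 7, 1, 1
starts = step.ne(1)    # True where a new run begins (NaN != 1 is also True)
m = starts.cumsum()    # running count of run starts, i.e. a 1-based group id

print(m.tolist())      # [1, 1, 1, 2, 2, 2]
```

Because the first row always counts as a run start, the group ids begin at 1, which is why the output columns above are labelled 1 and 2 rather than 0 and 1.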
    
  • 2021-01-03 19:37

    My way:

    df['groups'] = list(df.reset_index()['index'] - range(0, len(df)))
    pd.concat([df[df['groups'] == i][['0']].reset_index(drop=True)
               for i in df['groups'].unique()], axis=1)
    
                  0             0
    0  19218.965703  19279.216956
    1  19247.621650  19330.087371
    2  19232.651322  19304.316973
    
  • 2021-01-03 19:42

    Here is one way:

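    import pandas as pd
    from more_itertools import consecutive_groups

    # consecutive_groups yields iterators, so materialize each one before indexing
    final = pd.concat([df.loc[list(group)].reset_index(drop=True)
                       for group in consecutive_groups(df.index)], axis=1)
    final.columns = range(len(final.columns))
    print(final)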
    

                  0             1
    0  19218.965703  19279.216956
    1  19247.621650  19330.087371
    2  19232.651322  19304.316973
    
  • 2021-01-03 19:47

    One way using pandas groupby

    s = df.index.to_series().diff().ne(1).cumsum()
    pd.concat({x: y.reset_index(drop=True) for x, y in df['0'].groupby(s)}, axis=1)
    
    Out[786]: 
                  1             2
    0  19218.965703  19279.216956
    1  19247.621650  19330.087371
    2  19232.651322  19304.316973
    
  • 2021-01-03 19:50

    I think you have assumed that every consecutive group contains the same number of observations. My approach is:

    Prepare the data:

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame(
        data={'data': [19218.965703, 19247.621650, 19232.651322,
                       19279.216956, 19330.087371, 19304.316973]},
        index=[0, 1, 2, 9, 10, 11],
    )
    

    And the solution:

    df['Group'] = (df.index.to_series() - np.arange(df.shape[0])).rank(method='dense')
    df.reset_index(inplace=True)
    df['Observations'] = df.groupby(['Group'])['index'].rank()
    df.pivot(index='Observations', columns='Group', values='data')
    

    Which returns:

    Group                  1.0           2.0
    Observations                            
    1.0           19218.965703  19279.216956
    2.0           19247.621650  19330.087371
    3.0           19232.651322  19304.316973
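
To illustrate the point about group sizes, here is a hedged sketch with made-up data in which the second run is one row shorter. It keeps the dense-rank grouping from this answer but swaps in `astype(int)` and `cumcount()` so the labels stay integers; that swap is my assumption, not part of the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the run 0..2 has three rows, the run 9..10 only two.
df = pd.DataFrame({'data': [1.0, 2.0, 3.0, 4.0, 5.0]}, index=[0, 1, 2, 9, 10])

# Dense rank of (label - position) gives a 1-based group id per run.
df['Group'] = (df.index.to_series() - np.arange(len(df))).rank(method='dense').astype(int)
# Position of each row inside its group, starting at 0.
df['Observations'] = df.groupby('Group').cumcount()

wide = df.pivot(index='Observations', columns='Group', values='data')
print(wide)
```

The shorter group is padded with NaN in the last row, so unequal run lengths do not break the reshape.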
    