Pandas Split DataFrame using row index

日久生厌 2021-01-18 15:08

I want to split a DataFrame into chunks with uneven numbers of rows, using row indices as the split points.

This is the code I have so far:

groups = df.groupby((np.arange(len(df.index)) / l[1]).astype(int))
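
To see which group labels this expression produces, here is a minimal sketch. The value of `l` and the number of rows are assumptions for illustration only; the question does not show how `l` is defined. Dividing the row positions by a single number and truncating can only yield equally sized chunks (apart from the last one), which is why it does not give uneven splits.

```python
import numpy as np

l = [2, 5, 7]   # assumed value; the question does not show how l is defined
n_rows = 8      # assumed number of rows in df

# Same expression as in the question, applied to the row positions 0..7
labels = (np.arange(n_rows) / l[1]).astype(int)
print(labels)   # [0 0 0 0 0 1 1 1]
```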


        
4 Answers
  •  小蘑菇 (OP)
    2021-01-18 15:48

    You can build an array of group labels with NumPy and pass it to groupby:

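    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame(np.arange(24).reshape((8, 3)), columns=list('abc'))
    
    # Row positions at which a new chunk starts
    L = [2, 5, 7]
    # Mark the split positions, then take a running sum so that
    # every row in the same chunk shares one integer label
    idx = np.cumsum(np.isin(np.arange(len(df.index)), L))
    
    # groupby accepts an array of labels with one entry per row
    for _, chunk in df.groupby(idx):
        print(chunk, '\n')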
    
       a  b  c
    0  0  1  2
    1  3  4  5 
    
        a   b   c
    2   6   7   8
    3   9  10  11
    4  12  13  14 
    
        a   b   c
    5  15  16  17
    6  18  19  20 
    
        a   b   c
    7  21  22  23 
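
    The label array is the key step, so here is a small sketch of its intermediate values for the same L and an 8-row frame. The names n_rows, positions, starts and labels are introduced only for this illustration, and np.isin is the current NumPy spelling of the older np.in1d. Note that the values in L are row positions (0, 1, 2, ...), not index labels, because they are matched against np.arange.

```python
import numpy as np

n_rows = 8       # same number of rows as the example DataFrame
L = [2, 5, 7]    # row positions where a new chunk starts

positions = np.arange(n_rows)
starts = np.isin(positions, L)   # True exactly at the split positions
labels = np.cumsum(starts)       # running count of splits seen so far

print(starts.astype(int))  # [0 0 1 0 0 1 0 1]
print(labels)              # [0 0 1 1 1 2 2 3]
```

    Rows that share a label end up in the same chunk, which matches the four chunks printed above.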
    

    Instead of defining a new variable for each chunk, you can store the chunks in a dictionary keyed by group label:

    d = dict(tuple(df.groupby(idx)))
    
    print(d[1])  # second chunk (group label 1)
    
        a   b   c
    2   6   7   8
    3   9  10  11
    4  12  13  14
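
    If a plain list of chunks is enough, positional slicing is a possible alternative to groupby. This is a sketch of a different technique, not the approach used in the answer above; split_at is a hypothetical helper name, and it assumes the split points are row positions between 0 and len(df).

```python
import numpy as np
import pandas as pd

def split_at(df, positions):
    """Hypothetical helper: split df into consecutive chunks starting at the given row positions."""
    bounds = [0, *sorted(positions), len(df)]
    # Pair each boundary with the next one; skip empty slices (e.g. a split at position 0)
    return [df.iloc[start:stop]
            for start, stop in zip(bounds[:-1], bounds[1:])
            if stop > start]

df = pd.DataFrame(np.arange(24).reshape((8, 3)), columns=list('abc'))
chunks = split_at(df, [2, 5, 7])

print([len(c) for c in chunks])  # [2, 3, 2, 1]
```

    Here chunks[1] holds the same rows as d[1] in the dictionary version, while iloc keeps the original index values on each chunk.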
    
