Split dataframe into relatively even chunks according to length

遇见更好的自我 2020-12-01 10:30

I need to write a function that splits a given dataframe into chunks of a requested size. For instance, if the dataframe contains 1111 rows, I want to be able to specify chunk

2 Answers
  • 2020-12-01 11:14

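    You can floor-divide a sequence of row positions (0 up to the number of rows in the dataframe) by the chunk size and pass the result to groupby. Every run of n consecutive rows gets the same group label, so the dataframe is split into chunks of n rows each, with the last chunk holding whatever remains: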

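    import numpy as np

    n = 400  # rows per chunk
    # `test` is the dataframe from the question (1111 rows, 2 columns here).
    for g, df in test.groupby(np.arange(len(test)) // n):
        print(df.shape)
    # (400, 2)
    # (400, 2)
    # (311, 2)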
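
Since the question asks for a function, the groupby approach above could be wrapped roughly as follows. This is a minimal sketch: the name `split_by_rows` and the sample dataframe `test` are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd


def split_by_rows(df, n):
    """Return a list of consecutive chunks holding at most n rows each."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Row positions 0..len(df)-1 floor-divided by n give one label per chunk.
    labels = np.arange(len(df)) // n
    return [chunk for _, chunk in df.groupby(labels)]


# Hypothetical sample data: 1111 rows, 2 columns.
test = pd.DataFrame({"a": np.arange(1111), "b": np.arange(1111) * 2})
chunks = split_by_rows(test, 400)
sizes = [len(c) for c in chunks]
print(sizes)
```

Returning a list rather than printing inside the loop lets the caller decide what to do with each chunk.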
    
  • 2020-12-01 11:26

    A more Pythonic way to break a large dataframe into smaller chunks with a fixed number of rows is to use a list comprehension:

    n = 400  # rows per chunk
    list_df = [test[i:i + n] for i in range(0, test.shape[0], n)]

    [i.shape for i in list_df]
    

    Output:

    [(400, 2), (400, 2), (311, 2)]
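
Both answers leave a shorter final chunk (311 rows here). The title asks for "relatively even" chunks, and the question text is cut off, so the following is only one possible reading: treat the requested size as an upper bound and spread the rows so that chunk sizes differ by at most one. The name `split_evenly` and the sample dataframe are hypothetical.

```python
import math

import numpy as np
import pandas as pd


def split_evenly(df, max_rows):
    """Split df into the fewest chunks of at most max_rows rows,
    with chunk sizes differing by at most one row."""
    if max_rows <= 0:
        raise ValueError("max_rows must be a positive integer")
    total = len(df)
    if total == 0:
        return []
    k = math.ceil(total / max_rows)      # number of chunks needed
    base, extra = divmod(total, k)       # the first `extra` chunks get one more row
    sizes = [base + 1] * extra + [base] * (k - extra)
    bounds = np.cumsum([0] + sizes)      # start/stop row positions
    return [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]


# Hypothetical sample data: 1111 rows, 2 columns.
test = pd.DataFrame({"a": np.arange(1111), "b": np.arange(1111) * 2})
even_chunks = split_evenly(test, 400)
even_sizes = [len(c) for c in even_chunks]
print(even_sizes)
```

With 1111 rows and a limit of 400, this yields three chunks of 371, 370 and 370 rows instead of 400, 400 and 311.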
    