How to split a DataFrame on each different value in a column?

萌比男神i 2021-01-23 10:16

Below is an example DataFrame.

      0      1     2     3          4
0   0.0  13.00  4.50  30.0   0.0,13.0
1   0.0  13.00  4.75  30.0   0.0,13.0
2   0.0  13.00           


        
3 Answers
  • 2021-01-23 11:05

    Based on this statement in the question:

    "I want to split this into new dataframes when the row in column 0 changes."

    If you only want to start a new group when the value in column 0 changes, you can try:

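    # Flag rows where column '0' differs from the previous row, then cumsum the
    # flags to get one integer label per consecutive run (1, 2, ...).
    d = dict([*df.groupby(df['0'].ne(df['0'].shift()).cumsum())])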
    
    print(d[1])
    print(d[2])
    

         0     1     2     3         4
    0  0.0  13.0  4.50  30.0  0.0,13.0
    1  0.0  13.0  4.75  30.0  0.0,13.0
    2  0.0  13.0  5.00  30.0  0.0,13.0
    3  0.0  13.0  5.25  30.0  0.0,13.0
    4  0.0  13.0  5.50  30.0  0.0,13.0
    5  0.0  13.0  5.75   0.0  0.0,13.0
    6  0.0  13.0  6.00  30.0  0.0,13.0
          0      1     2     3          4
    7   1.0  13.25  0.00  30.0  0.0,13.25
    8   1.0  13.25  0.25   0.0  0.0,13.25
    9   1.0  13.25  0.50  30.0  0.0,13.25
    10  1.0  13.25  0.75  30.0  0.0,13.25
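
To make the grouping key above easier to follow, here is a minimal sketch that builds it step by step. The sample data and the names `changed`, `run_id` and `parts` are made up for illustration, and the column label is assumed to be the integer 0 rather than the string '0'.

```python
import pandas as pd

# Hypothetical sample data: the value 0.0 appears in two separate runs.
df = pd.DataFrame({
    0: [0.0, 0.0, 1.0, 1.0, 0.0],
    1: [13.0, 13.0, 13.25, 13.25, 13.0],
})

# True on the first row of every run (the first row is compared against NaN).
changed = df[0].ne(df[0].shift())

# Running count of the True flags: one integer label per consecutive run.
run_id = changed.cumsum()

# One sub-DataFrame per run, keyed by the run label.
parts = dict(tuple(df.groupby(run_id)))

print(run_id.tolist())
print(parts[3])
```

Because the key is a run label and not the column value itself, the last row lands in its own group even though its value already appeared earlier.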
    
  • 2021-01-23 11:08

    I would use GroupBy.__iter__:

    d = dict(df.groupby(df['0'].diff().ne(0).cumsum()).__iter__())
    # If the column label is the integer 0 rather than the string '0':
    # d = dict(df.groupby(df[0].diff().ne(0).cumsum()).__iter__())
    

    Note that repeated but non-consecutive values end up in different groups with this approach. If you only use groupby(0), all rows with the same value are placed in the same group.
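
The difference described in this note can be checked on a small frame. This is only a sketch: the data and the names `by_run` and `by_value` are invented for the example, and the column label is assumed to be the integer 0.

```python
import pandas as pd

# Hypothetical data: 0.0 occurs in rows 0-1 and again, non-consecutively, in row 3.
df = pd.DataFrame({
    0: [0.0, 0.0, 1.0, 0.0],
    2: [4.5, 4.75, 0.0, 5.0],
})

# Group on runs: a new group starts whenever the value changes.
by_run = dict(df.groupby(df[0].diff().ne(0).cumsum()).__iter__())

# Group on the value itself: every row with the same value shares a group.
by_value = dict(tuple(df.groupby(0)))

print(len(by_run))    # number of consecutive runs
print(len(by_value))  # number of distinct values
print(by_value[0.0].index.tolist())
```

Since diff() relies on subtraction, it is only suitable for numeric columns; for other dtypes, comparing against shift() is likely the safer way to detect changes.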

  • 2021-01-23 11:12

    It looks like you want to group by the first column. You could create a dictionary from the groupby object, with the group keys as the dictionary keys:

    out = dict(tuple(df.groupby(0)))
    

    Or we could build a list from the groupby object. This is more useful when we want to access the groups by position rather than by grouping key:

    out = [sub_df for _, sub_df in df.groupby(0)]
    

    We could then index the dict based on the grouping key, or the list based on the group's position:

    print(out[0])
    
        0     1     2     3         4
    0  0.0  13.0  4.50  30.0  0.0,13.0
    1  0.0  13.0  4.75  30.0  0.0,13.0
    2  0.0  13.0  5.00  30.0  0.0,13.0
    3  0.0  13.0  5.25  30.0  0.0,13.0
    4  0.0  13.0  5.50  30.0  0.0,13.0
    5  0.0  13.0  5.75   0.0  0.0,13.0
    6  0.0  13.0  6.00  30.0  0.0,13.0
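
As a rough illustration of the two access styles, the sketch below uses made-up data whose group keys do not coincide with list positions. The names `as_dict` and `as_list` are hypothetical.

```python
import pandas as pd

# Hypothetical data: the group keys are 5.0 and 7.0, not 0 and 1.
df = pd.DataFrame({
    0: [5.0, 5.0, 7.0],
    1: [13.0, 13.0, 13.25],
})

as_dict = dict(tuple(df.groupby(0)))               # keyed by the value in column 0
as_list = [sub_df for _, sub_df in df.groupby(0)]  # ordered by sorted group key

print(as_dict[7.0])  # lookup by grouping key
print(as_list[1])    # the same group, looked up by position
```

With this data, as_dict[0] would raise a KeyError while as_list[0] returns the first group, so the choice depends on whether the key or the position is what you have at hand.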
    