Shift rows in pandas dataframe in a specific order

Question

I have a pandas dataframe which looks like this:

import pandas as pd

df = pd.DataFrame({
    'job': ['football','football', 'football', 'basketball', 'basketball', 'basketball', 'hokey', 'hokey', 'hokey', 'football','football', 'football', 'basketball', 'basketball', 'basketball', 'hokey', 'hokey', 'hokey'],
    'team': [4.0,5.0,9.0,2.0,3.0,6.0,1.0,7.0,8.0, 4.0,5.0,9.0,2.0,3.0,6.0,1.0,7.0,8.0],
    'cluster': [0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1]
})

Each cluster contains 9 teams: 3 for each type of sport (football, basketball and hokey). I want to apply a shift function to each cluster, so that the order of the teams changes in a very specific way (I tried to highlight it with color):

How can I do this transformation (shifting rows in the way shown above) for a much larger dataframe?

Answer 1:

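Use groupby + cumcount to build a sequential counter over the columns cluster and job, so that each row is numbered within its (cluster, job) group. Then use sort_values to sort the dataframe on cluster and this counter, which makes the jobs take turns within each cluster: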

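# Number each row within its (cluster, job) group: 0, 1, 2, ...
df['j'] = df.groupby(['cluster', 'job']).cumcount()
# Sort by cluster, then by that counter, and drop the helper column
df = df.sort_values(['cluster', 'j'], ignore_index=True).drop('j', axis=1)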

           job  team  cluster
0     football   4.0        0
1   basketball   2.0        0
2        hokey   1.0        0
3     football   5.0        0
4   basketball   3.0        0
5        hokey   7.0        0
6     football   9.0        0
7   basketball   6.0        0
8        hokey   8.0        0
9     football   4.0        1
10  basketball   2.0        1
11       hokey   1.0        1
12    football   5.0        1
13  basketball   3.0        1
14       hokey   7.0        1
15    football   9.0        1
16  basketball   6.0        1
17       hokey   8.0        1
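
For a much larger dataframe, the same idea could be wrapped in a small helper. The following is only a sketch: the function name interleave_within and the temporary columns _rank and _pos are hypothetical names chosen for illustration, and the extra _pos key (the original row position) is an assumption added here so that ties are broken explicitly instead of relying on the sort's tie-handling. The sample data is cluster 0 from the question.

```python
import numpy as np
import pandas as pd


def interleave_within(df, outer, inner):
    """Reorder rows so that, inside each `outer` group, the `inner` groups take turns."""
    # Position of each row within its (outer, inner) group: 0, 1, 2, ...
    rank = df.groupby([outer, inner]).cumcount()
    # Original row position, used as an explicit tie-breaker (an assumption of this sketch).
    pos = np.arange(len(df))
    return (
        df.assign(_rank=rank, _pos=pos)
          .sort_values([outer, "_rank", "_pos"], ignore_index=True)
          .drop(columns=["_rank", "_pos"])
    )


# Cluster 0 from the question.
df = pd.DataFrame({
    "job": ["football"] * 3 + ["basketball"] * 3 + ["hokey"] * 3,
    "team": [4.0, 5.0, 9.0, 2.0, 3.0, 6.0, 1.0, 7.0, 8.0],
    "cluster": [0] * 9,
})

result = interleave_within(df, "cluster", "job")
print(result)
```

Because the helper works on a copy created by assign, the input dataframe is left unchanged, and the column names are passed in rather than hard-coded.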

Source: https://stackoverflow.com/questions/64194412/shift-rows-in-pandas-dataframe-in-a-specific-order

Tags: python, pandas, dataframe, row, shift