All possible permutations of values within the same column of a Pandas DataFrame

Question

I asked a similar question about PostgreSQL, but this kind of task turned out to be really hard to do in Postgres. I think Python/pandas would make it a lot easier, although I still can't quite come up with a solution.

I now have a Pandas Dataframe which looks like this:

import pandas as pd

df = pd.DataFrame({'planid': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'x': ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']})

df


   planid   x
0   A       a1
1   A       a2
2   B       b1
3   B       b2
4   C       c1
5   C       c2

I want to get all possible permutations in which each value comes from a different planid. In other words, think of each value in planid as a "bucket"; I want every combination I could get by drawing one value of x from each bucket. In this particular example there are 2 × 2 × 2 = 8 permutations in total: {(a1, b1, c1), (a1, b2, c1), (a1, b1, c2), (a1, b2, c2), (a2, b1, c1), (a2, b2, c1), (a2, b1, c2), (a2, b2, c2)}.

However, I want the resulting DataFrame to have three columns: planid, x, and a third column, perhaps named permutation_counter, that labels which permutation each row belongs to. In other words, I want my final DataFrame to look like this:

       planid   x  permutation_counter
    0   A       a1     1
    1   B       b1     1
    2   C       c1     1 
    3   A       a1     2
    4   B       b2     2
    5   C       c1     2
    6   A       a1     3
    7   B       b1     3
    8   C       c2     3
    9   A       a1     4
    10  B       b2     4
    11  C       c2     4
    12  A       a2     5
    13  B       b1     5
    14  C       c1     5
    15  A       a2     6
    16  B       b2     6
    17  C       c1     6
    18  A       a2     7
    19  B       b1     7
    20  C       c2     7
    21  A       a2     8
    22  B       b2     8
    23  C       c2     8

Any help would be greatly appreciated!

Answer 1:

I was trying to chain as many steps together as possible. Break them down to see what each step does :)

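# Cartesian product of each planid's x values, one index level per planid.
# Note: groupby sorts its keys, so names= lines up only if planid is already sorted.
df2 = pd.DataFrame(index=pd.MultiIndex.from_product(
    [subdf['x'] for p, subdf in df.groupby('planid')],
    names=df.planid.unique())).reset_index().stack().reset_index()

df2.columns = ['permutation_counter', 'planid', 'x']
df2['permutation_counter'] += 1

print(df2[['planid', 'x', 'permutation_counter']])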

   planid   x  permutation_counter
0       A  a1                    1
1       B  b1                    1
2       C  c1                    1
3       A  a1                    2
4       B  b1                    2
5       C  c2                    2
6       A  a1                    3
7       B  b2                    3
8       C  c1                    3
9       A  a1                    4
10      B  b2                    4
11      C  c2                    4
12      A  a2                    5
13      B  b1                    5
14      C  c1                    5
15      A  a2                    6
16      B  b1                    6
17      C  c2                    6
18      A  a2                    7
19      B  b2                    7
20      C  c1                    7
21      A  a2                    8
22      B  b2                    8
23      C  c2                    8
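
As a rough sketch of that breakdown, the chain can be split into named steps. The intermediate names groups, wide, long_df and result are mine, not part of the original answer, and to_frame(index=False) stands in for the DataFrame(index=...).reset_index() trick. This version also takes the level names from the sorted group keys rather than from df.planid.unique(), which is an assumption about what was intended.

```python
import pandas as pd

df = pd.DataFrame({'planid': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'x': ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']})

# Step 1: one list of x values per planid "bucket" (groupby sorts the keys).
groups = df.groupby('planid')['x'].apply(list)

# Step 2: Cartesian product of the buckets; one row per combination,
# one column per planid.
index = pd.MultiIndex.from_product(groups.tolist(), names=groups.index.tolist())
wide = index.to_frame(index=False)

# Step 3: stack the planid columns into rows, so each combination
# becomes len(groups) consecutive rows keyed by its row number.
long_df = wide.stack().reset_index()
long_df.columns = ['permutation_counter', 'planid', 'x']

# Step 4: make the counter 1-based and reorder the columns.
long_df['permutation_counter'] += 1
result = long_df[['planid', 'x', 'permutation_counter']]
```

Printing wide after step 2 shows the 8 combinations side by side, which is probably the easiest place to check that the product is what you expect.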

回答2:

@Happy001 beat me by a couple of minutes but I'll go ahead and post this anyway because I think it's a little easier to follow:

import numpy as np
import pandas as pd
import itertools

# One list per planid; product() yields every (A, B, C) combination.
x = list( itertools.product( ['a1','a2'],['b1','b2'],['c1','c2'] ) )
# Flatten the 8 tuples into a single list of 24 values.
x = list( itertools.chain(*x) )
df = pd.DataFrame({ 'planid'  : np.tile( list('ABC'), 8 ),
                    'x'       : x,
                    'p_count' : np.repeat( range(1,9), 3 ) })

results:

    p_count planid   x
0         1      A  a1
1         1      B  b1
2         1      C  c1
3         2      A  a1
4         2      B  b1
5         2      C  c2

...

21        8      A  a2
22        8      B  b2
23        8      C  c2
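
The hardcoded lists and the constants 8 and 3 can probably be derived from the original frame instead. This is a minimal sketch, assuming the input has the planid and x columns from the question. The names buckets, combos and out are illustrative only.

```python
import itertools

import numpy as np
import pandas as pd

df = pd.DataFrame({'planid': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'x': ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']})

# Map each planid to its list of x values, in order of first appearance.
buckets = {pid: sub['x'].tolist()
           for pid, sub in df.groupby('planid', sort=False)}

# Every way of drawing one value from each bucket.
combos = list(itertools.product(*buckets.values()))

out = pd.DataFrame({
    # A, B, C, A, B, C, ... once per combination
    'planid': np.tile(list(buckets), len(combos)),
    # Flatten the tuples so the values line up with planid
    'x': list(itertools.chain.from_iterable(combos)),
    # 1, 1, 1, 2, 2, 2, ... one label per combination
    'permutation_counter': np.repeat(np.arange(1, len(combos) + 1),
                                     len(buckets)),
})
```

The number of rows grows as the product of the bucket sizes times the number of buckets, so this only stays practical for a modest number of planid values.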

Source: https://stackoverflow.com/questions/35518308/all-possible-permutations-columns-pandas-dataframe-within-the-same-column

Tags

python

pandas

permutation