Pandas: Transform a DataFrame to show whether a combination of values exists in the original DataFrame

深忆病人 2021-01-13 07:55

I have a Dataframe that looks like this:

 | Col 1 | Col 2 | 
0|   A   |   2   |
1|   A   |   3   |
2|   B   |   1   |
3|   B   |   2   |

an

3 Answers
  • 2021-01-13 07:55

    Here's a pivot solution:

    (df.pivot(index='Col 1', columns='Col 2', values='Col 1').fillna(0) != 0).rename_axis(index=None, columns=None)
    
             1     2      3
    A      False  True   True
    B       True  True  False
    
  • 2021-01-13 08:02

    Use get_dummies with max:

    df = pd.get_dummies(df.set_index('Col 1')['Col 2'], dtype=bool).rename_axis(None).groupby(level=0).max()
    print (df)
           1     2      3
    A  False  True   True
    B   True  True  False
    

    Or, if column Col 2 has no missing values, use DataFrame.pivot with DataFrame.notna. To remove the index and column names, use DataFrame.rename_axis:

    df = df.pivot(index='Col 1', columns='Col 2', values='Col 1').notna().rename_axis(index=None, columns=None)
    print (df)
           1     2      3
    A  False  True   True
    B   True  True  False
    

    Alternatively, if duplicates are possible and pivot fails, use DataFrame.pivot_table:

    df = (df.pivot_table(index='Col 1', columns='Col 2', values='Col 1', aggfunc='size')
            .notna()
            .rename_axis(index=None, columns=None))
    print (df)
           1     2      3
    A  False  True   True
    B   True  True  False
    

    Or solution from comments:

    df = (pd.crosstab(df['Col 1'], df['Col 2'])
            .gt(0)
            .rename_axis(index=None, columns=None))
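
A minimal sketch of why duplicates matter, assuming the sample frame from the question with one extra repeated (A, 2) row added purely for illustration. With that row, DataFrame.pivot raises a ValueError, while pd.crosstab counts the occurrences and still yields the presence table:

```python
import pandas as pd

# Sample data from the question; the last (A, 2) row is a hypothetical duplicate.
df = pd.DataFrame({"Col 1": ["A", "A", "B", "B", "A"],
                   "Col 2": [2, 3, 1, 2, 2]})

# pivot cannot reshape when an (index, columns) pair occurs more than once.
try:
    df.pivot(index="Col 1", columns="Col 2", values="Col 1")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# crosstab counts each pair, so duplicates only raise the count above 1.
result = (pd.crosstab(df["Col 1"], df["Col 2"])
            .gt(0)
            .rename_axis(index=None, columns=None))

print(pivot_failed)
print(result)
```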
    
  • 2021-01-13 08:04

    You could use:

    df.groupby(['Col 1','Col 2']).size().unstack(fill_value=0).astype(bool)
    
    Col 2      1     2      3
    Col 1                    
    A      False  True   True
    B       True  True  False
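
For a self-contained run, the sketch below rebuilds the frame from the question (the construction itself is an assumption, since the question only shows the table) and applies the same groupby/size/unstack chain. The final rename_axis call is optional and only strips the axis names so the result matches the layout of the other answers:

```python
import pandas as pd

# Sample data as shown in the question.
df = pd.DataFrame({"Col 1": ["A", "A", "B", "B"],
                   "Col 2": [2, 3, 1, 2]})

# Count each (Col 1, Col 2) pair, spread Col 2 into columns, fill absent pairs with 0.
counts = df.groupby(["Col 1", "Col 2"]).size().unstack(fill_value=0)

# Any non-zero count means the combination exists.
result = counts.astype(bool).rename_axis(index=None, columns=None)

print(result)
```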
    