I have a dataframe:
df
ID 0 1 2 3 4 ....
1 10 20 5 1 2 ....
2 3 4 NaN 10 1 ....
And I need to transpose the cell values
Another way using melt
and pd.crosstab
df1 = df.melt('ID')
df_final = pd.crosstab(index=df1.ID, columns=df1.value).reset_index()
Out[673]:
value ID 1.0 2.0 3.0 4.0 5.0 10.0 20.0
0 1 1 1 0 0 1 1 1
1 2 1 0 1 1 0 1 0
Note: default counting of pd.crosstab
uses frequency. Therefore, duplicate values will count as their frequencies. If you want only 1/0
indicator, just chain ge(1)
and astype
as follows
pd.crosstab(index=df1.ID, columns=df1.value).ge(1).astype(int).reset_index()
Use DataFrame.set_index with DataFrame.stack for remove missing values, then create indicators by get_dummies and return 1/0
by max
by first level, last convert columns to integers:
df1 = (pd.get_dummies(df.set_index('ID').stack())
.max(level=0)
.rename(columns=int)
.reset_index())
print (df1)
ID 1 2 3 4 5 10 20
0 1 1 1 0 0 1 1 1
1 2 1 0 1 1 0 1 0
EDIT:
print (df)
ID 0 1 2 3 4 5
0 1 10 20 5.0 1 2 5
1 2 3 4 NaN 10 1 2
If use max
then always in output are 0/1
values (check 5 column):
df1 = (pd.get_dummies(df.set_index('ID').stack())
.max(level=0)
.rename(columns=int)
.reset_index())
print (df1)
ID 1 2 3 4 5 10 20
0 1 1 1 0 0 1 1 1
1 2 1 1 1 1 0 1 0
But if use sum
it count values (check 5 column):
df2 = (pd.get_dummies(df.set_index('ID').stack())
.sum(level=0)
.rename(columns=int)
.reset_index())
print (df2)
ID 1 2 3 4 5 10 20
0 1 1 1 0 0 2 1 1
1 2 1 1 1 1 0 1 0