I'm trying to collapse rows in a dataframe that contains a column of ID data and a number of columns that each hold a different string. It looks like groupby is the solution, b
Assuming blanks are empty strings ('')
Option 1: pivot_table
df.pivot_table(['apples', 'pears', 'oranges'], 'ID', aggfunc=''.join)
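The same ''.join idea can also be sketched with a plain groupby, which is what the question was reaching for. This is a minimal sketch rather than the answer's own method: the sample frame below is hypothetical (made up to match the described layout), and it uses groupby(...).agg in place of pivot_table.

```python
import pandas as pd

# Hypothetical sample data: each ID is spread over several rows,
# and '' marks a blank cell.
df = pd.DataFrame({
    'ID': [101, 134, 134, 576, 576, 837],
    'apples': ['', 'apples', '', '', '', 'apples'],
    'oranges': ['oranges', '', '', 'oranges', '', ''],
    'pears': ['', '', 'pears', '', 'pears', ''],
})

# Joining the strings within each group collapses the rows:
# '' contributes nothing, so only the non-blank value survives.
collapsed = df.groupby('ID').agg(''.join)
print(collapsed)
```

Note that if an ID had two non-blank values in the same column, ''.join would concatenate them, so this relies on at most one value per ID and column.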
Option 2: sort
Sort each group and take the last row, since '' sorts before any non-empty string.
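import numpy as np
import pandas as pd

def f(df):
    # Sort each column on its own; '' sorts first, so the last row holds the values.
    return pd.DataFrame(np.sort(df.values, 0)[[-1]], [df.name], df.columns)

df.set_index(
    'ID', append=True
).groupby(level='ID', group_keys=False).apply(f)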
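To see why taking the last row after a column-wise sort works, here is a small sketch on a hypothetical two-row block (the values are made up for illustration). It only shows the np.sort step used inside f, not the full groupby.

```python
import numpy as np

# Hypothetical block of values for one ID: two rows, two columns,
# with '' marking blanks.
block = np.array([['', 'oranges'],
                  ['apples', '']], dtype=object)

# axis=0 sorts each column independently; '' compares less than any
# non-empty string, so the real value ends up in the last row.
# Indexing with [[-1]] keeps the result two-dimensional (one row).
last = np.sort(block, axis=0)[[-1]]
print(last)
```

As with the join approach, this assumes each column has at most one non-blank value per ID; with several, only the alphabetically largest would be kept.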
Both yield
     apples  oranges  pears
ID
101          oranges
134  apples           pears
576          oranges  pears
837  apples