I have been working with a dataframe in Python with pandas that contains duplicate entries in the first column. The dataframe looks something like this:
sampl
Groupby will work.
data.groupby('sample_id').mean()
You can then use reset_index() to make it look exactly as you want.
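As a rough sketch of how this plays out, here is a small example. The original table is not shown in full, so everything except the sample_id column name is an assumption: value_a, value_b and all the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical data: only the sample_id column name comes from the question;
# the other column names and all values are invented for this sketch.
data = pd.DataFrame({
    "sample_id": ["s1", "s1", "s2", "s3", "s3"],
    "value_a": [1.0, 3.0, 5.0, 2.0, 4.0],
    "value_b": [10.0, 20.0, 30.0, 40.0, 60.0],
})

# Average the rows that share a sample_id. After groupby().mean(),
# sample_id is the index; reset_index() turns it back into a column.
averaged = data.groupby("sample_id").mean().reset_index()
print(averaged)
```

Each sample_id should end up on a single row holding the mean of its duplicates. If the real dataframe also has non-numeric columns, they would likely need to be dropped or handled separately before taking the mean.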
Group by the sample_id column and take the mean.
df.groupby('sample_id').mean().reset_index() or df.groupby('sample_id', as_index=False).mean()
get you