Pandas pivot table ValueError: Index contains duplicate entries, cannot reshape

死守一世寂寞 2021-01-03 03:37

I have a dataframe as shown below (top 3 rows):

Sample_Name Sample_ID   Sample_Type IS  Component_Name  IS_Name Component_Group_Name    Outlier_Reasons Actua         


        
2 Answers
  •  时光说笑
    2021-01-03 04:08

    You should be able to do this with pandas.pivot_table(), which is covered in the pandas documentation.

    With your dataframe stored as df use the following code:

    import pandas as pd
    df = pd.read_table('table_from_which_to_read')
    
    # The index column is 'Sample_Name', matching the column header in the question
    new_df = pd.pivot_table(df, index=['Sample_Name'], columns='Component_Name', values='Calculated_Concentration')
    

    If you want something other than the mean of the concentration value, you will need to change the aggfunc parameter.
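
    As a rough illustration of the aggfunc parameter, here is a minimal sketch. The rows are made up for the example (only the column names come from the question), and the pair ('A', 'X') is deliberately repeated so that pivot_table has something to aggregate.

```python
import pandas as pd

# Hypothetical data: the ('A', 'X') pair occurs twice.
df = pd.DataFrame({
    "Sample_Name": ["A", "A", "A", "B"],
    "Component_Name": ["X", "X", "Y", "X"],
    "Calculated_Concentration": [1.0, 3.0, 5.0, 7.0],
})

# Default behaviour: duplicate (index, column) pairs are averaged.
mean_df = pd.pivot_table(
    df,
    index="Sample_Name",
    columns="Component_Name",
    values="Calculated_Concentration",
)

# Same reshape, but keep the largest value of each duplicate group instead.
max_df = pd.pivot_table(
    df,
    index="Sample_Name",
    columns="Component_Name",
    values="Calculated_Concentration",
    aggfunc="max",
)

print(mean_df)
print(max_df)
```

    Combinations that never occur in the data, such as ('B', 'Y') here, come out as NaN in both tables.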

    EDIT

    Since you don't want to aggregate over the values, you can reshape the data with the DataFrame.set_index method, which is also covered in the pandas documentation.

    import pandas as pd
    df = pd.DataFrame({'NonUniqueLabel': ['Item1', 'Item1', 'Item1', 'Item2'],
                       'SemiUniqueLabel': ['X', 'Y', 'Z', 'X'],
                       'Value': [1.0, 100, 5, None]})

    # Both columns together identify each row, so they become the index levels
    new_df = df.set_index(['NonUniqueLabel', 'SemiUniqueLabel'])
    

    The resulting table should look like what you expect the results to be and will have a multi-index.
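
    To make the multi-index result concrete, here is a small sketch built on the same toy data. The unstack step at the end is my own addition, not part of the answer above. It assumes that every (NonUniqueLabel, SemiUniqueLabel) pair occurs only once; with repeated pairs, pandas raises the "Index contains duplicate entries, cannot reshape" error from the title.

```python
import pandas as pd

df = pd.DataFrame({
    "NonUniqueLabel": ["Item1", "Item1", "Item1", "Item2"],
    "SemiUniqueLabel": ["X", "Y", "Z", "X"],
    "Value": [1.0, 100, 5, None],
})

# Two-level index: no values are aggregated, each row is kept as-is.
new_df = df.set_index(["NonUniqueLabel", "SemiUniqueLabel"])

# A single value is addressed by the full (outer, inner) key.
item1_y = new_df.loc[("Item1", "Y"), "Value"]

# Optional wide layout: move the inner level into columns.
# This only works because the index pairs are unique.
if new_df.index.is_unique:
    wide = new_df["Value"].unstack("SemiUniqueLabel")
    print(wide)
```

    Checking index.is_unique first is a cheap way to find out whether the data can be reshaped without aggregation at all.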
