Pandas groupby multiple columns, list of multiple columns

庸人自扰 2020-12-25 08:22

I have the following data:

Invoice NoStockCode Description                         Quantity    CustomerID  Country
536365  85123A      WHITE HANGING HEART T-         


        
4 Answers
  • 2020-12-25 08:42

    Try a variation of the following, adapted to your column names:

    df.groupby('Invoice').agg({'NoStockCode': ', '.join, 'Description': ', '.join, 'Quantity': lambda s: ', '.join(map(str, s))})
    
  • 2020-12-25 08:53

    I can't run your code right now, but I think the following would give you the expected output:

    print(df.groupby(['Invoice', 'CustomerID', 'Country'], as_index=False)
            [['NoStockCode', 'Description', 'Quantity']]
            .agg(list))
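
    A self-contained sketch of that list aggregation, for anyone who wants to run it. The three sample rows are borrowed from the pivot_table answer on this page, and the column names Invoice and NoStockCode are an assumption based on the question's header. The selection uses double brackets because pandas 2.x no longer accepts a bare tuple of column names after groupby:

```python
import pandas as pd

# Sample rows borrowed from another answer on this page (assumed column names).
df = pd.DataFrame({
    'Invoice': [536365, 536365, 536365],
    'NoStockCode': ['85123A', '71053', '84406B'],
    'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                    'WHITE METAL LANTERN',
                    'CREAM CUPID HEARTS COAT HANGER'],
    'Quantity': [6, 6, 8],
    'CustomerID': [17850, 17850, 17850],
    'Country': ['United Kingdom', 'United Kingdom', 'United Kingdom'],
})

# Group on the invoice-level keys and collect each remaining column into a list.
lists = (df.groupby(['Invoice', 'CustomerID', 'Country'], as_index=False)
           [['NoStockCode', 'Description', 'Quantity']]
           .agg(list))
print(lists)
```

    Each cell of NoStockCode, Description and Quantity then holds a Python list with one entry per original row of the invoice.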

  • 2020-12-25 08:54

    You could use pd.pivot_table with an aggfunc that joins each group's values into one string:

    import pandas as pd
    df = pd.DataFrame({'Country': ['United Kingdom', 'United Kingdom', 'United Kingdom'],
                       'CustomerID': [17850, 17850, 17850],
                       'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                                       'WHITE METAL LANTERN',
                                       'CREAM CUPID HEARTS COAT HANGER'],
                       'Invoice': [536365, 536365, 536365],
                       'NoStockCode': ['85123A', '71053', '84406B'],
                       'Quantity': [6, 6, 8]})
    
    result = pd.pivot_table(df, index=['Invoice','CustomerID','Country'], 
                            values=['NoStockCode','Description','Quantity'], 
                            aggfunc=lambda x: ', '.join(map(str, x)))
    print(result)
    

    yields

                                                                             Description            NoStockCode Quantity
    Invoice CustomerID Country                                                                                          
    536365  17850      United Kingdom  WHITE HANGING HEART T-LIGHT HOLDER, WHITE META...  85123A, 71053, 84406B  6, 6, 8
    

    Note that if the Quantity values are ints, they must be converted to strs before calling ', '.join, which is why map(str, x) is used above.
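
    A minimal sketch of why the conversion is needed, using a hypothetical three-element Series in place of one group's Quantity values: str.join only accepts strings, so joining raw ints raises a TypeError, while mapping str over the values first succeeds.

```python
import pandas as pd

# Hypothetical stand-in for the Quantity values of one invoice.
quantities = pd.Series([6, 6, 8])

try:
    ', '.join(quantities)  # str.join rejects non-string items
except TypeError as exc:
    print('join failed:', exc)

# Converting each value to str first makes the join work.
joined = ', '.join(map(str, quantities))
print(joined)
```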

  • 2020-12-25 08:55

    IIUC

    df.groupby(['Invoice','CustomerID'],as_index=False)[['Description','NoStockCode']].agg(','.join)
    Out[47]: 
       Invoice  CustomerID                                        Description  \
    0   536365       17850  WHITEHANGINGHEARTT-LIGHTHOLDER,WHITEMETALANTER...   
               NoStockCode  
    0  85123A,71053,84406B  
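
    The call above leaves out Quantity, which cannot be joined directly while it holds ints. One possible variation, sketched here under the assumption that the columns are named as in the question's header and using sample rows from another answer on this page, is to cast the value columns to str before grouping:

```python
import pandas as pd

# Sample rows borrowed from another answer on this page (assumed column names).
df = pd.DataFrame({
    'Invoice': [536365, 536365, 536365],
    'CustomerID': [17850, 17850, 17850],
    'Description': ['WHITE HANGING HEART T-LIGHT HOLDER',
                    'WHITE METAL LANTERN',
                    'CREAM CUPID HEARTS COAT HANGER'],
    'NoStockCode': ['85123A', '71053', '84406B'],
    'Quantity': [6, 6, 8],
})

value_cols = ['Description', 'NoStockCode', 'Quantity']

# Cast only the columns to be joined, so the grouping keys keep their dtypes.
as_text = df.astype({col: str for col in value_cols})

joined_df = (as_text.groupby(['Invoice', 'CustomerID'], as_index=False)
                    [value_cols]
                    .agg(','.join))
print(joined_df)
```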
    