Groupby and append lists and strings

被撕碎了的回忆 2021-01-12 02:43

I am trying to group by the values in my "value_1" column. But my last column is made up of lists. When I try to group by using my "value_1" column, the column made up o

2 answers
  •  执笔经年
    2021-01-12 03:27

    Dynamically create a dictionary that maps every column except value_1 and list to a string-joining lambda, then map the list column to a lambda that flattens the lists with a list comprehension:

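    # Join the non-missing strings of a group with ', '
    f1 = lambda x: ', '.join(x.dropna())
    # Alternative: join only the values that are strings
    #f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
    # Flatten the lists of a group into one list
    f2 = lambda x: [z for y in x for z in y]
    # Map every column except 'value_1' and 'list' to f1, and 'list' to f2
    d = dict.fromkeys(df.columns.difference(['value_1','list']), f1)
    d['list'] = f2
    
    df = df.groupby('value_1', as_index=False).agg(d)
    print(df)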
         value_1                 value_2                value_3  \
    0   american  california, nyc, texas         walmart, kmart   
    1   canadian                 toronto  dunkinDonuts, walmart   
    
                                   list  
    0  [supermarket, connivence, state]  
    1             [coffee, supermarket]  
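
The input frame is not shown in the question, so the following is only a minimal sketch for reproducing the result. The input rows are hypothetical, chosen so that they produce the output above, and the names `join_strings`, `flatten`, `agg_map` and `result` are illustrative stand-ins for `f1`, `f2`, `d` and the reassigned `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the original frame is not shown, so these rows are
# an assumption chosen to reproduce the printed result.
df = pd.DataFrame({
    "value_1": ["american", "american", "american", "canadian", "canadian"],
    "value_2": ["california", "nyc", "texas", "toronto", np.nan],
    "value_3": ["walmart", "kmart", np.nan, "dunkinDonuts", "walmart"],
    "list": [["supermarket"], ["connivence"], ["state"],
             ["coffee"], ["supermarket"]],
})


def join_strings(s):
    """Join the non-missing values of one group with ', ' (same as f1)."""
    return ", ".join(s.dropna())


def flatten(s):
    """Concatenate the lists of one group into a single list (same as f2)."""
    return [item for sub in s for item in sub]


# Every column except the grouping key and the list column gets join_strings.
agg_map = dict.fromkeys(df.columns.difference(["value_1", "list"]), join_strings)
agg_map["list"] = flatten

result = df.groupby("value_1", as_index=False).agg(agg_map)
print(result)
```

Building the dictionary with `dict.fromkeys` means new string columns are picked up automatically, at the cost of having to exclude the key and list columns by name.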
    

    Explanation:

    f1 and f2 are lambda functions.

    The first version of f1 removes missing values (if any) and joins the remaining strings with a separator:

    f1 = lambda x: ', '.join(x.dropna())
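
To see why the `dropna()` call matters, here is a small sketch on a hypothetical Series (the sample values are an assumption, not the asker's data). Joining the raw Series fails because the missing value is a float, while `f1` skips it.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value.
s = pd.Series(["california", np.nan, "texas"], dtype=object)

try:
    ", ".join(s)  # NaN is a float, so str.join rejects it
    join_failed = False
except TypeError:
    join_failed = True

f1 = lambda x: ", ".join(x.dropna())
joined = f1(s)
print(join_failed, joined)
```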
    

    The alternative version of f1 keeps only string values (this also omits missing values, because NaN is a float, not a string) and joins them with a separator:

    f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
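
A short sketch of the `isinstance` variant on a hypothetical mixed list (the values are assumptions for illustration). It works on any iterable, including a pandas Series, and it silently drops everything that is not a string, numbers included, which may or may not be what you want.

```python
# Hypothetical mixed values: strings, a NaN, a None and a number.
values = ["walmart", float("nan"), None, 42, "kmart"]

f1 = lambda x: ", ".join([y for y in x if isinstance(y, str)])

joined = f1(values)
print(joined)  # walmart, kmart
```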
    

    A third version of f1 filters out empty strings and joins the remaining string values with a separator:

    f1 = lambda x: ', '.join([y for y in x if y != '']) 
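
One caveat with this variant, sketched below on hypothetical values: `y != ''` is also true for NaN, so a missing value passes the filter and `str.join` then raises a `TypeError`. If a column can contain both empty strings and missing values, combining the two checks is one possible option; `f1_safe` is an illustrative name, not part of the original answer.

```python
f1 = lambda x: ", ".join([y for y in x if y != ""])

# Works when the only unwanted values are empty strings.
only_empty = f1(["walmart", "", "kmart", ""])

# A NaN is not equal to '', so it survives the filter and join fails.
try:
    f1(["walmart", "", float("nan")])
    nan_failed = False
except TypeError:
    nan_failed = True

# Hypothetical combined filter: keep non-empty strings only.
f1_safe = lambda x: ", ".join([y for y in x if isinstance(y, str) and y != ""])
safe = f1_safe(["walmart", "", float("nan"), "kmart"])
print(only_empty, nan_failed, safe)
```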
    

    The function f2 flattens the lists, because collecting a group's list values otherwise gives nested lists like [['a','b'], ['c']]:

    f2 = lambda x: [z for y in x for z in y]
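
The nested comprehension reads left to right like two nested `for` loops. The sketch below applies it to a hypothetical nested list and checks it against `itertools.chain.from_iterable` from the standard library, which does the same job. Note that both assume every element is a list: a bare string would be split into characters, and a NaN would raise a `TypeError`.

```python
from itertools import chain

# Hypothetical group contents: one list per row of the group.
nested = [["a", "b"], ["c"], []]

f2 = lambda x: [z for y in x for z in y]

flat = f2(nested)
flat_chain = list(chain.from_iterable(nested))
print(flat)  # ['a', 'b', 'c']
```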
    
