How do I count the values from a pandas column which is a list of strings?

名媛妹妹 2021-01-19 21:53

I have a dataframe column whose values are lists of strings:

df['colors']

0              ['blue','green','brown']
1              []
2              ['green


        
5 answers
  •  抹茶落季
    2021-01-19 22:22

    Solution

    Best option: df.colors.explode().dropna().value_counts().

    However, if you also want counts for empty lists ([]), use Method-1.B or Method-1.C, similar to what Quang Hoang suggested in the comments.

    You can use either of the following two methods.

    • Method-1: Use pandas methods alone ⭐⭐⭐

      explode --> dropna --> value_counts

    • Method-2: Use list.extend --> pd.Series.value_counts

    ## Method-1
    # A. If you don't want counts for empty []
    df.colors.explode().dropna().value_counts() 
    
    # B. If you want counts for empty [] (classified as NaN)
    df.colors.explode().value_counts(dropna=False) # counts each [] as NaN
    
    # C. If you want counts for empty [] (classified as [])
    df.colors.explode().fillna('[]').value_counts() # counts each [] under the string label '[]'
    
    ## Method-2
    colors = []
    for e in df.colors:
        colors.extend(e)  # extending with an empty list adds nothing
    pd.Series(colors).value_counts()
    

    Output:

    green     2
    blue      2
    brown     2
    red       1
    purple    1
    # NaN     1  ## For Method-1.B
    # []      1  ## For Method-1.C
    dtype: int64
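To see why variants B and C can count empty lists at all, it may help to look at the intermediate result of explode on its own. The sketch below reuses the answer's dummy data; the names exploded, n_missing, counts_a and counts_b are illustrative only and do not appear in the original answer.

```python
import pandas as pd

# Same data as the answer's dummy DataFrame: the row at index 1 is an empty list.
df = pd.DataFrame({'colors': [['blue', 'green', 'brown'],
                              [],
                              ['green', 'red', 'blue'],
                              ['purple'],
                              ['brown']]})

# explode() gives every list element its own row and repeats the original index.
exploded = df['colors'].explode()

# The empty list does not vanish: it is kept as a single NaN row.
n_missing = int(exploded.isna().sum())

# Method-1.A removes that NaN row before counting;
# Method-1.B keeps it, so the result gains one extra NaN entry.
counts_a = exploded.dropna().value_counts()
counts_b = exploded.value_counts(dropna=False)

print(exploded)
print(counts_a)
print(counts_b)
```

Because each empty list becomes exactly one NaN row, the NaN (or '[]') count in variants B and C equals the number of empty lists in the column.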
    

    Dummy Data

    import pandas as pd
    
    df = pd.DataFrame({'colors':[['blue','green','brown'],
                                 [],
                                 ['green','red','blue'],
                                 ['purple'],
                                 ['brown']]})
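
As a quick sanity check on this dummy data, the following sketch runs Method-1.A and Method-2 side by side and confirms that they yield the same counts. The variable names method_1, method_2 and same are illustrative assumptions. The comparison goes through plain dicts because the order of tied values in value_counts output can differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'colors': [['blue', 'green', 'brown'],
                              [],
                              ['green', 'red', 'blue'],
                              ['purple'],
                              ['brown']]})

# Method-1.A: pandas methods only.
method_1 = df['colors'].explode().dropna().value_counts()

# Method-2: flatten the lists with list.extend, then count.
colors = []
for row in df['colors']:
    colors.extend(row)
method_2 = pd.Series(colors).value_counts()

# Compare as dicts so that row order and Series metadata do not matter.
same = method_1.to_dict() == method_2.to_dict()
print(same)
```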
    
