Zero occurrences/frequency using value_counts() in PANDAS

我寻月下人不归 2021-01-08 00:16

I have a table containing dates and the various cars sold on each date, in the following format (these are only 2 of many columns):

DATE       CAR
2012/01/01         


        
2 Answers
  • 2021-01-08 00:24

    The default behavior of the category dtype is exactly what you want: value_counts reports categories with no matching rows with a count of zero. The column being counted must be the categorical one, though. Here that is DATE, so converting only CAR would still leave the missing dates out. You just need to do:

    df.astype({'DATE': 'category'}).loc[df.CAR == 'BMW', 'DATE'].value_counts()
    

    or better yet, make it permanently a category in your dataframe:

    df['DATE'] = df['DATE'].astype('category')
    df.loc[df.CAR == 'BMW', 'DATE'].value_counts()
    

    For a column with a small set of repeated values, the category type is also more space-efficient.
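
A minimal sketch of the categorical approach, for illustration only. The question's table is not shown in full, so the sample rows below are hypothetical. The key assumption is that the column being counted (DATE) is the one converted to category, so that dates without a BMW sale remain known categories after filtering.

```python
import pandas as pd

# Hypothetical sample data; the real table from the question is not available.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "BMW", "BMW", "VW", "BMW", "VW"],
})

# Convert the counted column so every date stays a known category.
df["DATE"] = df["DATE"].astype("category")

# Filtering rows keeps the full category list, so dates with no BMW
# sales are reported with a count of 0 instead of being dropped.
bmw_counts = df.loc[df["CAR"] == "BMW", "DATE"].value_counts().sort_index()
print(bmw_counts)
```

The sort_index() call is optional; without it, value_counts orders the result by count rather than by date.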

  • 2021-01-08 00:50

    You can reindex the result after value_counts and fill the missing values with 0.

    df.loc[df.CAR == 'BMW', 'DATE'].value_counts().reindex(
        df.DATE.unique(), fill_value=0)
    

    Output:

    2012/01/01    2
    2012/01/02    1
    2012/01/03    0
    2012/09/01    1
    2012/09/02    0
    Name: DATE, dtype: int64
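
For a self-contained run of the reindex approach, here is a sketch on hypothetical sample rows (the question's real table is not shown in full). The data was chosen so the counts match the output listed above.

```python
import pandas as pd

# Hypothetical sample data, not the asker's actual table.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "BMW", "BMW", "VW", "BMW", "VW"],
})

# Every date that occurs anywhere in the table, in order of first appearance.
all_dates = df["DATE"].unique()

# Count BMW rows per date, then add back the dates that had no BMW rows.
bmw_counts = (
    df.loc[df["CAR"] == "BMW", "DATE"]
    .value_counts()
    .reindex(all_dates, fill_value=0)
)
print(bmw_counts)
```

Note that reindexing against df.DATE.unique() only restores dates that appear somewhere in the table; a date with no sales of any car would still be absent.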
    

    Instead of value_counts, you could also compare CAR to 'BMW' and sum the resulting flags grouped by date, which keeps every date in the result.

    df['CAR'].eq('BMW').astype(int).groupby(df['DATE']).sum()
    

    Output:

    DATE
    2012/01/01    2
    2012/01/02    1
    2012/01/03    0
    2012/09/01    1
    2012/09/02    0
    Name: CAR, dtype: int32
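
As a quick sanity check, the sketch below runs the reindex approach and the groupby approach on the same hypothetical sample rows (not the asker's data) and confirms they give the same per-date counts.

```python
import pandas as pd

# Hypothetical sample data, not the asker's actual table.
df = pd.DataFrame({
    "DATE": ["2012/01/01", "2012/01/01", "2012/01/02",
             "2012/01/03", "2012/09/01", "2012/09/02"],
    "CAR": ["BMW", "BMW", "BMW", "VW", "BMW", "VW"],
})

# Approach 1: count BMW rows, then reindex to restore dates with no BMW.
by_reindex = (
    df.loc[df["CAR"] == "BMW", "DATE"]
    .value_counts()
    .reindex(df["DATE"].unique(), fill_value=0)
)

# Approach 2: flag BMW rows as 1/0 and sum the flags within each date.
by_groupby = df["CAR"].eq("BMW").astype(int).groupby(df["DATE"]).sum()

# Both approaches should agree on the count for every date.
print(by_reindex.to_dict() == by_groupby.to_dict())
```

The groupby version needs no second pass over the dates, because every row contributes a 1 or a 0 to its date's group.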
    