Counting frequency of values by date using pandas

一个人的身影 2020-12-28 16:00

Suppose I have the following time series:

Timestamp              Category
2014-10-16 15:05:17    Facebook
2014-10-16 14:56:37    Vimeo
2014-10-16 14:25:16         

3 Answers
  • 2020-12-28 16:24

    To be a bit clearer: you do not need to create a new column called 'week_num' first.

    df.groupby(by=lambda x: "%d/%d" % (x.week, x.year)).Category.value_counts()
    

    The function passed as by is called on each timestamp in the index and converts it to a week/year string; the rows are then grouped by that string. This requires the timestamps to be the index of the DataFrame.
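
As a rough, self-contained sketch of the approach above: the sample rows below are made up for illustration (only the first two appear in the question), and the timestamps are assumed to already be the DataFrame index. Note that week and year are attributes of a pandas Timestamp, not methods.

```python
import pandas as pd

# Hypothetical sample data; the timestamps form the index.
df = pd.DataFrame(
    {"Category": ["Facebook", "Vimeo", "Facebook", "Youtube"]},
    index=pd.to_datetime([
        "2014-10-16 15:05:17",
        "2014-10-16 14:56:37",
        "2014-10-16 14:25:16",
        "2014-10-09 10:00:00",
    ]),
)

# groupby calls the function on each index label (a Timestamp) and
# groups rows by the returned "week/year" string.
counts = df.groupby(lambda ts: "%d/%d" % (ts.week, ts.year))["Category"].value_counts()
print(counts)
```

The result is a Series with a two-level index (week/year string, category), so a single count can be looked up with counts.loc[("42/2014", "Facebook")].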

  • 2020-12-28 16:32

    It might be easiest to turn your Series into a DataFrame and use Pandas' groupby functionality (if you already have a DataFrame then skip straight to adding another column below).

    If your Series is called s, then turn it into a DataFrame like so:

    >>> df = pd.DataFrame({'Timestamp': s.index, 'Category': s.values})
    >>> df
                 Timestamp     Category
    0  2014-10-16 15:05:17     Facebook
    1  2014-10-16 14:56:37        Vimeo
    2  2014-10-16 14:25:16     Facebook
    ...
    

    Now add another column for the week and year (one way is to use apply and generate a string of the week/year numbers):

    >>> df['Week/Year'] = df['Timestamp'].apply(lambda x: "%d/%d" % (x.week, x.year))
    >>> df
                 Timestamp     Category Week/Year
    0  2014-10-16 15:05:17     Facebook   42/2014
    1  2014-10-16 14:56:37        Vimeo   42/2014
    2  2014-10-16 14:25:16     Facebook   42/2014
    ...
    

    Finally, group by 'Week/Year' and 'Category' and aggregate with size() to get the counts. For the data in your question this produces the following:

    >>> df.groupby(['Week/Year', 'Category']).size()
    Week/Year  Category   
    41/2014    DailyMotion    1
               Facebook       3
               Vimeo          2
               Youtube        3
    42/2014    Facebook       7
               Orkut          1
               Vimeo          1
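
One possible refinement, offered as a sketch rather than as part of the answer above: x.week is the ISO week number, so pairing it with the calendar year can mislabel dates around New Year (2014-12-29 falls in ISO week 1 of 2015, but its calendar year is 2014). Assuming pandas 1.1 or later, Series.dt.isocalendar() returns the matching ISO year and week together. The sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, including one date near the year boundary.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2014-10-16 15:05:17",
        "2014-10-16 14:56:37",
        "2014-10-16 14:25:16",
        "2014-12-29 09:00:00",
    ]),
    "Category": ["Facebook", "Vimeo", "Facebook", "Facebook"],
})

# isocalendar() returns a DataFrame with columns year, week and day.
iso = df["Timestamp"].dt.isocalendar()
df["Week/Year"] = iso["week"].astype(str) + "/" + iso["year"].astype(str)

# Count rows per (week/year, category) pair.
counts = df.groupby(["Week/Year", "Category"]).size()
print(counts)
```

With this labelling the 2014-12-29 row is counted under "1/2015" rather than "1/2014".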
    
  • 2020-12-28 16:35

    Convert your Timestamp column to a week number, then group by that week number and call value_counts() on the categorical column, like so:

    df.groupby('week_num').Category.value_counts()
    

    This assumes that a new column, week_num, has already been created from the Timestamp column.
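
A minimal sketch of how the week_num column might be created, assuming the data is a DataFrame with a datetime column named Timestamp and that all rows fall within a single year (otherwise the same week number from different years would be merged). The sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with a datetime column.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2014-10-16 15:05:17",
        "2014-10-16 14:56:37",
        "2014-10-16 14:25:16",
        "2014-10-09 10:00:00",
    ]),
    "Category": ["Facebook", "Vimeo", "Facebook", "Youtube"],
})

# ISO week number of each timestamp, stored as a plain integer column.
df["week_num"] = df["Timestamp"].dt.isocalendar().week.astype(int)

# Count how often each category occurs within each week.
counts = df.groupby("week_num").Category.value_counts()
print(counts)
```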
