Let's suppose I have the following time series:
Timestamp Category
2014-10-16 15:05:17 Facebook
2014-10-16 14:56:37 Vimeo
2014-10-16 14:25:16 Facebook
To be a little clearer: you do not need to create a new column called 'week_num' first.
df.groupby(by=lambda x: "%d/%d" % (x.week, x.year)).Category.value_counts()
When by is a function, groupby calls it on each value of the index. Provided the index holds timestamps, each one is converted to a week/year string and the rows are grouped by that string. Note that week and year are attributes of a Timestamp, not methods, so they take no parentheses.
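As a minimal sketch of the index-based grouping described above, the snippet below builds a small DataFrame with a DatetimeIndex and groups it with a function. The sample rows and the helper name week_year are made up for illustration; only the groupby call mirrors the answer.

```python
import pandas as pd

# Hypothetical sample data: a DataFrame indexed by timestamp.
df = pd.DataFrame(
    {"Category": ["Facebook", "Vimeo", "Facebook", "Youtube"]},
    index=pd.to_datetime([
        "2014-10-16 15:05:17",
        "2014-10-16 14:56:37",
        "2014-10-16 14:25:16",
        "2014-10-09 10:00:00",
    ]),
)

def week_year(ts):
    # ts is a single pandas Timestamp taken from the index;
    # week and year are attributes, so no parentheses.
    return "%d/%d" % (ts.week, ts.year)

# groupby calls week_year on every index value and groups by the result.
counts = df.groupby(by=week_year).Category.value_counts()
print(counts)
```

This only works when the index holds timestamps. If the timestamps live in an ordinary column, either set that column as the index first or group by a derived column instead.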
It might be easiest to turn your Series into a DataFrame and use pandas' groupby functionality (if you already have a DataFrame, skip straight to adding another column below).
If your Series is called s, then turn it into a DataFrame like so:
>>> df = pd.DataFrame({'Timestamp': s.index, 'Category': s.values})
>>> df
Timestamp Category
0 2014-10-16 15:05:17 Facebook
1 2014-10-16 14:56:37 Vimeo
2 2014-10-16 14:25:16 Facebook
...
Now add another column for the week and year (one way is to use apply and generate a string of the week/year numbers):
>>> df['Week/Year'] = df['Timestamp'].apply(lambda x: "%d/%d" % (x.week, x.year))
>>> df
Timestamp Category Week/Year
0 2014-10-16 15:05:17 Facebook 42/2014
1 2014-10-16 14:56:37 Vimeo 42/2014
2 2014-10-16 14:25:16 Facebook 42/2014
...
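One caveat with pairing x.week and x.year: week is the ISO week number while year is the calendar year, so dates around New Year can end up with a mismatched label (2014-12-29 belongs to ISO week 1 of 2015, but would be labelled 1/2014). A possible alternative, assuming pandas 1.1 or later where Series.dt.isocalendar() is available, is to take both numbers from the ISO calendar. The sample rows here are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, including a date close to the year boundary.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2014-10-16 15:05:17", "2014-12-29 09:00:00"]),
    "Category": ["Facebook", "Vimeo"],
})

# isocalendar() returns a DataFrame with columns year, week and day,
# all taken from the ISO calendar so week and year stay consistent.
iso = df["Timestamp"].dt.isocalendar()
df["Week/Year"] = iso["week"].astype(str) + "/" + iso["year"].astype(str)
print(df)
```

The resulting column can be used in the groupby step in exactly the same way as the string built with apply.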
Finally, group by 'Week/Year' and 'Category' and aggregate with size() to get the counts. For the data in your question this produces the following:
>>> df.groupby(['Week/Year', 'Category']).size()
Week/Year Category
41/2014 DailyMotion 1
Facebook 3
Vimeo 2
Youtube 3
42/2014 Facebook 7
Orkut 1
Vimeo 1
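For reference, here is a self-contained sketch of the whole pipeline that can be pasted into a script. The Series s below is a small invented sample, not the data from the question. The final unstack step is optional and just one way to turn the counts into a week-by-category table.

```python
import pandas as pd

# Hypothetical stand-in for the Series in the question:
# timestamps as the index, category names as the values.
s = pd.Series(
    ["Facebook", "Vimeo", "Facebook", "Youtube"],
    index=pd.to_datetime([
        "2014-10-16 15:05:17",
        "2014-10-16 14:56:37",
        "2014-10-16 14:25:16",
        "2014-10-09 10:00:00",
    ]),
)

# Step 1: Series -> DataFrame.
df = pd.DataFrame({"Timestamp": s.index, "Category": s.values})

# Step 2: add a week/year label for each row.
df["Week/Year"] = df["Timestamp"].apply(lambda x: "%d/%d" % (x.week, x.year))

# Step 3: count rows per (week/year, category) pair.
counts = df.groupby(["Week/Year", "Category"]).size()
print(counts)

# Optional: one row per week, one column per category, 0 where absent.
table = counts.unstack(fill_value=0)
print(table)
```

The result of size() is a Series with a two-level index, so a single count can be looked up with counts.loc[("42/2014", "Facebook")].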
Convert your Timestamp column to a week number, then group by that week number and call value_counts on the categorical variable, like so:
df.groupby('week_num').Category.value_counts()
Here I have assumed that a new column, week_num, was created from the Timestamp column.
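The answer does not say how week_num is built, so the following is only one possible way, assuming pandas 1.1 or later and a datetime-typed Timestamp column. The sample rows are invented.

```python
import pandas as pd

# Hypothetical sample data with timestamps in an ordinary column.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2014-10-16 15:05:17",
        "2014-10-16 14:56:37",
        "2014-10-16 14:25:16",
        "2014-10-09 10:00:00",
    ]),
    "Category": ["Facebook", "Vimeo", "Facebook", "Youtube"],
})

# Assumed construction of week_num: the ISO week number as a plain integer.
df["week_num"] = df["Timestamp"].dt.isocalendar().week.astype(int)

counts = df.groupby("week_num").Category.value_counts()
print(counts)
```

Grouping on the week number alone pools the same week from different years, so if the data spans more than one year it is safer to group on the year as well.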