I currently have a pandas Series whose values are Timestamps (dtype datetime64[ns]), and I want to group it by date, so that each group contains many rows with different times.
Three methods:
DataFrame: df.groupby(['column']).size()
Series: sel.groupby(sel).size()
Series to DataFrame: pd.DataFrame(sel, columns=['column']).groupby(['column']).size()
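For example, here is a minimal sketch with made-up data, assuming sel is a Series of Timestamps. Note that the Series form groups by the exact timestamp value; if you want one group per calendar date (as in the question), one option is to group by sel.dt.date instead:

import pandas as pd

sel = pd.Series(pd.to_datetime([
    "2021-01-01 09:00", "2021-01-01 17:30", "2021-01-02 08:15",
]))

# group by the full timestamp value and count rows per group
print(sel.groupby(sel).size())

# group by calendar date instead, so both 2021-01-01 rows land in one group
print(sel.groupby(sel.dt.date).size())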
You should convert it to a DataFrame, add a column holding each timestamp's date(), and then do the groupby on the DataFrame using that date column.

import pandas

df = pandas.DataFrame(s, columns=["datetime"])
# extract the calendar date from each Timestamp
df["date"] = df["datetime"].apply(lambda x: x.date())
df.groupby("date")

"date" then becomes your group key (and the index of any aggregated result). You have to do it this way because the grouped object needs a key like this so you can do things like select a single group.
To add another suggestion, I often use the following as it uses simple logic:
pd.Series(index=s.values).groupby(level=0)
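As a quick sketch of what that gives you (again with made-up Timestamps), counting rows per index value looks like this; as with the other value-based approaches, the group keys here are the full timestamps, not calendar dates:

import pandas as pd

s = pd.Series(pd.to_datetime(["2021-01-01 09:00", "2021-01-01 17:30", "2021-01-02 08:15"]))
# move the timestamps into the index, then group on that index level
print(pd.Series(index=s.values).groupby(level=0).size())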
grouped = s.groupby(s)
Or:
grouped = s.groupby(lambda x: s[x])
For anyone else who wants to do this inline without throwing a lambda in (which tends to kill performance):
s.to_frame(0).groupby(0)[0]
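For instance (a minimal sketch with made-up Timestamps), the resulting grouped object can be aggregated directly:

import pandas as pd

s = pd.Series(pd.to_datetime(["2021-01-01 09:00", "2021-01-01 17:30", "2021-01-02 08:15"]))
# same grouping as s.groupby(s), but without a per-row lambda lookup
print(s.to_frame(0).groupby(0)[0].size())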