Question
I have a DataFrame with the messages from a WhatsApp chat, along with the sender and the time of each message in datetime format.
| Time | Sender | Message |
|---|---|---|
| 2020-12-21 22:23:00 | Sender 1 | "..." |
| 2020-12-21 22:26:00 | Sender 2 | "..." |
| 2020-12-21 22:35:00 | Sender 1 | "..." |
I can plot the histogram with sns.histplot(df["Time"], bins=48)
However, the ticks on the x-axis don't make much sense. I end up with 30 ticks where I expected 24, and every tick label shows the whole date plus the time, whereas I only want the time in "%H:%M" format.
Where do the wrong ticks come from?
Thanks!
Answer 1:
Both seaborn and pandas use matplotlib for their plotting functions. Let's check which of the three returns the bin values, since we need them to adapt the x-ticks:
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15, 5))
#fake data generation
np.random.seed(1234)
n=20
start = pd.to_datetime("2020-11-15")
df = pd.DataFrame({"Time": pd.to_timedelta(np.random.rand(n), unit="D") + start, "A": np.random.randint(1, 100, n)})
#print(df)
#pandas histogram plotting function, left
pd_g = df["Time"].hist(bins=5, xrot=90, ax=ax1)
#no bin information
print(pd_g)
ax1.set_title("Pandas")
#seaborn histogram plotting, middle
sns_g = sns.histplot(df["Time"], bins=5, ax=ax2)
ax2.tick_params(axis="x", labelrotation=90)
#no bin information
print(sns_g)
ax2.set_title("Seaborn")
#matplotlib histogram, right
mpl_g = ax3.hist(df["Time"], bins=5, edgecolor="white")
ax3.tick_params(axis="x", labelrotation=90)
#hooray, bin information, alas in floats representing dates
print(mpl_g)
ax3.set_title("Matplotlib")
plt.tight_layout()
plt.show()
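Sample output (figure): three histograms side by side, titled "Pandas", "Seaborn", and "Matplotlib".
From this exercise we can conclude that all three rely on the same underlying routine, but only the matplotlib call returns the bin values. So we can use matplotlib directly: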
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.dates import num2date
fig, ax = plt.subplots(figsize=(8, 5))
#fake data generation
np.random.seed(1234)
n=20
start = pd.to_datetime("2020-11-15")
df = pd.DataFrame({"Time": pd.to_timedelta(np.random.rand(n), unit="D") + start, "A": np.random.randint(1, 100, n)})
#plots histogram, returns counts, bin border values, and the bars themselves
h_vals, h_bins, h_bars = ax.hist(df["Time"], bins=5, edgecolor="white")
#plot x ticks at the place where the bin borders are
ax.set_xticks(h_bins)
#label them with dates in HH:MM format after conversion of the float values that matplotlib uses internally
ax.set_xticklabels([num2date(curr_bin).strftime("%H:%M") for curr_bin in h_bins])
plt.show()
Sample output (figure): histogram with x-ticks on the bin borders, labelled in HH:MM format.
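As a variation on the code above, the labels do not have to be built by hand: a matplotlib.dates.DateFormatter attached to the x-axis can do the float-to-time conversion. The following is only a sketch. The function name hist_with_time_labels is made up for illustration, and the explicit date2num conversion is my own choice so that the bin borders are guaranteed to be matplotlib date numbers.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates


def hist_with_time_labels(times, bins=5, ax=None):
    """Histogram of datetimes with ticks on the bin borders, labelled HH:MM.

    Hypothetical helper, not part of pandas, seaborn, or matplotlib.
    """
    if ax is None:
        _, ax = plt.subplots(figsize=(8, 5))
    # assumption: convert explicitly to matplotlib's float date numbers,
    # so the returned bin borders are floats on the same scale
    values = mdates.date2num(pd.to_datetime(times).to_numpy())
    counts, edges, _ = ax.hist(values, bins=bins, edgecolor="white")
    # one tick per bin border
    ax.set_xticks(edges)
    # let the formatter turn each float back into a time string
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    return counts, edges, ax
```

With the fake data from above, hist_with_time_labels(df["Time"], bins=5) followed by plt.show() should give the same kind of plot as the manual set_xticklabels version.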
Seaborn and pandas make life easier because they provide convenience wrappers and some additional functionality for commonly used plotting functions. However, when the parameters they offer are not sufficient, one often has to fall back on matplotlib, which is more flexible. There may well be an easier way in pandas or seaborn that I am not aware of. I will happily upvote any better suggestion using these libraries.
Source: https://stackoverflow.com/questions/65391372/axis-ticks-in-histogram-of-times-in-matplotlib-seaborn
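If the goal is a distribution over the time of day, which the 48 bins and 24 expected ticks in the question suggest but do not state, one possible approach is to discard the date before plotting. This sketch rests on that assumption. The function name time_of_day_hist is hypothetical, and the chat data is replaced by whatever datetimes are passed in.

```python
import pandas as pd
from matplotlib import pyplot as plt


def time_of_day_hist(times, bins=48, ax=None):
    """Histogram over the time of day, ignoring the date.

    Hypothetical helper; assumes only the clock time matters.
    """
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 4))
    t = pd.to_datetime(pd.Series(times))
    # fractional hours since midnight, so the date part is dropped
    hours = t.dt.hour + t.dt.minute / 60 + t.dt.second / 3600
    # fixed range 0..24 h, so 48 bins are exactly half an hour wide
    counts, edges, _ = ax.hist(hours, bins=bins, range=(0, 24), edgecolor="white")
    # one tick per full hour: 24 ticks, labelled HH:MM
    hour_ticks = list(range(24))
    ax.set_xticks(hour_ticks)
    ax.set_xticklabels([f"{h:02d}:00" for h in hour_ticks], rotation=90)
    ax.set_xlim(0, 24)
    return counts, edges, ax
```

Calling time_of_day_hist(df["Time"]) and then plt.show() would draw it. The same hours series could presumably be passed to sns.histplot with bins=48 and binrange=(0, 24) to stay within seaborn, with the tick handling unchanged.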