Question
I have data points whose abscissas are datetime.datetime objects with a time zone (their tzinfo happens to be a bson.tz_util.FixedOffset obtained through MongoDB).
When I plot them with scatter(), what is the time zone of the tick labels?
Changing the timezone in matplotlibrc does not change anything in the displayed plot (I must have misunderstood the discussion on time zones in the Matplotlib documentation).
I experimented a little with plot() (instead of scatter()). When given a single date, it plots it and ignores the time zone. However, when given multiple dates, it uses a fixed time zone, but how is it determined? I can't find anything in the documentation.
Finally, is plot_date() supposed to be the solution to these time zone problems?
Answer 1:
The question was already more or less answered in the comments. However, I was still struggling with timezones myself, so to get it clear I tried all the combinations. I think there are two main approaches, depending on whether your datetime objects are already in the desired timezone or in a different one; I have tried to describe them below. It is possible that I still missed or mixed up something.
Timestamps (datetime objects): in UTC. Desired display: in a specific timezone.
- Set xaxis_date() to your desired display timezone (it defaults to rcParams['timezone'], which was UTC for me)
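As a rough illustration of this first approach, the sketch below plots naive datetimes (which matplotlib reads as UTC) and asks for the labels in another zone. The fixed UTC-8 offset, the sample date, and the variable names are assumptions chosen for the example; a named zone from dateutil or zoneinfo could be used instead. An explicit DateFormatter is also attached so that the resulting label can be inspected directly.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Assumption: a fixed UTC-8 offset stands in for the desired display zone.
display_tz = datetime.timezone(datetime.timedelta(hours=-8), "UTC-8")

# Naive datetimes; matplotlib interprets them as UTC when converting to numbers.
start = datetime.datetime(2021, 1, 15)
times = [start + datetime.timedelta(hours=i) for i in range(24)]
values = list(range(24))

fig, ax = plt.subplots()
ax.scatter(times, values)
ax.xaxis_date(display_tz)  # ask the date axis to use the display zone

# An explicit, timezone-aware formatter makes the conversion easy to check.
formatter = mdates.DateFormatter("%H:%M (%d)", tz=display_tz)
ax.xaxis.set_major_formatter(formatter)

# Label that the first data point (00:00 UTC on the 15th) would receive.
first_label = formatter(mdates.date2num(times[0]))
print(first_label)
```

Midnight UTC is rendered as 16:00 on the previous day, which is the eight-hour shift expected for this offset.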
Timestamps (datetime objects): in a specific timezone. Desired display: in a different specific timezone.
- Feed your plot function datetime objects with the corresponding timezone (tzinfo=)
- Set rcParams['timezone'] to your desired display timezone
- Use a DateFormatter (even if you are satisfied with the default format, because the formatter is timezone aware)
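The second approach can be sketched without drawing anything, using only matplotlib.dates. The two fixed offsets below (UTC+1 as the source zone, UTC-8 as the display zone) and the sample date are assumptions for illustration. The point is that date2num() reduces an aware datetime to the same number as the equivalent UTC instant, and that the tz given to the DateFormatter decides how that number is written out.

```python
import datetime

import matplotlib.dates as mdates

# Assumptions: fixed offsets stand in for the source and display zones.
source_tz = datetime.timezone(datetime.timedelta(hours=1))
display_tz = datetime.timezone(datetime.timedelta(hours=-8))

# 12:00 at UTC+1 is the same instant as 11:00 UTC.
aware = datetime.datetime(2021, 1, 15, 12, 0, tzinfo=source_tz)
same_instant_utc = datetime.datetime(2021, 1, 15, 11, 0)  # naive, read as UTC

x = mdates.date2num(aware)
x_utc = mdates.date2num(same_instant_utc)

# The formatter converts the number back to a datetime in display_tz.
label = mdates.DateFormatter("%H:%M %z", tz=display_tz)(x)
print(x == x_utc, label)
```

Passing the same formatter to ax.xaxis.set_major_formatter() applies this conversion to every tick label.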
If you are using plot_date() you can also pass in the tz keyword, but scatter() does not accept it.
When your source data contains Unix timestamps, choose deliberately between datetime.datetime.utcfromtimestamp() (which returns the UTC time) and datetime.datetime.fromtimestamp() (which, without a tz argument, returns the local time) if you are going to use matplotlib's timezone capabilities.
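The difference between the two conversions can be seen with the standard library alone. This is a minimal sketch; the timestamp value is arbitrary, and fromtimestamp() with tz=datetime.timezone.utc is used here in place of utcfromtimestamp(), which is deprecated as of Python 3.12 but gives the same wall-clock values.

```python
import datetime

ts = 1_600_000_000  # arbitrary Unix timestamp (2020-09-13 12:26:40 UTC)
utc = datetime.timezone.utc

# Aware datetime in UTC: unambiguous, and safe to hand to matplotlib.
aware_utc = datetime.datetime.fromtimestamp(ts, tz=utc)

# Naive datetime carrying UTC wall-clock values (what utcfromtimestamp() returns).
naive_utc = aware_utc.replace(tzinfo=None)

# Naive datetime carrying *local* wall-clock values; matplotlib would still
# read it as UTC, so labels shift by the local UTC offset.
naive_local = datetime.datetime.fromtimestamp(ts)

print(aware_utc.isoformat())
print(naive_utc.isoformat())
print(naive_local.isoformat())
```

On a machine whose system timezone is not UTC, the last line differs from the second, which is exactly the offset that would appear as an unexplained shift in the plot.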
This is the experimenting I did (on scatter() in this case). It may be a bit hard to follow, but I have written it down here for anyone who cares. Notice at what time the first dots appear (the x axis does not start at the same time for each subplot):
Source code:
import time,datetime,matplotlib
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates
from dateutil import tz
# y values
data = np.arange(24)
# Create a datetime object from the Unix timestamp 0 (epoch = 00:00, 1 Jan 1970 UTC).
start = datetime.datetime.fromtimestamp(0)
# This is the local datetime (depending on your system timezone)
# corresponding to the epoch, and it has no timezone attached
# (standard Python behaviour).
# If your data comes as Unix timestamps and you are going to work with
# matplotlib timezone conversions, you had better use this function:
start = datetime.datetime.utcfromtimestamp(0)
timestamps = np.array([start + datetime.timedelta(hours=i) for i in range(24)])
# Now add a timezone to those timestamps (US/Pacific, UTC-8). Be aware that this
# does not create the same set of times; they do not coincide.
timestamps_tz = np.array([
    start.replace(tzinfo=tz.gettz('US/Pacific')) + datetime.timedelta(hours=i)
    for i in range(24)])
fig = plt.figure(figsize=(10.0, 15.0))
#now plot all variations
plt.subplot(711)
plt.scatter(timestamps, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().set_title("1 - tzinfo NO, xaxis_date = NO, formatter=NO")
plt.subplot(712)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().set_title("2 - tzinfo YES, xaxis_date = NO, formatter=NO")
plt.subplot(713)
plt.scatter(timestamps, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().set_title("3 - tzinfo NO, xaxis_date = YES, formatter=NO")
plt.subplot(714)
plt.scatter(timestamps, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M(%d)'))
plt.gca().set_title("4 - tzinfo NO, xaxis_date = YES, formatter=YES")
plt.subplot(715)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().set_title("5 - tzinfo YES, xaxis_date = YES, formatter=NO")
plt.subplot(716)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().set_title("6 - tzinfo YES, xaxis_date = NO, formatter=YES")
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M(%d)'))
plt.subplot(717)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().set_title("7 - tzinfo YES, xaxis_date = YES, formatter=YES")
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M(%d)'))
fig.tight_layout(pad=4)
plt.subplots_adjust(top=0.90)
plt.suptitle(
    'Matplotlib {} with rcParams["timezone"] = {}, system timezone {}'
    .format(matplotlib.__version__, matplotlib.rcParams["timezone"], time.tzname))
plt.show()
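The comment in the script about the two sets of times not coinciding can be checked with the standard library. In this sketch a fixed UTC-8 offset is assumed as a stand-in for US/Pacific in winter. Attaching a zone with replace() keeps the wall-clock values and therefore moves the instant, whereas astimezone() keeps the instant and changes the wall-clock values.

```python
import datetime

utc = datetime.timezone.utc
# Assumption: fixed UTC-8 offset as a stand-in for US/Pacific in January.
pacific_like = datetime.timezone(datetime.timedelta(hours=-8))

naive = datetime.datetime(1970, 1, 1, 0, 0)  # read as UTC by matplotlib

# Same wall-clock time, different instant: 00:00 at UTC-8 is 08:00 UTC.
relabelled = naive.replace(tzinfo=pacific_like)

# Same instant, different wall-clock time: 16:00 on 31 Dec 1969 at UTC-8.
converted = naive.replace(tzinfo=utc).astimezone(pacific_like)

shift = relabelled - naive.replace(tzinfo=utc)
print(shift, converted.isoformat())
```

This eight-hour shift is why the first dots do not appear at the same position in the subplots that use timestamps and those that use timestamps_tz.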
Answer 2:
If, like me, you are coming to this question while trying to get a timezone-aware pandas DataFrame to plot correctly, @pseyfert's comment to use a formatter with a timezone is also right on the money. Here is an example for DataFrame.plot(), showing some points during the transition from EST to EDT:
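import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# pd.date_range replaces the DatetimeIndex(start=..., freq=..., periods=...)
# constructor form, which recent pandas versions no longer accept.
df = pd.DataFrame(
    dict(y=np.random.normal(size=5)),
    index=pd.date_range(
        start='2018-03-11 01:30',
        freq='15min',
        periods=5,
        tz='US/Eastern'))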
Notice how the timezone abbreviation changes as we transition to daylight saving time:
> [f'{t:%T %Z}' for t in df.index]
['01:30:00 EST',
'01:45:00 EST',
'03:00:00 EDT',
'03:15:00 EDT',
'03:30:00 EDT']
Now, plot it:
df.plot(style='-o')
formatter = mdates.DateFormatter('%m/%d %T %Z', tz=df.index.tz)
plt.gca().xaxis.set_major_formatter(formatter)
plt.show()
PS:
I am not sure why some of the dates (the EST ones) look like they are in bold, but presumably matplotlib's internals render the labels more than once and the position changes by a pixel or two. The following confirms that the formatter is called several times for the same timestamps:
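class Foo(mdates.DateFormatter):
    # Recent Matplotlib versions format ticks through __call__; the strftime()
    # hook used by older releases is no longer invoked, so __call__ is wrapped.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __call__(self, x, pos=0):
        s = super().__call__(x, pos)
        print(f'out={s} for x={x}, pos={pos}')
        return s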
And check out the output of:
df.plot(style='-o')
formatter = Foo('%F %T %Z', tz=df.index.tz)
plt.gca().xaxis.set_major_formatter(formatter)
plt.show()
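A variation on this check counts the calls instead of printing them, which is easier to inspect programmatically. This is only a sketch: CountingFormatter is a hypothetical helper name, the data are made-up UTC timestamps rather than the DataFrame above, and the Agg backend is used so that the figure can be rendered without a window.

```python
import collections
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


class CountingFormatter(mdates.DateFormatter):
    """Hypothetical helper: records how often each tick value is formatted."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = collections.Counter()

    def __call__(self, x, pos=0):
        self.calls[float(x)] += 1
        return super().__call__(x, pos)


utc = datetime.timezone.utc
start = datetime.datetime(2018, 3, 11, 6, 30, tzinfo=utc)
times = [start + datetime.timedelta(minutes=15 * i) for i in range(5)]

fig, ax = plt.subplots()
ax.plot(times, list(range(5)), "-o")
formatter = CountingFormatter("%H:%M %Z", tz=utc)
ax.xaxis.set_major_formatter(formatter)
fig.canvas.draw()  # force a render so the tick labels are actually formatted

total_calls = sum(formatter.calls.values())
print(total_calls, dict(formatter.calls))
```

Any tick value whose count is greater than one was formatted more than once during the render; how often that happens may depend on the Matplotlib version and backend.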
Source: https://stackoverflow.com/questions/22255356/how-to-handle-times-with-a-time-zone-in-matplotlib