Question
import pandas as pd
index = pd.to_datetime(['2016-05-01', '2016-11-01', '2017-05-02'])
data = pd.DataFrame({'a': [1, 2, 3],
                     'b': [4, 5, 6]}, index=index)
ax = data.plot()
print(ax.get_xlim())
# Out: (736066.7, 736469.3)
Now, if we change the last date from 2017-05-02 to 2017-05-01:
index = pd.to_datetime(['2016-05-01', '2016-11-01', '2017-05-01'])
data = pd.DataFrame({'a': [1, 2, 3],
                     'b': [4, 5, 6]}, index=index)
ax = data.plot()
print(ax.get_xlim())
# Out: (184.8, 189.2)
The first example seems consistent with the matplotlib docs:
Matplotlib represents dates using floating point numbers specifying the number of days since 0001-01-01 UTC, plus 1
Why does the second example return something seemingly completely different? I'm using pandas version 0.22.0 and matplotlib version 2.2.2.
Answer 1:
Pandas uses different units to represent dates and times on the axes, depending on the range of dates/times in use. This means that different locators are in use.
In the first case,
print(ax.xaxis.get_major_locator())
# Out: pandas.plotting._converter.PandasAutoDateLocator
In the second case,
print(ax.xaxis.get_major_locator())
# Out: pandas.plotting._converter.TimeSeries_DateLocator
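One way to see why the two indexes are treated differently is to check whether pandas can infer a regular frequency from them. This is only a sketch, under the assumption that the period-based axis is chosen when a frequency can be inferred from the index; the names irregular and regular are just illustrative.

```python
import pandas as pd

# Index from the first example: the last date is one day off the 6-month grid
irregular = pd.to_datetime(['2016-05-01', '2016-11-01', '2017-05-02'])
# Index from the second example: dates are exactly six months apart
regular = pd.to_datetime(['2016-05-01', '2016-11-01', '2017-05-01'])

# inferred_freq is None when no regular frequency fits the index
print(irregular.inferred_freq)
print(regular.inferred_freq)
```

If the second index reports a quarter-based frequency while the first reports None, that would match the two different locators seen above.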
You can force pandas to always use the PandasAutoDateLocator with the x_compat argument:
data.plot(x_compat=True)
This ensures that you always get the same datetime definition, consistent with the matplotlib.dates convention.
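To check this, here is a minimal sketch that plots the second example with x_compat=True and converts the axis limits back to datetimes with matplotlib.dates.num2date. The Agg backend is only there so it runs without a display; the raw numbers returned by get_xlim() depend on the matplotlib version, so the sketch looks at the converted dates instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.dates as mdates
import pandas as pd

index = pd.to_datetime(['2016-05-01', '2016-11-01', '2017-05-01'])
data = pd.DataFrame({'a': [1, 2, 3],
                     'b': [4, 5, 6]}, index=index)

ax = data.plot(x_compat=True)
lo, hi = ax.get_xlim()

# With x_compat=True the limits are matplotlib date numbers,
# so num2date should give datetimes just outside the data range
print(mdates.num2date(lo), mdates.num2date(hi))
```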
The drawback is that this removes the nice quarterly ticking and replaces it with the standard ticking. On the other hand, it lets you use the very customizable matplotlib.dates tickers and formatters. For example, to get quarterly ticks and labels:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mticker
import pandas as pd
index = pd.to_datetime(['2016-05-01', '2016-11-01', '2017-05-01'])
data = pd.DataFrame({'a': [1, 2, 3],
                     'b': [4, 5, 6]}, index=index)
ax = data.plot(x_compat=True)
# Quarterly ticks: one major tick at the start of each quarter
ax.xaxis.set_major_locator(mdates.MonthLocator((1, 4, 7, 10)))
# Formatting: label each tick "Q1".."Q4", adding the year below Q1
def func(x, pos):
    date = mdates.num2date(x)
    q = (date.month - 1) // 3 + 1
    tx = "Q{}".format(q)
    if q == 1:
        tx += "\n{}".format(date.year)
    return tx
ax.xaxis.set_major_formatter(mticker.FuncFormatter(func))
plt.setp(ax.get_xticklabels(), rotation=0, ha="center")
plt.show()
Answer 2:
In the second example, if you look at the plot, matplotlib is showing quarter values on the x-axis rather than dates.
The dates in this case are exactly six months and therefore two quarters apart, which is presumably why you're seeing this behavior. While I can't find it in the docs, the numbers given by xlim in this case are consistent with being the number of quarters since the Unix Epoch (Jan. 1, 1970).
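The quarter-count reading can be checked against pandas' own quarterly periods. This is a sketch, assuming the axis values are quarterly Period ordinals (which count from 1970Q1) and that the plot adds matplotlib's default 5% margin on each side; neither assumption is confirmed by the docs cited here.

```python
import pandas as pd

# Quarterly periods containing the first and last dates of the second example
first = pd.Period('2016-05-01', freq='Q')
last = pd.Period('2017-05-01', freq='Q')
print(first, first.ordinal)
print(last, last.ordinal)

# Assumed default axis margin of 5% of the data span on each side
margin = 0.05 * (last.ordinal - first.ordinal)
print(round(first.ordinal - margin, 1), round(last.ordinal + margin, 1))
```

With ordinals 185 and 189 and a margin of 0.2, this gives 184.8 and 189.2, the limits reported in the question.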
Source: https://stackoverflow.com/questions/50988126/inconsistent-internal-representation-of-dates-in-matplotlib-pandas