Question
The following sequence is an extract of the pandas DataFrame that I've got:
>>> df_t
value
2011-01-31 -5.575000
2011-03-31 7.700000
2011-05-31 15.966667
2011-07-31 10.683333
2011-08-31 10.454167
2011-10-31 9.320833
2011-12-31 -0.358333
2012-01-31 -11.550000
2012-03-31 1.700000
2012-05-31 12.333333
2012-07-31 12.816667
2012-08-31 11.837500
2012-10-31 2.733333
2012-12-31 4.075000
2013-01-31 2.450000
2013-03-31 -4.262500
2013-05-31 11.491667
2013-07-31 14.812500
2013-08-31 13.920833
2013-10-31 4.125000
2013-12-31 0.075000
How can I delete March 31st in every leap year? I tried something like:
def isleap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

if isleap(df_t.index.year):
    df_t = df_t[df_t.index.dayofyear != 91]
...but obviously, this was too straightforward in my head. Is the only solution to loop through the whole DataFrame and check at every step whether the year is a leap year and the date is the 91st day of the year, or is there an easier solution available?
EDIT: The issue is not how to determine whether a year is a leap year, but, if so, to delete March 31st in the above dataframe.
Answer 1:
Here is an example of doing that in a vectorized way. Note that the keywords 'and' and 'or' are not appropriate for a vector of booleans; use the element-wise operators '&' and '|' instead.
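To see why the keywords fail, here is a minimal sketch. The array name years and its sample values are made up for illustration; they are not from the question.

```python
import numpy as np

years = np.array([1900, 2000, 2011, 2012])  # made-up sample years

# Element-wise operators give one boolean per year (Gregorian leap-year rule)
is_leap = (years % 4 == 0) & ((years % 100 != 0) | (years % 400 == 0))

# `and` needs a single truth value for the whole array, which is ambiguous,
# so NumPy raises ValueError instead of guessing
try:
    (years % 4 == 0) and (years % 100 != 0)
    and_failed = False
except ValueError:
    and_failed = True
```

The attempt in the question hits the same problem: isleap(df_t.index.year) applies 'and' and 'or' to a whole index of years rather than to a single year.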
import pandas as pd
import numpy as np
s = pd.Series(np.random.randn(600), index=pd.date_range('1990-01-01', periods=600, freq='M'))
s
Out[76]:
1990-01-31 -0.7594
1990-02-28 -0.1311
1990-03-31 1.2031
1990-04-30 1.1999
1990-05-31 -2.4399
...
2039-08-31 -0.3554
2039-09-30 -0.3265
2039-10-31 -0.3832
2039-11-30 -1.4139
2039-12-31 -0.3086
Freq: M, dtype: float64
def is_leap_and_MarchEnd(s):
    # True where the index date is March 31 of a leap year
    return (s.index.year % 4 == 0) & ((s.index.year % 100 != 0) | (s.index.year % 400 == 0)) & (s.index.month == 3) & (s.index.day == 31)
mask = is_leap_and_MarchEnd(s)
s[mask]
Out[77]:
1992-03-31 0.7834
1996-03-31 0.3121
2000-03-31 -1.2050
2004-03-31 0.6017
2008-03-31 0.1045
...
2020-03-31 1.1037
2024-03-31 0.5139
2028-03-31 -0.8116
2032-03-31 -0.6939
2036-03-31 -1.1999
dtype: float64
# delete these rows by keeping everything outside the mask
s[~mask]
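The same idea carries over to the DataFrame from the question. The following is a minimal sketch rather than part of the original answer: it assumes a pandas version that provides the DatetimeIndex.is_leap_year property, rebuilds only a few rows of df_t by hand, and introduces the name df_clean for illustration.

```python
import pandas as pd

# A few rows copied from the question's df_t (not the full frame)
df_t = pd.DataFrame(
    {"value": [-5.575, 7.7, -11.55, 1.7, 12.333333, 2.45, -4.2625]},
    index=pd.to_datetime([
        "2011-01-31", "2011-03-31", "2012-01-31", "2012-03-31",
        "2012-05-31", "2013-01-31", "2013-03-31",
    ]),
)

idx = df_t.index
# Element-wise mask: March 31 that falls in a leap year
mask = idx.is_leap_year & (idx.month == 3) & (idx.day == 31)

# Keep every row outside the mask (df_clean is a hypothetical name)
df_clean = df_t[~mask]
```

Testing month and day explicitly is safer than filtering on dayofyear != 91 alone, because day 91 is March 31 only in leap years; in other years it is April 1.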
Source: https://stackoverflow.com/questions/30997007/pandas-dataframe-delete-specific-date-in-all-leap-years