I am not sure I understand the parameter min_periods in Pandas rolling functions: why does it have to be smaller than the window parameter?
The min_periods = n option simply means that you require at least n valid (non-missing) observations in the window to compute your rolling statistic.
For example, suppose min_periods = 5 and you have a rolling mean over the last 10 observations. Now, what happens if 6 of the last 10 observations are actually missing values? Then only 4 non-missing values remain, and since 4 < 5 (you require at least 5 non-missing observations), the rolling mean will be missing as well.
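To make that concrete, here is a minimal sketch of the scenario above. The series values are made up for illustration; only the pattern matters (10 observations, 6 of them NaN).

```python
import numpy as np
import pandas as pd

# Hypothetical data: 10 observations, 6 of which are missing.
s = pd.Series([1.0, np.nan, np.nan, 2.0, np.nan,
               np.nan, 3.0, np.nan, np.nan, 4.0])

# Require at least 5 valid values in each 10-observation window.
strict = s.rolling(window=10, min_periods=5).mean()

# Require only 4 valid values in the same window.
relaxed = s.rolling(window=10, min_periods=4).mean()

# The last window covers all 10 rows but holds only 4 valid values.
print(strict.iloc[-1])   # NaN, because 4 < 5
print(relaxed.iloc[-1])  # 2.5, the mean of 1, 2, 3 and 4
```

With min_periods=5 the last value is missing; lowering it to 4 gives the mean of the four valid observations.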
It's a very, very important option.
From the documentation:
min_periods : int, default None Minimum number of observations in window required to have a value (otherwise result is NA).
The min_periods argument is just a way to apply the function to a smaller sample than the full rolling window. Let's say you want the rolling minimum with a window of 10: passing min_periods = 5 lets pandas calculate the min of the first 5 data points, then the first 6, then 7, 8, 9 and finally 10 (with fewer than 5 points the result is NaN). Once at least 10 data points are available, pandas can roll its full 10-point window, and the window size stays at 10 from then on.
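A small sketch of that behaviour, assuming a made-up series of 12 increasing values so the effect of the window is easy to see. The second series only counts how many observations each window actually used.

```python
import pandas as pd

# Hypothetical data: 1.0, 2.0, ..., 12.0
s = pd.Series(range(1, 13), dtype="float64")

# Rolling minimum over 10 points, but start producing values after 5.
rolled = s.rolling(window=10, min_periods=5).min()

# Number of observations in each window (rolling sum of ones).
n_obs = s.notna().astype("float64").rolling(window=10, min_periods=5).sum()

print(pd.DataFrame({"value": s, "rolling_min": rolled, "n_obs": n_obs}))
```

The first four rows are NaN because fewer than 5 points are available. From the fifth row the window grows from 5 to 10 points and the minimum stays at 1.0; in the last two rows the window holds 10 points and slides forward, so the minimum moves to 2.0 and then 3.0.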