I have two numpy arrays light_points and time_points and would like to use some time series analysis methods on those data.
I then tried this:

```python
import statsmodels.api as sm
```
seasonal_decompose() requires a freq that is either provided as part of the DatetimeIndex meta information, can be inferred by pandas.Index.inferred_freq, or else supplied by the user as an integer that gives the number of periods per cycle, e.g., 12 for monthly (from the docstring for seasonal_mean):
```python
def seasonal_decompose(x, model="additive", filt=None, freq=None):
    """
    Parameters
    ----------
    x : array-like
        Time series
    model : str {"additive", "multiplicative"}
        Type of seasonal component. Abbreviations are accepted.
    filt : array-like
        The filter coefficients for filtering out the seasonal component.
        The default is a symmetric moving average.
    freq : int, optional
        Frequency of the series. Must be used if x is not a pandas
        object with a timeseries index.
    """
```
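In other words: either decompose a pandas object whose DatetimeIndex carries (or lets pandas infer) a frequency, or pass freq yourself. A minimal sketch of the three paths - the weekly toy series here is illustrative, and note that newer statsmodels releases rename the freq argument to period:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# a regular weekly series, two years long
s = pd.Series(np.sin(np.arange(104)) * 10,
              index=pd.date_range('2015-01-01', periods=104, freq='W'))

print(s.index.freq)           # <Week: weekday=6> - freq stored on the index
print(s.index.inferred_freq)  # 'W-SUN'           - inferred from the timestamps

# a plain numpy array has no index, so freq must be given explicitly
decomp = sm.tsa.seasonal_decompose(s.values, freq=52)  # 52 periods per cycle
```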
To illustrate - using random sample data:
```python
from datetime import datetime

length = 400
x = np.sin(np.arange(length)) * 10 + np.random.randn(length)
df = pd.DataFrame(data=x,
                  index=pd.date_range(start=datetime(2015, 1, 1),
                                      periods=length, freq='W'),
                  columns=['value'])
```
```
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 400 entries, 2015-01-04 to 2022-08-28
Freq: W-SUN
```
```python
decomp = sm.tsa.seasonal_decompose(df)
data = pd.concat([df, decomp.trend, decomp.seasonal, decomp.resid], axis=1)
data.columns = ['series', 'trend', 'seasonal', 'resid']
```
```
Data columns (total 4 columns):
series      400 non-null float64
trend       348 non-null float64
seasonal    400 non-null float64
resid       348 non-null float64
dtypes: float64(4)
memory usage: 15.6 KB
```
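The shorter trend and resid columns are expected: the default symmetric moving average needs freq/2 = 26 observations on each side, so the first and last 26 trend (and hence residual) values are NaN. In recent statsmodels releases the returned DecomposeResult also has a plot() method for a quick visual check:

```python
import matplotlib.pyplot as plt

print(data['trend'].isnull().sum())  # 52 = 2 * 26 edge values lost to the filter

decomp.plot()  # stacked panels: observed, trend, seasonal, residual
plt.show()
```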
So far, so good - now randomly dropping elements from the DatetimeIndex to create unevenly spaced data:
```python
df = df.iloc[np.unique(np.random.randint(low=0, high=length, size=int(length * .8)))]
```
```
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 222 entries, 2015-01-11 to 2022-08-21
Data columns (total 1 columns):
value    222 non-null float64
dtypes: float64(1)
memory usage: 3.5 KB
```
```python
df.index.freq           # None
df.index.inferred_freq  # None
```
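At this point seasonal_decompose() has no way to determine the frequency on its own, so without a freq argument it raises an error - a sketch below; the exact exception message differs between statsmodels versions:

```python
try:
    sm.tsa.seasonal_decompose(df)
except ValueError as err:
    # e.g. "you must specify a freq or x must be a pandas object
    # with a timeseries index"
    print(err)
```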
Running seasonal_decompose() on this data 'works':
```python
decomp = sm.tsa.seasonal_decompose(df, freq=52)
data = pd.concat([df, decomp.trend, decomp.seasonal, decomp.resid], axis=1)
data.columns = ['series', 'trend', 'seasonal', 'resid']
```
```
DatetimeIndex: 224 entries, 2015-01-04 to 2022-08-07
Data columns (total 4 columns):
series      224 non-null float64
trend       172 non-null float64
seasonal    224 non-null float64
resid       172 non-null float64
dtypes: float64(4)
memory usage: 8.8 KB
```
The question is how useful the result is. Even without gaps in the data that complicate the inference of seasonal patterns (see the example use of .interpolate() in the release notes), statsmodels qualifies this procedure as follows:
```
Notes
-----
This is a naive decomposition. More sophisticated methods should be preferred.

The additive model is Y[t] = T[t] + S[t] + e[t]

The multiplicative model is Y[t] = T[t] * S[t] * e[t]

The seasonal component is first removed by applying a convolution filter
to the data. The average of this smoothed series for each period is the
returned seasonal component.
```
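If you nevertheless want an evenly spaced series to decompose, one workaround in the spirit of the .interpolate() example referenced above is to resample back onto the weekly grid and fill the gaps. This is only a sketch - 'W' and method='time' are illustrative choices, and the interpolated values are fabricated data, which is exactly the kind of caveat the Notes warn about:

```python
# restore the regular weekly grid, then interpolate the missing weeks
df_regular = df.resample('W').mean().interpolate(method='time')

# the resampled index carries a freq again, so none needs to be passed
decomp = sm.tsa.seasonal_decompose(df_regular)
```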