问题
I have time series, and I applying some user defined function to every W elements in time series.
Right now I am just using for loop, slide window of size W an apply my function to elements in a window at every iteration.
I am using Matlab and it is very inefficient with a "for loops" so I would love to vectorize this operation.
As a solution I see transforming signal with length N to a matrix with size (N - 1, W) where each row is time series in different windows and applying function to this matrix.
So, my questions are:
- How to transform my initial time series to such a matrix?
- Let's say I am sliding window with step X. So that not (N - 1, W) matrix will appear, but ((N - 1) / X, W). (Every Xth row of the matrix in [1])
Example:
Let's say my time series is:
T = [1, 5, 6, 8, 10, 14, 22]
W = 3
X = 1
=> I would love to get
[[1, 5, 6],
[5, 6, 8],
[6, 8, 10],
[8, 10, 14],
[10, 14, 22]]
If
W = 3
X = 2
=> I would love to get
[[1, 5, 6],
[6, 8, 10],
[10, 14, 22]]
回答1:
Creating the right indices with bsxfun should most certainly help:
ind = bsxfun(@plus, 1:W, (0:X:numel(T)-W).');
out = T(ind);
Creating the right indices is the first step, delineated by the first line of code. What this code does is that it creates a 2D matrix where each row are the elements to access per window of interest. If you want to gain intuition on how the code generates the indices, look specifically at the first case where X = 1;
and W = 3;
.
We can see that the first row consists of accessing elements 1, 2, 3. The second row consists of accessing elements 2, 3, 4... up until the last row, which is 5, 6, 7. We can see that we have to access neighbouring elements in a window, and so the base indices need to go from 1, 2, 3, or in general from 1 to W
. We now need to offset these indices so that they are centred at the right elements in T
per window. The offset for the first window is simply 0, the next offset for the second window is simply 1 up until the last row which is 3. We see that for each row, we add 1 more to the base indices as the rows increase. Therefore, we add 1 to each base index for the second row, then 2 for each base index in the third row and so on. If you add the base indices with the offset indices, you thus finally get the correct indices to access the right elements in T
.
Similarly if X = 2;
and W = 3;
, we see that we still have base indices of 1, 2, 3. However, the right elements to access now are 1, 2, 3 for the first row, then 3, 4, 5 for the second row then 5, 6, 7 for the third row. For each row, we now offset the base indices by 2 instead of 1 now. Therefore the second row we add 2 to each base index, then we add 4 to each base index for the third row and so on.
In general, the base indices are created using a vector 1:W
and the offset indices are created using a vector 0:X:numel(T)-W
. The subtraction of W
is required so that we don't go out of bounds when sampling the signal as per the requirement. To create these indices that we just talked about, bsxfun
handles this for us.
We create a row vector of 1:W
which corresponds to the base indices and a column vector of (0:X:numel(T)-W).'
which corresponds to the offsets per window. Note that the first offset starts at 0, then we increment by X
amount to ensure that the correct centre is calculated to place our base indices at. We stop until we hit numel(T)-W
elements, which is the condition you have stated. By using bsxfun
, two temporary 2D matrices are created where the row vector is duplicated for as many rows as there are rows as there are in the column vector and the column vector is duplicated for as many columns as there are in the row vector. Once you add these two matrices together, you get the resulting index matrix.
Running the code with W = 3;
and X = 1;
gives:
>> T = [1, 5, 6, 8, 10, 14, 22];
>> X = 1;
>> W = 3;
>> ind = bsxfun(@plus, 1:W, (0:X:numel(T)-W).')
ind =
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
Similarly if W = 3;
and X = 2;
we also get:
>> T = [1, 5, 6, 8, 10, 14, 22];
>> X = 2;
>> W = 3;
>> ind = bsxfun(@plus, 1:W, (0:X:numel(T)-W).')
ind =
1 2 3
3 4 5
5 6 7
You can verify for yourself that these indices correspond to the correct elements in T
to create your desired matrix in this case.
We finally use this to index into our matrix to grab the right elements:
out = T(ind);
Doing this for X = 1;
and W = 3;
gives:
>> out = T(ind)
out =
1 5 6
5 6 8
6 8 10
8 10 14
10 14 22
Similarly for X = 2;
and W = 3;
gives:
>> out = T(ind)
out =
1 5 6
6 8 10
10 14 22
回答2:
Based on rayryeng's answer I wrote a function that does exactly this, plus some additional functionality. It's designed for generating indices for autoregression on a univariate time series. It can easily be used for the multivariate case by simply using the same indices and concatenating the referenced data.
It returns the indices to the predictor variable X (as per your request) and for the regressors y as well. In addition you have the option of applying a 'mask' to the predictor variables X while sliding the window. For example, with a window of 21 steps, you can select [T-2 T-3 T-5 T-8 T-13 T-21] for X and T for y
You can also change the prediction horizon - how many steps into the future the indices for y are. For example X = [T-1 T-2] and y = T+2
Hopefully someone else will find this useful.
% get_Sliding_Indexes:
% Useful for autoregression on a univariate time series.
% Returns the indexes for the predictor and response variables
% according to a sliding window.
%
% Copyright (C) 20016 Florin Schimbinschi
%
% Parameters:
% numRecords - the number of records in the dataset
%
% windowLag - number of past samples to take - it will be equal to
% the size of the predictor vector X. Default 10
%
% predHorizon - the prediction horizon is the number of steps into
% the future that predictions are to be made. Default 1
%
% windowPattern - by default the window will take all consecutive
% values in the past over the window lag size, however it is
% possible to sample using a custom pattern.
% For example taking every second value can be done by setting
% this parameter to 1:2:5. Default 1:windowLag
%
% stepSize - number of steps taken when window is moved. Default 1
%
% Returns:
% indX - predictor variable indexes
% indY - response variable indexes
%
%
% windowPattern = 1:2:9 __ structure between [] is moved to
% / \ / the right by stepSize units
% >------[(9-7-5-3-1)---(y)]--------------->
% \_______/ \_/
% X = [13579] predHorizon = 3
%
%
% Example on a multivariate time series (two) with 6 records:
%
% data2d = [ .1 .2 .3 .4 .5 .6
% .11 .22 .33 .44 .55 .66]';
%
% [X, y] = getSlidingIndexes(size(data2d,1), 4)
% X =
% 1 2 3 4
% 2 3 4 5
% y =
% 5
% 6
%
% Assuming we are interested in the second series (column):
%
% series2 = data2d(:,2);
%
% series2(X)
% ans =
% 0.1100 0.2200 0.3300 0.4400
% 0.2100 0.3300 0.4400 0.5500
%
% series2(y)
% ans =
% 0.5500
% 0.6600
%
function [indX, indY] = get_Sliding_Indexes(numRecords, ...
windowLag, predHorizon, windowPattern, stepSize)
if ~exist('numRecords','var') || isempty(numRecords)
error('The number of records in the dataset is not specified');
end
if ~exist('stepSize','var') || isempty(stepSize)
stepSize = 1; % steps taken when moving the window
end
if ~exist('predHorizon','var') || isempty(predHorizon)
predHorizon = 1; % aiming to predict this many steps in the future
end
if ~exist('windowLag','var') || isempty(windowLag)
windowLag = 10; % number of time steps to look back
end
if exist('windowLag','var') && (windowLag > numRecords)
error('The size of the window is larger than the number of observations');
end
if ~exist('windowPattern','var') || isempty(windowPattern)
windowPattern = 1:windowLag; % pattern of sampling data
end
if exist('windowPattern','var') && windowPattern(end) > windowLag
error('The window pattern must stop at the window lag specified');
end
% the number of samples in the window
maxSample = max(windowPattern);
indX = bsxfun(@plus, windowPattern, ...
(0:stepSize:(numRecords - maxSample - predHorizon))');
indY = bsxfun(@plus, max(windowPattern) + predHorizon, ...
(0:stepSize:(numRecords - maxSample - predHorizon))');
end
You can also find the code here: https://au.mathworks.com/matlabcentral/fileexchange/58730-get-sliding-indexes-numrecords--windowlag--predhorizon--windowpattern--stepsize-
来源:https://stackoverflow.com/questions/36800665/matrix-with-sliding-window-elements