Only predict a single time-series-sample within a batch using predict() or predict_on_batch()

烂漫一生 posted on 2020-01-15 15:26:03

Question


In Stack Overflow questions like this one I read about the batch_size parameter of Keras' predict() method, and in questions like this one about the difference between predict() and predict_on_batch().

However, none of them answered my question. I understand the concept of batch_size and that I can predict a single batch with predict_on_batch(). What I want to achieve is to predict a single sample out of a batch that holds multiple samples. The prediction in this case is a time-series prediction.

An example

Let's assume I've got batches of shape (2, 5, 1) for the samples as well as for the targets. These batches are fed to a model like this:

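from keras.models import Sequential
from keras.layers import GRU

sequence_size = 5
num_features  = 1                               # same name as used by the generator below
input_shape   = (sequence_size, num_features)   # renamed from `input`, which shadows the Python builtin
batch_size    = 2

model = Sequential()
# Only the first layer needs the input shape and the fixed batch size
model.add(GRU(100, return_sequences=True, activation='relu', input_shape=input_shape, batch_size=batch_size, name="GRU"))
model.add(GRU(1, return_sequences=True, activation='relu', name="GRU2"))
model.compile(optimizer='adam', loss='mse')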

model.summary()

#Summary-output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
GRU (GRU)                    (2, 5, 100)               30600     
_________________________________________________________________
GRU2 (GRU)                   (2, 5, 1)                 306       
=================================================================
Total params: 30,906
Trainable params: 30,906
Non-trainable params: 0
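
The parameter counts in the summary can be checked by hand. The following is a small sketch, assuming the classic GRU formulation with three gates and one bias vector per gate (the helper name `gru_param_count` and its `biases_per_gate` argument are made up for this illustration and are not part of Keras):

```python
def gru_param_count(units, input_dim, biases_per_gate=1):
    """Trainable parameters of one GRU layer.

    Each of the 3 gates (update, reset, candidate) has an input kernel
    (input_dim x units), a recurrent kernel (units x units) and
    `biases_per_gate` bias vectors of length `units`.
    """
    per_gate = units * input_dim + units * units + biases_per_gate * units
    return 3 * per_gate

first = gru_param_count(units=100, input_dim=1)    # layer "GRU": 1 input feature
second = gru_param_count(units=1, input_dim=100)   # layer "GRU2": fed by 100 GRU outputs
print(first, second, first + second)               # 30600 306 30906
```

Note that the batch size and the sequence length do not appear in the formula: the fixed batch size of 2 only constrains the input shape, not the number of weights. GRU variants that keep two bias vectors per gate would yield slightly larger counts (`biases_per_gate=2` in this sketch).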

When I fit this model and predict on input that has the shape of a single batch, (2, 5, 1), predict() works.

import numpy as np

def generator(data, batch_size, sequence_size, num_features):
    """Yield (samples, targets) batches; the targets are the samples shifted by one sequence."""
    while True:
        for i in range(len(data) - (sequence_size * batch_size + sequence_size) + 1):
            start = i
            end   = i + (sequence_size * batch_size)

            yield data[start : end].reshape(batch_size, sequence_size, num_features), \
                    data[end - ((sequence_size * batch_size) - sequence_size) : end + sequence_size].reshape(batch_size, sequence_size, num_features)

#Task: Predict the continuation of a linear range
data = np.arange(100)
#total_batches was not defined in the original post; assumed here to be one full pass over the generator
total_batches = len(data) - (sequence_size * batch_size + sequence_size) + 1
hist = model.fit_generator(
                generator=generator(data, batch_size, sequence_size, num_features),  #stray fifth argument removed
                steps_per_epoch=total_batches,
                epochs=200,
                shuffle=False
            )


to_predict = np.asarray([[np.asarray([x]) for x in range(105-sequence_size*batch_size,105,1)]]).reshape(batch_size, sequence_size, num_features)
correct_result    = np.asarray([100,101,102,103,104])
print( model.predict(to_predict).flatten()[0:sequence_size] )

#Prediction output: something close to what I want (correct_result)
[ 99.92908 100.95854 102.32129 103.28584 104.20213 ]
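
The nested construction of to_predict above is easier to follow when compared with a direct reshape. A minimal NumPy-only sketch (the names `nested` and `direct` are only used for this illustration):

```python
import numpy as np

sequence_size, batch_size, num_features = 5, 2, 1

# Construction used in the question: nested lists, then reshape.
nested = np.asarray(
    [[np.asarray([x]) for x in range(105 - sequence_size * batch_size, 105, 1)]]
).reshape(batch_size, sequence_size, num_features)

# Shorter equivalent: a contiguous range reshaped directly.
direct = np.arange(95, 105).reshape(batch_size, sequence_size, num_features)

print(nested.shape)                    # (2, 5, 1)
print(np.array_equal(nested, direct))  # True
print(direct[0].ravel())               # [95 96 97 98 99]
print(direct[1].ravel())               # [100 101 102 103 104]
```

The batch therefore holds the two sequences 95..99 and 100..104. Since each target is its sample shifted by one sequence, the first five flattened prediction values belong to the first sequence and should approximate 100..104.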

For better understanding, the generator output per batch looks like this:

(array([[ [0],[1],[2],[3],[4] ], [ [5],[6],[7],[8],[9] ]]),   #Sample of shape (2, 5, 1), meaning two sequences of length 5 with one feature
 array([[[5],[6],[7],[8],[9]], [[10],[11],[12],[13],[14]]]))  #Target of shape (2, 5, 1), meaning two sequences of length 5 with one feature
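
The batch layout shown above can be reproduced without a model. This sketch repeats the generator's windowing logic from the question (with the slices pulled into the local names `samples` and `targets` for readability) and pulls the first batch:

```python
import numpy as np

def generator(data, batch_size, sequence_size, num_features):
    """Same windowing as the generator in the question."""
    while True:
        for i in range(len(data) - (sequence_size * batch_size + sequence_size) + 1):
            start = i
            end = i + sequence_size * batch_size
            samples = data[start:end]
            # Targets start one sequence after the samples and have the same length.
            targets = data[end - (sequence_size * batch_size - sequence_size):end + sequence_size]
            yield (samples.reshape(batch_size, sequence_size, num_features),
                   targets.reshape(batch_size, sequence_size, num_features))

data = np.arange(100)
samples, targets = next(generator(data, batch_size=2, sequence_size=5, num_features=1))

print(samples.shape, targets.shape)  # (2, 5, 1) (2, 5, 1)
print(samples[:, :, 0])              # rows 0..4 and 5..9
print(targets[:, :, 0])              # rows 5..9 and 10..14
```

Each target row is the corresponding sample row shifted forward by sequence_size steps, which is why the second sample of a batch equals the first target.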

But what I want is to predict on an input of shape (1, 5, 1). The reason: a single batch contains two sequences (hence two samples) of 5 time steps each. When I provide a shape of (2, 5, 1), I have to supply two sequences for a prediction, and predict() also returns the predictions for the following two sequences (since the training target was also of shape (2, 5, 1)).

What I can do so far is pass the single sequence to be predicted (of shape (1, 5, 1)) to predict_on_batch() and keep only the first result. predict_on_batch() still returns a shape of (2, 5, 1), but both samples in it hold the same predicted values.

to_predict = np.asarray([[np.asarray([x]) for x in range(95,100,1)]])
correct    = np.asarray([100,101,102,103,104])
print( model.predict_on_batch(to_predict)[0].flatten() )

#Output:
[ 99.92908 100.95854 102.32129 103.28584 104.20213 ]

As can be seen, the output is the same as with predict(), although I only passed in a single sample of shape (1, 5, 1) instead of a full batch. Additionally, predict_on_batch() did not throw any error. Since the results are the same, there seems to be no reason to provide an entire batch (in my example a batch consists of only two samples, but in real tasks it can of course be larger).
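
The workaround described above can be wrapped in a small helper that fills the batch explicitly instead of relying on what predict_on_batch() does with a short batch. This is only a sketch: `predict_single` is a hypothetical helper, and `fake_predict_on_batch` is a hypothetical stand-in for a trained model so that the example runs without Keras. With a real model one would pass model.predict_on_batch instead.

```python
import numpy as np

def predict_single(predict_fn, sample, batch_size):
    """Predict for one sample with a model that expects `batch_size` samples.

    `predict_fn` is any callable mapping (batch_size, steps, features)
    to (batch_size, steps, features), e.g. model.predict_on_batch.
    """
    sample = np.asarray(sample)
    if sample.ndim == 2:                           # (steps, features) -> (1, steps, features)
        sample = sample[np.newaxis, ...]
    batch = np.repeat(sample, batch_size, axis=0)  # fill the batch with copies of the sample
    return predict_fn(batch)[:1]                   # keep one prediction: (1, steps, features)

def fake_predict_on_batch(batch):
    """Hypothetical stand-in for a trained model: ideal continuation of a linear range."""
    return batch + batch.shape[1]

to_predict = np.arange(95, 100).reshape(1, 5, 1)
result = predict_single(fake_predict_on_batch, to_predict, batch_size=2)
print(result.shape)    # (1, 5, 1)
print(result.ravel())  # [100 101 102 103 104]
```

This is still a workaround in the sense of the question: the model computes batch_size predictions and all but one are discarded.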


Thus, I have a workaround for my problem. But I have the feeling that I'm missing a method that allows this without workarounds. So my question is:


How can I predict a single sample from a single batch, such that the prediction also outputs only a single prediction?


Edit:
I know about this tutorial on machinelearningmastery.com. Anyhow, this also just provides workarounds (I also consider copying weights to a new network a workaround).

Source: https://stackoverflow.com/questions/56464945/only-predict-a-single-time-series-sample-within-a-batch-using-predict-or-predi
