How to perform multiclass multioutput classification using LSTM

Question

I have a multiclass multioutput classification problem (see https://scikit-learn.org/stable/modules/multiclass.html for details). In other words, my dataset looks as follows.

node_name, timeseries_1, timeseries_2, label_1, label_2
node1, [1.2, ...], [1.8, ...], 0, 2
node2, [1.0, ...], [1.1, ...], 1, 1
node3, [1.9, ...], [1.2, ...], 0, 3 
...
...
...

So, my label_1 could be either 0 or 1, whereas my label_2 could be either 0, 1, or 2.

My current code is as follows.

def create_network():
    model = Sequential()
    model.add(LSTM(200, input_shape=(16,2)))
    model.add(Dense(100))
    model.add(Dropout(0.2))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    return model

neural_network = KerasClassifier(build_fn=create_network, epochs=100, batch_size=100, verbose=0)

k_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

scores = cross_validate(neural_network, my_features, label_data_encoded, cv=k_fold, scoring = ('accuracy', 'precision_weighted', 'recall_weighted', 'f1_weighted', 'roc_auc'))

My questions are as follows.

Since I have two labels (i.e. label_1 and label_2), how do I fit these labels to the LSTM model? Do I have to do something like keras.utils.to_categorical(label_1, 2) and keras.utils.to_categorical(label_2, 3)?
How should I change the model to make it suitable for multiclass multioutput classification?

I am happy to provide more details if needed.

Answer 1:

If I understand correctly, label_1 is binary, whereas label_2 is a multiclass problem, so the model needs two outputs with separate loss functions: binary crossentropy for label_1 and categorical crossentropy for label_2.

However, the Sequential API does not support multiple inputs or outputs.

The Sequential API allows you to create models layer-by-layer for most problems. It is limited in that it does not allow you to create models that share layers or have multiple inputs or outputs.

You can use the functional API to create two output layers, and compile the model with required loss functions.

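from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical

inputs = Input(shape=(16, 2))              # input shape taken from the question
x = LSTM(200)(inputs)                      # shared layer(s), e.g. the LSTM from the question
# ... add further shared layers here ...
out1 = Dense(1, activation='sigmoid')(x)   # label_1: binary
out2 = Dense(3, activation='softmax')(x)   # label_2: three classes
model = Model(inputs=inputs, outputs=[out1, out2])
# l1 and l2 are the loss weights you choose for the two outputs
model.compile(loss=['binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[l1, l2], optimizer='adam')

label_2_categorical = to_categorical(label_2, 3)   # one-hot encode label_2
model.fit(my_features, [label_1, label_2_categorical])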
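
To make the target format concrete, here is a minimal sketch of how the two label columns could be prepared. The label values are made up for illustration, and one_hot is a hypothetical pure-Python stand-in for keras.utils.to_categorical; in a real pipeline you would likely call to_categorical and pass NumPy arrays to fit.

```python
def one_hot(labels, num_classes):
    """Pure-Python stand-in (hypothetical helper) for keras.utils.to_categorical."""
    return [[1.0 if c == label else 0.0 for c in range(num_classes)]
            for label in labels]

# Illustrative labels for three samples (not taken from the real dataset).
label_1 = [0, 1, 0]   # binary: passed as-is to the single sigmoid unit
label_2 = [2, 1, 0]   # three classes: one-hot encoded for the softmax output

label_2_categorical = one_hot(label_2, 3)

# One entry per model output, in the same order as the outputs list of the model.
targets = [label_1, label_2_categorical]
print(targets)
```

With this layout, label_1 does not need to be one-hot encoded, because the first output is a single sigmoid unit trained with binary crossentropy.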

The network minimizes the weighted sum of the two losses, with weights l1 and l2.
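
As a worked example of that weighted sum, the sketch below computes both losses by hand for a single sample. The predicted probabilities and the weights are invented for illustration, and the formulas are the textbook definitions; Keras additionally clips probabilities and averages over the batch, which is left out here.

```python
import math

def binary_crossentropy(y_true, p):
    """Loss for one sample of the sigmoid output (label_1)."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def categorical_crossentropy(y_onehot, probs):
    """Loss for one sample of the softmax output (label_2)."""
    return -sum(t * math.log(p) for t, p in zip(y_onehot, probs))

# Hypothetical predictions for one sample with label_1 = 0 and label_2 = 2.
loss_1 = binary_crossentropy(0, 0.1)                            # equals -log(0.9)
loss_2 = categorical_crossentropy([0, 0, 1], [0.2, 0.3, 0.5])   # equals -log(0.5)

l1, l2 = 1.0, 0.5   # illustrative loss weights, not recommended values
total_loss = l1 * loss_1 + l2 * loss_2
print(total_loss)
```

Raising l2 relative to l1 makes mistakes on label_2 count more toward the quantity being minimized, which is one way to balance the two tasks.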

Hope this helps :)

Answer 2:

This is a somewhat complicated problem, since the Scikit-Learn API and Keras API for multiclass multi-output are not directly compatible. Further, there are even differences in how TensorFlow v1 and v2 handle things. The existing Keras wrappers don't really work for more complex cases.

I created an extension of KerasClassifier that can handle these situations; the package and documentation are here (GitHub). Full disclosure: I am the creator of the package, but I have no financial interest in it; it is open source.

With these extended versions, you can easily handle multiclass multi-output problems. I think it should work out of the box for your situation; if not, you can inherit from KerasClassifier and override _pre_process_y and _post_process_y to transform from the Scikit-Learn data format to whatever your Keras Model uses. More details here (docs).
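
To illustrate the kind of format conversion involved, here is a rough sketch. Scikit-Learn represents multi-output targets as one row per sample with one column per output, whereas a multi-output Keras model expects one array per output. The function names split_outputs and merge_outputs are hypothetical and are not the package's API; the actual _pre_process_y and _post_process_y hooks may have different signatures.

```python
def split_outputs(y):
    """Scikit-Learn style rows [label_1, label_2] -> one list per Keras output."""
    return [list(column) for column in zip(*y)]

def merge_outputs(per_output):
    """Inverse: one list per Keras output -> Scikit-Learn style rows."""
    return [list(row) for row in zip(*per_output)]

# Illustrative targets for three samples: columns are label_1 and label_2.
y_sklearn = [[0, 2], [1, 1], [0, 0]]

y_keras = split_outputs(y_sklearn)      # what a pre-processing step might produce
y_restored = merge_outputs(y_keras)     # what a post-processing step might return
print(y_keras, y_restored)
```

A real implementation would also have to one-hot encode the multiclass column before training and turn predicted probabilities back into class labels afterwards.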

Hope this helps!

Source: https://stackoverflow.com/questions/62077273/how-to-perform-multiclass-multioutput-classification-using-lstm

Tags

python

keras

scikit-learn

classification

lstm