What is the replace for softmax layer in case more than one output can be activated?

Submitted by 强颜欢笑 on 2019-12-23 12:32:53

Question


For example, I have a CNN that tries to predict digits from the MNIST dataset (code written using Keras). It has 10 outputs, which form a softmax layer. Only one of the outputs can be true (one output per digit from 0 to 9):

Real: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
Predicted: [0.02, 0.9, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

The predicted values sum to 1.0 by definition of softmax.
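As a minimal sketch of that property (the logits here are made up for illustration), softmax exponentiates the raw outputs and divides by their total, so the result always sums to 1:

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the result.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([0.5, 3.0, 0.1, 0.2, 0.0, 0.1, 0.3, 0.2, 0.1, 0.0])
probs = softmax(logits)
print(round(probs.sum(), 6))  # 1.0
```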

Let's say I have a task where I need to classify some objects that can fall into several categories at once:

Real: [0, 1, 0, 1, 0, 1, 0, 0, 0, 1]

So I need to normalize in some other way. I need a function that maps each output to the range [0, 1] and whose sum can be larger than 1.

I need something like this:

Predicted: [0.1, 0.9, 0.05, 0.9, 0.01, 0.8, 0.1, 0.01, 0.2, 0.9]

Each number is the probability that the object falls into the given category. After that I can use a threshold like 0.5 to decide which categories the object belongs to.
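An element-wise sigmoid has exactly these properties: each output lands in [0, 1] independently, so the sum is free to exceed 1. A small illustration (the logits are hypothetical, chosen so that thresholding reproduces the "Real" vector above):

```python
import numpy as np

def sigmoid(z):
    # Applied element-wise: each output is an independent probability
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([-2.2, 2.2, -3.0, 2.2, -4.6, 1.4, -2.2, -4.6, -1.4, 2.2])
probs = sigmoid(logits)
print(probs.sum() > 1.0)  # True: the outputs are not constrained to sum to 1
labels = (probs >= 0.5).astype(int)
print(labels.tolist())  # [0, 1, 0, 1, 0, 1, 0, 0, 0, 1]
```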

The following questions appear:

  1. Which activation function can be used for this?
  2. Does such a function already exist in Keras?
  3. Can you propose some other way to make predictions in this case?

Answer 1:


Your problem is one of multi-label classification, and in the context of Keras it is discussed, for example, here: https://github.com/fchollet/keras/issues/741

In short, the suggested solution in Keras is to replace the softmax layer with a sigmoid layer and use binary_crossentropy as the loss function.

An example from that thread (updated here to the current Keras API, with imports added; `xt`/`yt` are the training data, `xs`/`ys` the validation data, and `W` the class weights, as in the thread):

# Build a classifier optimized for maximizing f1_score (uses class_weights)
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from sklearn.metrics import f1_score

clf = Sequential()
clf.add(Dropout(0.3, input_shape=(xt.shape[1],)))
clf.add(Dense(1600, activation='relu'))
clf.add(Dropout(0.6))
clf.add(Dense(1200, activation='relu'))
clf.add(Dropout(0.6))
clf.add(Dense(800, activation='relu'))
clf.add(Dropout(0.6))
# Sigmoid output layer: one independent probability per label
clf.add(Dense(yt.shape[1], activation='sigmoid'))

clf.compile(optimizer=Adam(), loss='binary_crossentropy')

clf.fit(xt, yt, batch_size=64, epochs=300,
        validation_data=(xs, ys), class_weight=W, verbose=0)

# Threshold each independent probability at 0.5 to get binary labels
preds = clf.predict(xs)
preds[preds >= 0.5] = 1
preds[preds < 0.5] = 0

print(f1_score(ys, preds, average='macro'))


Source: https://stackoverflow.com/questions/41589126/what-is-the-replace-for-softmax-layer-in-case-more-than-one-output-can-be-activa
