How do I determine the binary class predicted by a convolutional neural network on Keras?

问题

I'm building a CNN to perform sentiment analysis on Keras. Everything is working perfectly, the model is trained and ready to be launched to production.

However, when I try to predict on new unlabelled data by using the method model.predict() it only outputs the associated probability. I tried to use the method np.argmax() but it always outputs 0 even when it should be 1 (on test set, my model achieved 80% of accuracy).

Here is my code to pre-process the data:

# Pre-processing data
x = df[df.Sentiment != 3].Headlines
y = df[df.Sentiment != 3].Sentiment

# Splitting training, validation, testing dataset
x_train, x_validation_and_test, y_train, y_validation_and_test = train_test_split(x, y, test_size=.3,
                                                                                      random_state=SEED)
x_validation, x_test, y_validation, y_test = train_test_split(x_validation_and_test, y_validation_and_test,
                                                                  test_size=.5, random_state=SEED)

tokenizer = Tokenizer(num_words=NUM_WORDS)
tokenizer.fit_on_texts(x_train)

sequences = tokenizer.texts_to_sequences(x_train)
x_train_seq = pad_sequences(sequences, maxlen=MAXLEN)

sequences_val = tokenizer.texts_to_sequences(x_validation)
x_val_seq = pad_sequences(sequences_val, maxlen=MAXLEN)

sequences_test = tokenizer.texts_to_sequences(x_test)
x_test_seq = pad_sequences(sequences_test, maxlen=MAXLEN)

And here is my model:

MAXLEN = 25
NUM_WORDS = 5000
VECTOR_DIMENSION = 100

tweet_input = Input(shape=(MAXLEN,), dtype='int32')

tweet_encoder = Embedding(NUM_WORDS, VECTOR_DIMENSION, input_length=MAXLEN)(tweet_input)

# Combinating n-gram to optimize results
bigram_branch = Conv1D(filters=100, kernel_size=2, padding='valid', activation="relu", strides=1)(tweet_encoder)
bigram_branch = GlobalMaxPooling1D()(bigram_branch)
trigram_branch = Conv1D(filters=100, kernel_size=3, padding='valid', activation="relu", strides=1)(tweet_encoder)
trigram_branch = GlobalMaxPooling1D()(trigram_branch)
fourgram_branch = Conv1D(filters=100, kernel_size=4, padding='valid', activation="relu", strides=1)(tweet_encoder)
fourgram_branch = GlobalMaxPooling1D()(fourgram_branch)
merged = concatenate([bigram_branch, trigram_branch, fourgram_branch], axis=1)

merged = Dense(256, activation="relu")(merged)
merged = Dropout(0.25)(merged)
output = Dense(1, activation="sigmoid")(merged)

optimizer = optimizers.adam(0.01)

model = Model(inputs=[tweet_input], outputs=[output])
model.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=['accuracy'])
model.summary()

# Training the model
history = model.fit(x_train_seq, y_train, batch_size=32, epochs=5, validation_data=(x_val_seq, y_validation))

I also tried to change the number of activations on the final Dense layer from 1 to 2, but I get an error:

Error when checking target: expected dense_12 to have shape (2,) but got array with shape (1,)

回答1:

You are doing binary classification. So you have a Dense layer consisting of one unit with an activation function of sigmoid. Sigmoid function outputs a value in range [0,1] which corresponds to the probability of the given sample belonging to positive class (i.e. class one). Everything below 0.5 is labeled with zero (i.e. negative class) and everything above 0.5 is labeled with one. So to find the predicted class you can do the following:

preds = model.predict(data)
class_one = preds > 0.5

The true elements of class_one correspond to samples labeled with one (i.e. positive class).

Bonus: to find the accuracy of your predictions you can easily compare class_one with the true labels:

acc = np.mean(class_one == true_labels)

Note that I have assumed that true_labels consists of zeros and ones.

Further, if your model were defined using Sequential class, then you could easily use predict_classes method:

pred_labels = model.predict_classes(data)

However, since you are using Keras functional API to construct your model (which is a very good thing to do so, in my opinion), you can't use predict_classes method since it is ill-defined for such models.

来源：https://stackoverflow.com/questions/52018645/how-do-i-determine-the-binary-class-predicted-by-a-convolutional-neural-network

标签

python

machine-learning

keras

deep-learning

text-classification