Binary classification with softmax activation always outputs 1

Question

Sorry for the quality of the question; I'm a beginner. I was just trying my luck with the Titanic dataset, but the model always predicts that the passenger died. I'll try to explain the code below:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import losses
from tensorflow.keras.layers.experimental import preprocessing

import os

Load dataset

dataset_dir = os.path.join(os.getcwd(), 'titanic')
train_url = os.path.join(dataset_dir, 'train.csv')
test_url = os.path.join(dataset_dir, 'test.csv')


raw_train_dataset = pd.read_csv(train_url)
raw_test_dataset = pd.read_csv(test_url)


train = raw_train_dataset.copy()
test = raw_test_dataset.copy()

Dropping some columns (I may be wrong here)

train = train.drop(['Cabin', 'Name', 'Ticket'], axis=1)
test = test.drop(['Cabin', 'Name', 'Ticket'], axis=1)

One-hot encoding

train = pd.get_dummies(train, prefix='', prefix_sep='')
test = pd.get_dummies(test, prefix='', prefix_sep='')

Training labels

train_predict = train.pop('Survived')

Filling null ages with mean

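# Assign the result back instead of using inplace=True on a column selection,
# which may not modify the DataFrame in recent pandas versions.
train['Age'] = train['Age'].fillna(train['Age'].mean())
test['Age'] = test['Age'].fillna(train['Age'].mean())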

Dropping rows with null values

test = test.dropna()
train = train.dropna()

Creating Normalisation layer

normalizer = preprocessing.Normalization()
normalizer.adapt(np.array(train))

Creating the DNN (am I wrong here?)

model = keras.Sequential([
    normalizer,
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(1)
])



model.compile(loss=losses.BinaryCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=tf.metrics.BinaryAccuracy(threshold=0.0))


history = model.fit(
    train, train_predict,
    validation_split=0.2,
    epochs=30)

This prints 1 in every case, yet I still get an accuracy of 85% when training. I don't need a complete solution to the problem (I want to try on my own), just help with the part where I am stuck.

result = tf.nn.softmax(model(train))
print(result)

Answer 1:

tf.nn.softmax always returns an array whose elements sum to 1. Since your output is a single value (you have one unit in your final/output layer), softmax transforms that value to 1, whatever it was.

for value in [.2, .999, .0001, 100., -100.]:
    print(tf.nn.softmax([value]))

tf.Tensor([1.], shape=(1,), dtype=float32)
tf.Tensor([1.], shape=(1,), dtype=float32)
tf.Tensor([1.], shape=(1,), dtype=float32)
tf.Tensor([1.], shape=(1,), dtype=float32)
tf.Tensor([1.], shape=(1,), dtype=float32)
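
The reason can be read straight off the softmax definition. As a short worked equation (z is the vector of n raw outputs), the single-output case n = 1 collapses to a constant:

```latex
\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}
\qquad\Longrightarrow\qquad
n = 1:\;\; \operatorname{softmax}(z)_1 = \frac{e^{z_1}}{e^{z_1}} = 1
```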

What you're looking for is tf.nn.sigmoid:

for value in [.2, .999, .0001, 100., -100.]:
    print(tf.nn.sigmoid([value]))

tf.Tensor([0.549834], shape=(1,), dtype=float32)
tf.Tensor([0.7308619], shape=(1,), dtype=float32)
tf.Tensor([0.500025], shape=(1,), dtype=float32)
tf.Tensor([1.], shape=(1,), dtype=float32)
tf.Tensor([0.], shape=(1,), dtype=float32)

losses.BinaryCrossentropy(from_logits=True) applies the sigmoid internally, so it behaves like a sigmoid cross-entropy computed on the model's raw outputs (logits).
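
A minimal sketch of that equivalence, assuming made-up labels and logits (`y_true` and `logits` are illustrative values, not taken from the Titanic model). It also checks that a logit threshold of 0.0, as used in BinaryAccuracy(threshold=0.0), should give the same decision as a probability threshold of 0.5 after the sigmoid:

```python
import tensorflow as tf
from tensorflow.keras import losses

# Hypothetical labels and raw model outputs (logits), for illustration only.
y_true = tf.constant([[1.0], [0.0], [1.0], [0.0]])
logits = tf.constant([[2.0], [-1.5], [-0.3], [0.7]])

# Loss computed directly on logits.
loss_from_logits = losses.BinaryCrossentropy(from_logits=True)(y_true, logits)

# Same loss computed on probabilities obtained with an explicit sigmoid.
probs = tf.nn.sigmoid(logits)
loss_from_probs = losses.BinaryCrossentropy(from_logits=False)(y_true, probs)

# Low-level sigmoid cross-entropy, averaged over the batch.
loss_raw = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=logits))

# logit > 0 and sigmoid(logit) > 0.5 select the same examples.
same_decision = tf.reduce_all(tf.equal(logits > 0.0, probs > 0.5))

print(float(loss_from_logits), float(loss_from_probs), float(loss_raw))
print(bool(same_decision))
```

The three loss values should agree up to floating-point error, which is why the model can train well on logits while a softmax applied afterwards still prints 1 everywhere.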

If you want to round the values to get 0 or 1, use tf.round:
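
A minimal sketch of that rounding step, assuming made-up logits in place of model(train) (the `logits` values below are hypothetical):

```python
import tensorflow as tf

# Hypothetical raw outputs standing in for model(train).
logits = tf.constant([[2.0], [-1.5], [0.3], [-0.2]])

probabilities = tf.nn.sigmoid(logits)   # values strictly between 0 and 1
predictions = tf.round(probabilities)   # 1.0 where probability > 0.5, else 0.0

print(predictions.numpy().ravel())
```

With the real model, the same two calls would be applied to model(train) instead of the hard-coded tensor.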