Tensorflow keras with tf dataset input

最后都变了- 提交于 2019-12-12 14:49:42

问题


I'm new to tensorflow keras and dataset. Can anyone help me understand why the following code doesn't work?

import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.python.data.ops import dataset_ops
from tensorflow.python.data.ops import iterator_ops
from tensorflow.python.keras.utils import multi_gpu_model
from tensorflow.python.keras import backend as K


data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
print( dataset)
print( dataset.output_types)
print( dataset.output_shapes)
dataset.batch(10)
dataset.repeat(100)

inputs = keras.Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)

# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
          loss='categorical_crossentropy',
          metrics=['accuracy'])

# Trains for 5 epochs
model.fit(dataset, epochs=5, steps_per_epoch=100)

It failed with the following error:

model.fit(x=dataset, y=None, epochs=5, steps_per_epoch=100)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1510, in fit
validation_split=validation_split)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 994, in _standardize_user_data
class_weight, batch_size)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1113, in _standardize_weights
exception_prefix='input')
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training_utils.py", line 325, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)

According to tf.keras guide, I should be able to directly pass the dataset to model.fit, as this example shows:

Input tf.data datasets

Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method:

# Instantiates a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()

Don't forget to specify steps_per_epoch when calling fit on a dataset.

model.fit(dataset, epochs=10, steps_per_epoch=30) Here, the fit method uses the steps_per_epoch argument—this is the number of training steps the model runs before it moves to the next epoch. Since the Dataset yields batches of data, this snippet does not require a batch_size.

Datasets can also be used for validation:

dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()

val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()

model.fit(dataset, epochs=10, steps_per_epoch=30,
      validation_data=val_dataset,
      validation_steps=3)

What's the problem with my code, and what's the correct way of doing it?


回答1:


To your original question as to why you're getting the error:

Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)

The reason your code breaks is because you haven't applied the .batch() back to the dataset variable, like so:

dataset = dataset.batch(10)

You simply called dataset.batch().

This breaks because without the batch() the output tensors are not batched, i.e. you get shape (32,) instead of (1,32).




回答2:


You are missing defining an iterator which is the reason why there is an error.

data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
dataset = dataset.batch(10).repeat()
inputs = Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
          loss='categorical_crossentropy',
          metrics=['accuracy'])

# Trains for 5 epochs
model.fit(dataset.make_one_shot_iterator(), epochs=5, steps_per_epoch=100)

Epoch 1/5 100/100 [==============================] - 1s 8ms/step - loss: 11.5787 - acc: 0.1010

Epoch 2/5 100/100 [==============================] - 0s 4ms/step - loss: 11.4846 - acc: 0.0990

Epoch 3/5 100/100 [==============================] - 0s 4ms/step - loss: 11.4690 - acc: 0.1270

Epoch 4/5 100/100 [==============================] - 0s 4ms/step - loss: 11.4611 - acc: 0.1300

Epoch 5/5 100/100 [==============================] - 0s 4ms/step - loss: 11.4546 - acc: 0.1360

This is the result on my system.



来源:https://stackoverflow.com/questions/52736517/tensorflow-keras-with-tf-dataset-input

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!