tensorflow-datasets

Shuffling tfrecords files

我是研究僧i submitted on 2019-12-22 10:02:33
Question: I have 5 tfrecords files, one for each object. While training I want to read data equally from all 5 tfrecords files, i.e. if my batch size is 50, I should get 10 samples from the 1st tfrecords file, 10 samples from the second tfrecords file, and so on. Currently, it just reads sequentially from the files, i.e. I get 50 samples from the same record. Is there a way to sample from different tfrecords files? Answer 1: I advise you to read the tutorial by @mrry on tf.data. On slide 42 he explains how
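The balanced batching asked for above can be pictured without any TensorFlow API: take a fixed share from each source in turn and concatenate the shares into one batch. The following is a minimal plain-Python sketch under that assumption. `balanced_batches` and the in-memory `files` lists are hypothetical stand-ins, not part of tf.data and not the approach from the truncated answer. In tf.data, the analogous idea would be to batch each per-file dataset separately and combine the results, for example with `tf.data.Dataset.zip`.

```python
from itertools import islice


def balanced_batches(sources, per_source):
    """Yield batches that take `per_source` items from every source in turn.

    Stops as soon as any source cannot supply a full share, so every
    emitted batch is evenly balanced across sources.
    """
    iterators = [iter(s) for s in sources]
    while True:
        batch = []
        for it in iterators:
            chunk = list(islice(it, per_source))
            if len(chunk) < per_source:
                return  # one source is exhausted; drop the partial batch
            batch.extend(chunk)
        yield batch


# Five hypothetical "files", each holding 20 records tagged with a file index.
files = [[(file_idx, rec) for rec in range(20)] for file_idx in range(5)]
batches = list(balanced_batches(files, per_source=10))
```

With five sources and `per_source=10`, each batch holds 50 records, 10 from each source. Shuffling within each source before batching is left out to keep the sketch short.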

Inference with a model trained with tf.Dataset

青春壹個敷衍的年華 submitted on 2019-12-22 06:24:54
Question: I have trained a model using the tf.data.Dataset API, so my training code looks something like this:

    with graph.as_default():
        dataset = tf.data.TFRecordDataset(tfrecord_path)
        dataset = dataset.map(scale_features, num_parallel_calls=n_workers)
        dataset = dataset.shuffle(10000)
        dataset = dataset.padded_batch(batch_size, padded_shapes={...})
        handle = tf.placeholder(tf.string, shape=[])
        iterator = tf.data.Iterator.from_string_handle(handle, train_dataset.output_types, train_dataset.output_shapes)

“TypeError: 'Tensor' object is not iterable” error with tensorflow Estimator

自作多情 submitted on 2019-12-22 05:09:05
Question: I have a procedurally generated (infinite) data source and am trying to use this as input to the high-level TensorFlow Estimator to train an image-based 3D object detector. I set up the Dataset just as in the TensorFlow Estimator Quickstart, and my dataset_input_fn returns a tuple of features and labels Tensors, just as the Estimator.train function specifies, and how this tutorial shows, but I am getting an error when trying to call the train function: TypeError: 'Tensor' object is not

How to have predictions AND labels returned with tf.estimator (either with predict or eval method)?

爱⌒轻易说出口 submitted on 2019-12-22 04:29:46
Question: I am working with TensorFlow 1.4. I created a custom tf.estimator in order to do classification, like this:

    def model_fn():
        # Some operations here [...]
        return tf.estimator.EstimatorSpec(mode=mode, predictions={"Preds": predictions}, loss=cost, train_op=loss, eval_metric_ops=eval_metric_ops, training_hooks=[summary_hook])

    my_estimator = tf.estimator.Estimator(model_fn=model_fn, params=model_params, model_dir='/my/directory')

I can train it easily:

    input_fn = create_train_input_fn(path=train

How to cache data during the first epoch correctly (Tensorflow, dataset)?

给你一囗甜甜゛ submitted on 2019-12-21 19:56:56
Question: I'm trying to use the cache transformation for a dataset. Here is my current code (simplified):

    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=1)
    dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(buffer_size=5000, count=1))
    dataset = dataset.map(_parser_a, num_parallel_calls=12)
    dataset = dataset.padded_batch(20, padded_shapes=padded_shapes, padding_values=padding_values)
    dataset = dataset.prefetch(buffer_size=1)
    dataset = dataset.cache()

After the first epoch, I
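As a rough illustration of why the position of a cache step matters, here is a plain-Python sketch (not tf.data code) in which the cache sits directly after the expensive parse step and before shuffling. Later epochs then skip parsing but can still be reshuffled. If the cache came after the shuffle instead, the stored order would simply be replayed. `CachedPipeline`, `parse`, and `epoch` are hypothetical names used only for this illustration.

```python
import random

stats = {"parse_calls": 0}


def parse(x):
    """Stand-in for an expensive parsing step; counts how often it runs."""
    stats["parse_calls"] += 1
    return x * 2


class CachedPipeline:
    """Runs `make_iter` once, stores the results, and replays them afterwards."""

    def __init__(self, make_iter):
        self._make_iter = make_iter
        self._cache = None

    def __iter__(self):
        if self._cache is not None:
            yield from self._cache
            return
        items = []
        for item in self._make_iter():
            items.append(item)
            yield item
        self._cache = items  # only filled after a complete first pass


cached = CachedPipeline(lambda: (parse(x) for x in range(5)))


def epoch(seed):
    """One epoch: read the (possibly cached) parsed items, then shuffle them."""
    items = list(cached)
    random.Random(seed).shuffle(items)
    return items


first = epoch(seed=0)
second = epoch(seed=1)
```

After both epochs, `stats["parse_calls"]` is 5 rather than 10, because the second epoch is served from the cache.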

How to use Keras generator with tf.data API

Deadly submitted on 2019-12-21 05:38:21
Question: I am trying to use the generator found in the Keras preprocessing library. I wanted to experiment with this since Keras provides great functions for image augmentation. However, I am not sure if this is actually possible. Here is how I made a tf dataset from the Keras generator:

    def make_generator():
        train_datagen = ImageDataGenerator(rescale=1. / 255)
        train_generator = train_datagen.flow_from_directory(train_dataset_folder, target_size=(224, 224), class_mode='categorical', batch_size=32)
        return

How to switch between training and validation dataset with tf.MonitoredTrainingSession?

假如想象 submitted on 2019-12-20 21:42:10
Question: I want to use the feedable iterator design in the TensorFlow Dataset API, so I can switch to validation data after some training steps. But if I switch to validation data, it ends the whole session. The following code demonstrates what I want to do:

    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        training_ds = tf.data.Dataset.range(32).batch(4)
        validation_ds = tf.data.Dataset.range(8).batch(4)
        handle = tf.placeholder(tf.string, shape=[])
        iterator = tf.data.Iterator.from

How to use py_func with a function that returns dict

99封情书 submitted on 2019-12-20 21:24:10
Question: I'm writing an input pipeline using tf.data.Dataset. I'd like to use Python code to load and transform my samples; the code returns a dictionary of tensors. Unfortunately I don't see how I can define that as the output type that is passed to tf.py_func. I have a workaround where my function returns a list of tensors instead of a dictionary, but it makes my code less readable as I have 4 keys in that dict. The code looks roughly as follows:

    file_list = ....

    def load(file_name):
        return {"image":
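One way to keep the list-returning workaround readable is to convert only at the boundary: flatten the dictionary into a tuple with a fixed key order before the step that cannot carry a dict, and rebuild the dictionary right after it. The sketch below is plain Python with no TensorFlow calls. The key names in `KEYS`, the stand-in loader, and both helpers are hypothetical, since the question shows only the "image" key.

```python
# Hypothetical key order; only "image" appears in the question itself.
KEYS = ("image", "label", "width", "height")


def load_as_dict(file_name):
    """Stand-in loader returning a dict with the four assumed keys."""
    return {"image": [0, 0, 0, 0], "label": 1, "width": 2, "height": 2}


def dict_to_tuple(sample):
    """Flatten a sample dict into a tuple following the fixed KEYS order."""
    return tuple(sample[key] for key in KEYS)


def tuple_to_dict(*values):
    """Rebuild the sample dict from positional values in KEYS order."""
    return dict(zip(KEYS, values))


sample = load_as_dict("example.bin")
flat = dict_to_tuple(sample)      # what a list/tuple-only step would return
restored = tuple_to_dict(*flat)   # what a follow-up mapping step would rebuild
```

Because `KEYS` is the single source of truth for the ordering, the rest of the pipeline can keep addressing fields by name.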

Tensorflow Data API - prefetch

ぃ、小莉子 submitted on 2019-12-20 18:33:43
Question: I am trying to use new features of TF, namely the Data API, and I am not sure how prefetch works. In the code below

    def dataset_input_fn(...):
        dataset = tf.data.TFRecordDataset(filenames, compression_type="ZLIB")
        dataset = dataset.map(lambda x: parser(...))
        dataset = dataset.map(lambda x, y: image_augmentation(...), num_parallel_calls=num_threads)
        dataset = dataset.shuffle(buffer_size)
        dataset = dataset.batch(batch_size)
        dataset = dataset.repeat(num_epochs)
        iterator = dataset.make_one_shot
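A conceptual model of prefetching, as a plain-Python sketch rather than TensorFlow's actual implementation: a background producer keeps a small bounded buffer filled, so the consumer rarely has to wait for the next element. The `prefetch` generator below is a hypothetical helper built on the standard library's `queue` and `threading` modules. Its `buffer_size` plays the role of the buffer size passed to a dataset's prefetch step, counted in whatever unit the upstream stage produces (whole batches, if it follows batching).

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the upstream iterable


def prefetch(iterable, buffer_size):
    """Iterate over `iterable` while a background thread keeps up to
    `buffer_size` items ready in a bounded queue."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            buf.put(item)  # blocks while the buffer is full
        buf.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()  # blocks only if the producer has fallen behind
        if item is _DONE:
            return
        yield item


# Upstream stage producing three small "batches".
upstream = ([i, i + 1] for i in range(0, 6, 2))
batches = list(prefetch(upstream, buffer_size=1))
```

The order of elements is unchanged; only the timing differs, since production of the next element overlaps with consumption of the current one. Error propagation from the producer thread is omitted here for brevity.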