tensorflow-datasets | 易学教程

Shuffling tfrecords files

Question: I have 5 tfrecords files, one for each object. While training I want to read data equally from all 5 files, i.e. if my batch size is 50, I should get 10 samples from the first tfrecords file, 10 samples from the second, and so on. Currently, it just reads sequentially from the files, i.e. I get 50 samples from the same file. Is there a way to sample from different tfrecords files? Answer 1: I advise you to read the tutorial by @mrry on tf.data. On slide 42 he explains how
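
One way to picture the batch composition asked for here is block-wise round-robin interleaving: take 10 records from the first source, 10 from the second, and so on, then cut the resulting stream into batches of 50. The sketch below models this in plain Python, with lists standing in for the five tfrecords files. `interleave_blocks` and `batched` are hypothetical helper names written for illustration, not TensorFlow functions. In tf.data the analogous transformation is `Dataset.interleave` with `cycle_length=5` and `block_length=10`, followed by `batch(50)`; whether that is what the truncated answer goes on to recommend is an assumption.

```python
from itertools import islice


def interleave_blocks(sources, block_length):
    """Yield block_length items from each source in turn until all run dry."""
    iterators = [iter(s) for s in sources]
    while iterators:
        still_active = []
        for it in iterators:
            block = list(islice(it, block_length))
            if block:
                yield from block
            # A short (or empty) block means this source is exhausted.
            if len(block) == block_length:
                still_active.append(it)
        iterators = still_active


def batched(stream, batch_size):
    """Group a stream into lists of batch_size items (the last may be shorter)."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Five stand-in "files": each record is tagged with the index of its file.
files = [[(file_id, i) for i in range(20)] for file_id in range(5)]
batches = list(batched(interleave_blocks(files, block_length=10), batch_size=50))

# Count how many records of the first batch came from each file.
first_batch_counts = [sum(1 for fid, _ in batches[0] if fid == k) for k in range(5)]
```

Note that without a shuffle step after the interleave, each batch keeps the records grouped by file; the per-file counts are equal, but the order inside the batch is not random.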

Inference with a model trained with tf.Dataset

Question: I have trained a model using the tf.data.Dataset API, so my training code looks something like this: with graph.as_default(): dataset = tf.data.TFRecordDataset(tfrecord_path) dataset = dataset.map(scale_features, num_parallel_calls=n_workers) dataset = dataset.shuffle(10000) dataset = dataset.padded_batch(batch_size, padded_shapes={...}) handle = tf.placeholder(tf.string, shape=[]) iterator = tf.data.Iterator.from_string_handle(handle, train_dataset.output_types, train_dataset.output_shapes)

“TypeError: 'Tensor' object is not iterable” error with tensorflow Estimator

Question: I have a procedurally generated (infinite) data source and am trying to use it as input to the high-level TensorFlow Estimator to train an image-based 3D object detector. I set up the Dataset just as in the TensorFlow Estimator Quickstart, and my dataset_input_fn returns a tuple of features and labels Tensors, just as the Estimator.train function specifies and as this tutorial shows, but I am getting an error when trying to call the train function: TypeError: 'Tensor' object is not

How to have predictions AND labels returned with tf.estimator (either with predict or eval method)?

Question: I am working with TensorFlow 1.4. I created a custom tf.estimator in order to do classification, like this: def model_fn(): # Some operations here [...] return tf.estimator.EstimatorSpec(mode=mode, predictions={"Preds": predictions}, loss=cost, train_op=loss, eval_metric_ops=eval_metric_ops, training_hooks=[summary_hook]) my_estimator = tf.estimator.Estimator(model_fn=model_fn, params=model_params, model_dir='/my/directory') I can train it easily: input_fn = create_train_input_fn(path=train

How to cache data during the first epoch correctly (Tensorflow, dataset)?

Question: I'm trying to use the cache transformation for a dataset. Here is my current code (simplified): dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=1) dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(buffer_size=5000, count=1)) dataset = dataset.map(_parser_a, num_parallel_calls=12) dataset = dataset.padded_batch( 20, padded_shapes=padded_shapes, padding_values=padding_values ) dataset = dataset.prefetch(buffer_size=1) dataset = dataset.cache() After the first epoch, I

How to use Keras generator with tf.data API

Question: I am trying to use the generator found in the Keras preprocessing library. I wanted to experiment with this since Keras provides great functions for image augmentation. However, I am not sure if this is actually possible. Here is how I made a tf dataset from the Keras generator: def make_generator(): train_datagen = ImageDataGenerator(rescale=1. / 255) train_generator = train_datagen.flow_from_directory(train_dataset_folder,target_size=(224, 224), class_mode='categorical', batch_size=32) return

How to switch between training and validation dataset with tf.MonitoredTrainingSession?

Question: I want to use the feedable iterator design in the TensorFlow Dataset API, so I can switch to validation data after some training steps. But if I switch to validation data, it ends the whole session. The following code demonstrates what I want to do: import tensorflow as tf graph = tf.Graph() with graph.as_default(): training_ds = tf.data.Dataset.range(32).batch(4) validation_ds = tf.data.Dataset.range(8).batch(4) handle = tf.placeholder(tf.string, shape=[]) iterator = tf.data.Iterator.from

How to use py_func with a function that returns dict

Question: I'm writing an input pipeline using tf.data.Dataset. I'd like to use Python code to load and transform my samples; the code returns a dictionary of tensors. Unfortunately I don't see how I can define that as the output type that is passed to tf.py_func. I have a workaround where my function returns a list of tensors instead of a dictionary, but it makes my code less readable as I have 4 keys in that dict. The code looks roughly as follows: file_list = .... def load(file_name): return {"image":
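
The workaround described in the question, returning a flat list and rebuilding the dictionary afterwards, can stay readable if the key order is fixed in one place. The sketch below shows that idea in plain Python, without TensorFlow. Only the "image" key comes from the question; the other three keys and the helpers `to_flat` and `to_dict` are hypothetical names chosen for illustration.

```python
# Only "image" appears in the question; the remaining keys are made up.
KEYS = ("image", "label", "width", "height")


def to_flat(sample):
    """Turn a dict into a list ordered by KEYS."""
    return [sample[key] for key in KEYS]


def to_dict(*values):
    """Rebuild the dict from positional values given in KEYS order."""
    if len(values) != len(KEYS):
        raise ValueError("expected %d values, got %d" % (len(KEYS), len(values)))
    return dict(zip(KEYS, values))


sample = {"image": [[0, 1], [2, 3]], "label": 7, "width": 2, "height": 2}
flat = to_flat(sample)      # what a list-returning loader would hand back
restored = to_dict(*flat)   # what a follow-up mapping step would rebuild
```

In a tf.data pipeline, the assumption is that `to_flat` would wrap the loader passed to tf.py_func and `to_dict` would run in a subsequent `map` call, so the rest of the pipeline sees named fields again.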

Tensorflow Data API - prefetch

Question: I am trying to use new features of TF, namely the Data API, and I am not sure how prefetch works. In the code below def dataset_input_fn(...) dataset = tf.data.TFRecordDataset(filenames, compression_type="ZLIB") dataset = dataset.map(lambda x:parser(...)) dataset = dataset.map(lambda x,y: image_augmentation(...) , num_parallel_calls=num_threads ) dataset = dataset.shuffle(buffer_size) dataset = dataset.batch(batch_size) dataset = dataset.repeat(num_epochs) iterator = dataset.make_one_shot
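
Conceptually, `prefetch(n)` decouples the producer of elements from their consumer through a bounded buffer of `n` elements, so that preparing the next elements can overlap with the training step. The plain-Python sketch below imitates that idea with a background thread and a `queue.Queue`. The `prefetch` function here is a hypothetical stand-in written for illustration, not TensorFlow's implementation, and it omits error handling.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the upstream iterator


def prefetch(iterable, buffer_size):
    """Yield items in order while a background thread buffers up to
    buffer_size of them ahead of the consumer."""
    buffer = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            buffer.put(item)  # blocks while the buffer is full
        buffer.put(_END)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()  # blocks until the producer has something ready
        if item is _END:
            return
        yield item


# Stand-in pipeline: each "batch" is just a list of two integers.
batches = ([i, i + 1] for i in range(0, 10, 2))
result = list(prefetch(batches, buffer_size=1))
```

Because the buffer holds whatever the upstream stage emits, placing the real `prefetch` after `batch` buffers whole batches, which is why it is usually the last transformation in a tf.data pipeline.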