tensorflow-datasets

How to apply data augmentation in TensorFlow 2.0 after tfds.load()

旧城冷巷雨未停 submitted on 2020-05-23 08:25:29
Question: I'm following this guide. It shows how to download datasets from the new TensorFlow Datasets using the tfds.load() method:

    import tensorflow_datasets as tfds

    SPLIT_WEIGHTS = (8, 1, 1)
    splits = tfds.Split.TRAIN.subsplit(weighted=SPLIT_WEIGHTS)
    (raw_train, raw_validation, raw_test), metadata = tfds.load(
        'cats_vs_dogs', split=list(splits),
        with_info=True, as_supervised=True)

The next step shows how to apply a function to each item in the dataset using the map method:

    def format_example(image, label):

Feeding integer CSV data to a Keras Dense first layer in sequential model

可紊 submitted on 2020-05-17 06:47:05
Question: The documentation for CSV Datasets stops short of showing how to use a CSV dataset for anything practical, such as training a neural network on the data. Can anyone provide a straightforward example that demonstrates how to do this, with clarity around data shape and type issues at a minimum, and preferably covering batching, shuffling, and repeating over epochs as well? For example, I have a CSV file of M rows, each row being an integer class label followed by N integers from which I hope to
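One framework-independent way to picture the shapes involved, as a minimal sketch: each CSV row becomes one integer label plus a list of N integer features, and batches are built by reshuffling the row order once per epoch. The names `load_rows` and `batches` and the inline sample data are hypothetical, chosen only for illustration. Converting each batch to arrays and passing it to a Keras model is left out.

```python
import csv
import io
import random


def load_rows(csv_file):
    """Parse rows of 'label,f1,...,fN' into (features, labels) lists of ints."""
    features, labels = [], []
    for row in csv.reader(csv_file):
        if not row:  # skip blank lines
            continue
        values = [int(v) for v in row]
        labels.append(values[0])      # first column is the class label
        features.append(values[1:])   # remaining N columns are the inputs
    return features, labels


def batches(features, labels, batch_size, epochs, seed=0):
    """Yield (feature_batch, label_batch) pairs, reshuffling every epoch."""
    rng = random.Random(seed)
    order = list(range(len(labels)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            yield [features[i] for i in idx], [labels[i] for i in idx]


# Hypothetical sample: M = 4 rows, each a class label followed by N = 3 integers.
sample = io.StringIO("0,1,2,3\n1,4,5,6\n0,7,8,9\n1,10,11,12\n")
features, labels = load_rows(sample)
all_batches = list(batches(features, labels, batch_size=2, epochs=3))
```

Here the features have shape (M, N) and the labels have shape (M,), so a Dense first layer would need an input width of N. Integer inputs are typically cast to floats before being fed to the model.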

Tensorflow Datasets with string inputs do not preserve data type

只愿长相守 submitted on 2020-04-14 06:17:15
Question: All reproducible code below is run at Google Colab with TF 2.2.0-rc2. Adapting the simple example from the documentation for creating a dataset from a simple Python list:

    import numpy as np
    import tensorflow as tf

    tf.__version__      # '2.2.0-rc2'
    np.version.version  # '1.18.2'

    dataset1 = tf.data.Dataset.from_tensor_slices([1, 2, 3])
    for element in dataset1:
        print(element)
        print(type(element.numpy()))

we get the result:

    tf.Tensor(1, shape=(), dtype=int32)
    <class 'numpy.int32'>
    tf.Tensor(2, shape=(),

Keras model fails to decrease loss

﹥>﹥吖頭↗ submitted on 2020-04-13 06:13:08
Question: I present an example in which a tf.keras model fails to learn from very simple data. I'm using tensorflow-gpu==2.0.0, keras==2.3.0 and Python 3.7. At the end of my post, I give the Python code to reproduce the problem I observed. Data: The samples are NumPy arrays of shape (6, 16, 16, 16, 3). To make things very simple, I only consider arrays full of 1s and 0s. Arrays with 1s are given the label 1 and arrays with 0s are given the label 0. I can generate some samples (in the following, n

Get data set as numpy array from TFRecordDataset

纵然是瞬间 submitted on 2020-04-10 18:10:24
Question: I'm using the new tf.data API to create an iterator for the CIFAR10 dataset. I'm reading the data from two .tfrecord files: one holds the training data (train.tfrecords) and the other holds the test data (test.tfrecords). This all works fine. At some point, however, I need both data sets (training data and test data) as NumPy arrays. Is it possible to retrieve a data set as a NumPy array from a tf.data.TFRecordDataset object? Answer 1: You can use the tf.data.Dataset.batch()

K fold cross validation Tensorflow Object Detection

不打扰是莪最后的温柔 submitted on 2020-03-03 14:00:35
Question: I want to evaluate my model using K-Fold Cross Validation (k=5). This means that the dataset must be split into 5 parts (p1, p2, p3, p4, p5) and then:

(run1) Train: p1,p2,p3,p4 Eval: p5
(run2) Train: p1,p2,p3,p5 Eval: p4
(run3) Train: p1,p2,p4,p5 Eval: p3
(run4) Train: p1,p3,p4,p5 Eval: p2
(run5) Train: p2,p3,p4,p5 Eval: p1

At the end, I calculate the mean over all the evaluations. This is essentially K-Fold Cross Validation. Right now, what I am doing is to regenerate .tf records each
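The fold rotation described in this question can be sketched in plain Python, independent of how the parts are stored. This is a minimal illustration in which each part is held out for evaluation exactly once. The helper names `k_fold_splits` and `mean` are hypothetical, and the part labels stand in for whatever each part really is (for instance one record file per part, which is an assumption).

```python
def k_fold_splits(parts):
    """Yield (train_parts, eval_part), holding out each part exactly once.

    Parts are held out from last to first, so run 1 evaluates on the
    final part, matching the run order listed in the question.
    """
    for held_out in reversed(range(len(parts))):
        train = [p for i, p in enumerate(parts) if i != held_out]
        yield train, parts[held_out]


def mean(values):
    """Average the per-run evaluation scores."""
    return sum(values) / len(values)


parts = ["p1", "p2", "p3", "p4", "p5"]
runs = list(k_fold_splits(parts))
```

If each part is written to its own file once, every run only needs a different list of file names, so nothing has to be regenerated between runs.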

Split train data to train and validation by using tensorflow_datasets.load (TF 2.1)

本秂侑毒 submitted on 2020-02-03 10:28:31
Question: I'm trying to run the following Colab project, but when I want to split the training data into validation and train parts I get this error:

    KeyError: "Invalid split train[:70%]. Available splits are: ['train']"

I use the following code:

    (training_set, validation_set), dataset_info = tfds.load(
        'tf_flowers',
        split=['train[:70%]', 'train[70%:]'],
        with_info=True,
        as_supervised=True,
    )

How can I fix this error? Answer 1: According to the TensorFlow Datasets docs the percentage splitting is possible as
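If percentage slicing in the split string is not available in the installed version, one alternative (not necessarily what the answer above goes on to recommend) is to load the single 'train' split and divide it by example count. The sketch below assumes the dataset object exposes `take` and `skip` methods, as `tf.data.Dataset` does, and that it yields examples in the same order each time it is read. The helper names `split_counts` and `split_dataset` are hypothetical.

```python
def split_counts(num_examples, train_percent):
    """Return (train_size, validation_size) using integer arithmetic."""
    if not 0 <= train_percent <= 100:
        raise ValueError("train_percent must be between 0 and 100")
    train_size = num_examples * train_percent // 100
    return train_size, num_examples - train_size


def split_dataset(dataset, num_examples, train_percent):
    """Split any object exposing take()/skip(), such as a tf.data.Dataset."""
    train_size, _ = split_counts(num_examples, train_percent)
    # The first train_size examples go to training, the rest to validation.
    return dataset.take(train_size), dataset.skip(train_size)


train_size, validation_size = split_counts(1000, 70)
```

With tfds, the example count could be read from `dataset_info.splits['train'].num_examples`. If the read order changes between passes, examples can leak between the two parts, so this only makes sense when the order is fixed.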