tensorflow-datasets

TensorFlow pipeline for pickled pandas data input

Submitted by 我的未来我决定 on 2019-12-24 10:16:43
Question: I would like to feed compressed pandas DataFrames (pd.read_pickle(filename, compression='xz')) into a TensorFlow pipeline. I want to use the high-level tf.estimator classifier API, which requires an input function. My data files are large float matrices of shape ~(1400, 16), and each matrix corresponds to a particular type (label). Each type (label) is contained in a different directory, so I know a matrix's label from its directory. At the low level, I know I can populate data using a feed_dict={X…
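
A minimal sketch of one possible tf.data-based input_fn for this setup, assuming a directory layout like data_dir/<label_name>/*.pkl; the helper names, shapes, and shuffle/batch parameters are illustrative, not from the question:

    import glob
    import os

    import pandas as pd
    import tensorflow as tf

    def make_input_fn(data_dir, batch_size=32):
        # Assumed layout: data_dir/<label_name>/*.pkl, one matrix per file.
        label_names = sorted(os.listdir(data_dir))
        label_to_id = {name: i for i, name in enumerate(label_names)}

        def generator():
            for name in label_names:
                for path in glob.glob(os.path.join(data_dir, name, '*.pkl')):
                    df = pd.read_pickle(path, compression='xz')
                    yield df.values.astype('float32'), label_to_id[name]

        def input_fn():
            dataset = tf.data.Dataset.from_generator(
                generator,
                output_types=(tf.float32, tf.int64),
                output_shapes=((1400, 16), ()))  # approximate shape from the question
            dataset = dataset.shuffle(100).batch(batch_size).repeat()
            features, labels = dataset.make_one_shot_iterator().get_next()
            return {'x': features}, labels

        return input_fn

The returned input_fn can then be handed to an estimator in the usual way, e.g. classifier.train(input_fn=make_input_fn('data_dir')).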

FailedPreconditionError: Table already initialized

Submitted by ≯℡__Kan透↙ on 2019-12-24 04:21:07
Question: I am reading data from TFRecords with the Dataset API and converting string data to dummy (indicator) features with the following code:

    SFR1 = tf.feature_column.indicator_column(
        tf.feature_column.categorical_column_with_vocabulary_list(
            "SFR1 ", vocabulary_list=("1", "2")))

But when I run my code, TensorFlow throws the following error:

    tensorflow.python.framework.errors_impl.FailedPreconditionError: Table already initialized.
    [[Node: Generator/input_layer/SFR1 _indicator/SFR1 _lookup/hash_table/table_init =…
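
This error usually means the vocabulary lookup table behind the feature column was created more than once in the same graph, for example by rebuilding the columns inside a loop, a Dataset.map call, or a repeatedly invoked model function. A minimal sketch of the common workaround, constructing the columns a single time and reusing them (the surrounding model_fn is illustrative):

    import tensorflow as tf

    # Build the feature column once, at module level, so its lookup
    # table is created and initialized only once per graph.
    SFR1 = tf.feature_column.indicator_column(
        tf.feature_column.categorical_column_with_vocabulary_list(
            "SFR1 ", vocabulary_list=("1", "2")))
    columns = [SFR1]

    def model_fn(features):
        # Reuse the prebuilt columns; do not recreate them on every call.
        return tf.feature_column.input_layer(features, columns)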

What does experimental in TensorFlow mean?

Submitted by 五迷三道 on 2019-12-24 01:17:22
Question: In the TensorFlow 2.0 APIs there is a module tf.experimental, and the same name also appears in other places such as tf.data.experimental. I would just like to know what the motivation for designing these modules is.

Answer 1: tf.experimental indicates that the class/method in question is in early development, incomplete, or, less commonly, not up to standards. It is a collection of contributions that have not yet been integrated with main TensorFlow but are still available as part of the open-source release for users to test…
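
As a concrete example of this lifecycle, tf.data.experimental.AUTOTUNE started under the experimental namespace and was promoted to tf.data.AUTOTUNE in later releases; the snippet below is a trivial illustration, not from the answer:

    import tensorflow as tf

    dataset = tf.data.Dataset.range(10)
    # Experimental in TF 2.0; promoted to tf.data.AUTOTUNE in later releases.
    dataset = dataset.map(lambda x: x * 2,
                          num_parallel_calls=tf.data.experimental.AUTOTUNE)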

How to pad to fixed BATCH_SIZE in tf.data.Dataset?

Submitted by 心已入冬 on 2019-12-23 13:07:05
Question: I have a dataset with 11 samples, and when I choose a BATCH_SIZE of 2, the following code produces errors:

    dataset = tf.contrib.data.TFRecordDataset(filenames)
    dataset = dataset.map(parser)
    if shuffle:
        dataset = dataset.shuffle(buffer_size=128)
    dataset = dataset.batch(batch_size)
    dataset = dataset.repeat(count=1)

The problem lies in dataset = dataset.batch(batch_size): when the Dataset reaches the last batch, only 1 sample remains, so is there any way to pick…
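
Two common workarounds, sketched below with a stand-in dataset: drop the incomplete final batch with drop_remainder=True, or pad the dataset with recycled samples so every batch is full (the padding policy is an assumption, not from the question):

    import tensorflow as tf

    BATCH_SIZE = 2
    NUM_SAMPLES = 11
    dataset = tf.data.Dataset.range(NUM_SAMPLES)  # stand-in for the real data

    # Option 1: simply drop the incomplete final batch (5 batches of 2).
    dropped = dataset.batch(BATCH_SIZE, drop_remainder=True)

    # Option 2: recycle a few leading samples to fill the last batch (6 batches of 2).
    pad = (-NUM_SAMPLES) % BATCH_SIZE
    padded = dataset.concatenate(dataset.take(pad)).batch(BATCH_SIZE)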

Dataset API for TensorFlow: Variable-sized Input

Submitted by 余生颓废 on 2019-12-23 10:06:39
Question: I have my entire dataset in memory as a list of tuples, where each tuple corresponds to a batch of fixed size N, i.e. (x[i], label[i], length[i]):

x[i]: numpy array of shape [N, W, F]; N examples, each with W timesteps, and every timestep has a fixed number of features F
label[i]: class labels, shape [N,], one per example in the batch
length[i]: number of timesteps (W) in the data, shape [N,], one per example in the batch

Main problem: across the batches, W…
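
A minimal sketch of one way to express this, using Dataset.from_generator with None in the W position of output_shapes so each batch may carry its own timestep count (the variable name data for the in-memory list is assumed):

    import tensorflow as tf

    def make_dataset(data):
        # data: list of (x, label, length) tuples; x has shape [N, W, F]
        # where W may differ from batch to batch.
        def gen():
            for x, label, length in data:
                yield x, label, length

        return tf.data.Dataset.from_generator(
            gen,
            output_types=(tf.float32, tf.int64, tf.int64),
            # None in the W (second) position allows a different W per batch.
            output_shapes=((None, None, None), (None,), (None,)))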

Upgrade to tf.data.Dataset not working properly when parsing CSV

Submitted by 我是研究僧i on 2019-12-23 08:47:15
Question: I have a GCMLE (Google Cloud ML Engine) experiment and I am trying to upgrade my input_fn to use the new tf.data functionality. I have created the following input_fn based off of this sample:

    def input_fn(...):
        # Shuffle the list of input files.
        dataset = tf.data.Dataset.list_files(filenames).shuffle(num_shards)
        # Mix together records from cycle_length shards at a time.
        dataset = dataset.interleave(
            lambda filename: tf.data.TextLineDataset(filename).skip(1)
                .map(lambda row: parse_csv(row, hparams)),
            cycle_length=5)
        if…
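
For reference, a self-contained sketch of the same list_files/interleave pattern with a concrete parse_csv; the column schema, record defaults, and batching are illustrative assumptions, not from the question:

    import tensorflow as tf

    def parse_csv(row):
        # Assumed schema: two float features followed by an integer label.
        feat1, feat2, label = tf.decode_csv(row, record_defaults=[[0.0], [0.0], [0]])
        return {'feat1': feat1, 'feat2': feat2}, label

    def input_fn(filenames, num_shards=5, batch_size=128):
        dataset = tf.data.Dataset.list_files(filenames).shuffle(num_shards)
        dataset = dataset.interleave(
            lambda filename: tf.data.TextLineDataset(filename)
                               .skip(1)        # skip the CSV header row
                               .map(parse_csv),
            cycle_length=num_shards)
        return dataset.shuffle(1000).batch(batch_size)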

How to expand tf.data.Dataset with additional example transformations in Tensorflow

Submitted by 被刻印的时光 ゝ on 2019-12-23 04:37:27
Question: I would like to double the size of an existing dataset I'm using to train a neural network in TensorFlow, on the fly, by adding random noise to it, so that when I'm done I'll have all the existing examples plus all the examples with noise added to them. I'd also like to interleave these as I transform them, so they come out in this order: example 1 without noise, example 1 with noise, example 2 without noise, example 2 with noise, etc. I'm struggling to accomplish this using the Dataset API. I…
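
A minimal sketch of this interleaving with Dataset.flat_map, which replaces each example by a two-element dataset (the clean example, then a noisy copy); the noise distribution and scale are assumptions:

    import tensorflow as tf

    def with_noisy_copy(example):
        # Emit the original example followed immediately by a noisy copy.
        noisy = example + tf.random_normal(tf.shape(example), stddev=0.1)
        return tf.data.Dataset.from_tensors(example).concatenate(
            tf.data.Dataset.from_tensors(noisy))

    dataset = tf.data.Dataset.from_tensor_slices(
        tf.random_uniform([4, 3]))               # stand-in for the real examples
    dataset = dataset.flat_map(with_noisy_copy)  # e1, e1+noise, e2, e2+noise, ...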

TensorFlow dataset questions about .shuffle, .batch and .repeat

Submitted by 穿精又带淫゛_ on 2019-12-23 01:35:22
Question: I have a question about the use of batch, repeat, and shuffle with tf.data.Dataset. It is not clear to me exactly how repeat and shuffle are used. I understand that .batch dictates how many training examples undergo each step of stochastic gradient descent, but the uses of .repeat and .shuffle are still not clear to me. First question: even after reviewing here and here, my understanding is that .repeat is used to reiterate over the dataset once tf.errors.OutOfRangeError is thrown. Therefore, in my code, does that mean I no longer…
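
For orientation, a sketch of the conventional ordering, shuffle before repeat before batch, which reshuffles every epoch and yields batches indefinitely instead of raising tf.errors.OutOfRangeError; the buffer and batch sizes are illustrative:

    import tensorflow as tf

    dataset = (tf.data.Dataset.range(100)
               .shuffle(buffer_size=100)  # reshuffled at the start of each epoch
               .repeat()                  # loop forever; no OutOfRangeError
               .batch(16))                # 16 examples per gradient step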