tfrecord

tensorflow: Reading time series data from TFRecord

[亡魂溺海] submitted on 2019-12-11 18:17:24
Question: I'm using a SequenceExample protobuf to read/write time-series data into a TFRecord file. I serialized a pair of np arrays as follows:

    writer = tf.python_io.TFRecordWriter(file_name)
    context = tf.train.Features( ... Feature( ... ) ... )
    feature_data = tf.train.FeatureList(feature=[
        tf.train.Feature(float_list=tf.train.FloatList(value=
            np.random.normal(size=([4065000,]))))])
    labels = tf.train.FeatureList(feature=[
        tf.train.Feature(int64_list=tf.train.Int64List(value=
            np.random.random_integers(0

Unable to read from Tensorflow tfrecord file

試著忘記壹切 submitted on 2019-12-11 12:15:09
Question: I am able to create the tfrecords file with the code below.

    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def convert_to_tfrecord(images,labels,file_name):
        # images is a numpy array of shape (num_images,channel,rows,column)
        # labels is a numpy array of shape (num_images,)
        num_labels = np.shape(labels)
        (num_images,depth,rows,cols) = np.shape

Split .tfrecords file into many .tfrecords files

ぐ巨炮叔叔 submitted on 2019-12-10 20:15:58
Question: Is there any way to split a .tfrecords file into many .tfrecords files directly, without writing back each Dataset example?
Answer 1: You can use a function like this:

    import tensorflow as tf

    def split_tfrecord(tfrecord_path, split_size):
        with tf.Graph().as_default(), tf.Session() as sess:
            ds = tf.data.TFRecordDataset(tfrecord_path).batch(split_size)
            batch = ds.make_one_shot_iterator().get_next()
            part_num = 0
            while True:
                try:
                    records = sess.run(batch)
                    part_path = tfrecord_path + '.{:03d}'.format
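Another way to think about "splitting directly" is to copy records at the file-format level, without decoding them. The sketch below is not the method from the answer above. It assumes an uncompressed TFRecord file, where each record is framed as an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. The name split_tfrecord_raw and its parameters are made up for illustration, and the CRCs are copied through without being verified.

```python
import struct


def split_tfrecord_raw(path, records_per_part):
    """Copy framed records from `path` into numbered part files without parsing them.

    Hypothetical helper: assumes an uncompressed TFRecord file and does not
    check the CRC fields, it only uses the length prefix to find record ends.
    """
    part_paths = []
    out = None
    count = 0
    with open(path, "rb") as src:
        while True:
            # 8-byte little-endian payload length + 4-byte CRC of that length.
            header = src.read(12)
            if len(header) < 12:
                break  # clean end of file, or a truncated trailing record
            (length,) = struct.unpack("<Q", header[:8])
            # Payload followed by its 4-byte CRC.
            body = src.read(length + 4)
            if len(body) < length + 4:
                break  # truncated record: stop rather than write a broken frame
            if count % records_per_part == 0:
                # Start a new part file, named like the original answer's parts.
                if out is not None:
                    out.close()
                part_path = "{}.{:03d}".format(path, len(part_paths))
                part_paths.append(part_path)
                out = open(part_path, "wb")
            out.write(header + body)
            count += 1
    if out is not None:
        out.close()
    return part_paths
```

Calling split_tfrecord_raw("data.tfrecords", 1000) would write data.tfrecords.000, data.tfrecords.001, and so on, each holding up to 1000 records. Because the bytes are copied unchanged, the parts should be readable by the same reader as the source file.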

Numpy array to TFrecord

烂漫一生 submitted on 2019-12-10 14:54:17
Question: I'm trying to train on a custom dataset through the TensorFlow Object Detection API. The dataset contains 40k training images and labels, which are in numpy ndarray format (uint8). The training data has 2 dimensions ([40000,23456]) and the labels have 1 dimension ([0..., 3]). I want to generate a tfrecord for this dataset. How do I do that? I'm quite new to TensorFlow.
Answer 1: This tutorial will walk you through the process of creating TFRecords from your data: https://medium.com/mostly-ai/tensorflow-records-what-they-are-and
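As a rough illustration of the generic mechanics (not the linked tutorial's code), here is a minimal sketch that writes each row of a uint8 array and its label as one tf.train.Example and reads it back. It assumes TensorFlow 2.x with eager execution. The feature keys "image_raw" and "label" and both function names are made up for this example; the Object Detection API expects its own set of feature keys, which this sketch does not try to reproduce.

```python
import numpy as np
import tensorflow as tf


def write_arrays_to_tfrecord(images, labels, path):
    """Write each (row, label) pair as one serialized tf.train.Example."""
    with tf.io.TFRecordWriter(path) as writer:
        for row, label in zip(images, labels):
            feature = {
                # Raw uint8 bytes of one flattened image row.
                "image_raw": tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[row.tobytes()])),
                "label": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[int(label)])),
            }
            example = tf.train.Example(features=tf.train.Features(feature=feature))
            writer.write(example.SerializeToString())


def parse_example(serialized, row_length):
    """Inverse of the writer: returns (uint8 vector of row_length, int64 label)."""
    spec = {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    image = tf.reshape(tf.io.decode_raw(parsed["image_raw"], tf.uint8), [row_length])
    return image, parsed["label"]
```

Reading back would then look like tf.data.TFRecordDataset(path).map(lambda s: parse_example(s, 23456)) for rows of the size mentioned in the question.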

How to decode Unicode string in Tensorflow's graph pipeline

左心房为你撑大大i submitted on 2019-12-10 13:35:33
Question: I have created a TFRecord file to store data. I have to store Hindi text, so I saved it as bytes using string.encode('utf-8'). But I am stuck when reading the data back. I am reading the data with the TensorFlow Dataset API. I know that I can decode it using string.decode('utf-8'), but this is not what I am looking for. I want a solution that lets me decode my byte string back to a Unicode string inside the graph only. I have tried as_text, decoding_raw but they are giving
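For what it's worth, a tf.string tensor simply holds bytes, so UTF-8 text read from a TFRecord is still valid UTF-8 inside the graph; explicit decoding is only needed for character-level work. A minimal sketch using the tf.strings Unicode ops, assuming TensorFlow 2.x with eager execution (these ops were added after the TensorFlow versions current when many such questions were asked, so they may not exist in the asker's version). The sample word is an arbitrary choice for illustration.

```python
import tensorflow as tf

text = "नमस्ते"  # arbitrary Hindi sample text, not from the question
encoded = tf.constant(text.encode("utf-8"))  # scalar tf.string holding UTF-8 bytes

# Unicode code points as an int32 vector, computed by a TensorFlow op.
code_points = tf.strings.unicode_decode(encoded, input_encoding="UTF-8")

# One tf.string element per character (each still UTF-8 encoded).
chars = tf.strings.unicode_split(encoded, input_encoding="UTF-8")

# Length in characters rather than bytes.
num_chars = tf.strings.length(encoded, unit="UTF8_CHAR")

# Code points back to a UTF-8 byte string.
round_trip = tf.strings.unicode_encode(code_points, output_encoding="UTF-8")
```

The same calls can be used inside a function passed to Dataset.map, which keeps the decoding in the input pipeline instead of in Python.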

Shuffling tfrecords files

强颜欢笑 submitted on 2019-12-06 01:05:34
I have 5 tfrecords files, one for each object. While training I want to read data equally from all 5 tfrecords files, i.e. if my batch size is 50, I should get 10 samples from the 1st tfrecords file, 10 samples from the second tfrecords file and so on. Currently, it just reads sequentially from the files, i.e. I get 50 samples from the same record. Is there a way to sample from different tfrecords files?
I advise you to read the tutorial by @mrry on tf.data. On slide 42 he explains how to use tf.data.Dataset.interleave() to read multiple tfrecord files at the same time. For instance if you
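A minimal sketch of the interleave idea, assuming TensorFlow 2.x with eager execution (the question predates it) and not taken from the slides mentioned above. cycle_length is set to the number of files and block_length to each file's share of the batch, so a batch of 50 drawn from 5 files holds 10 consecutive records from each, as long as every file still has records left. The helper name balanced_dataset is hypothetical.

```python
import tensorflow as tf


def balanced_dataset(file_paths, per_file=10):
    """Batches that contain `per_file` serialized records from each input file."""
    files = tf.data.Dataset.from_tensor_slices(file_paths)
    records = files.interleave(
        lambda path: tf.data.TFRecordDataset(path),
        cycle_length=len(file_paths),  # keep every file open at the same time
        block_length=per_file,         # take this many records before switching files
    )
    # One full cycle over the files is exactly one batch.
    return records.batch(per_file * len(file_paths))
```

With 5 files and per_file=10 this yields batches of 50 serialized records. In practice you would probably also shuffle and repeat each per-file dataset inside the lambda, so that files of different sizes do not run out at different times.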

tensorflow ValueError: features should be a dictionary of `Tensor`s. Given type: <class 'tensorflow.python.framework.ops.Tensor'>

依然范特西╮ submitted on 2019-12-03 13:55:15
This is my code. My TensorFlow version is 1.6.0 and my Python version is 3.6.4. If I use a dataset to read the CSV file directly, training works without errors. But when I convert the CSV file to a tfrecords file, it fails. I searched online and most people say TensorFlow should be updated, but that doesn't work for me.

    import tensorflow as tf

    tf.logging.set_verbosity(tf.logging.INFO)

    feature_names = [ 'SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth' ]

    def my_input_fn(is_shuffle=False, repeat_count=1):
        dataset = tf.data.TFRecordDataset(['csv.tfrecords'])  # filename is a list

        def parser(record):
            keys_to

Best way to process terabytes of data on gcloud ml-engine with keras

送分小仙女□ submitted on 2019-12-01 10:59:37
I want to train a model on about 2TB of image data on gcloud storage. I saved the image data as separate tfrecords and tried to use the TensorFlow data API following this example: https://medium.com/@moritzkrger/speeding-up-keras-with-tfrecord-datasets-5464f9836c36 But it seems that keras' model.fit(...) doesn't support validation for tfrecord datasets, based on https://github.com/keras-team/keras/pull/8388 Is there a better approach for processing large amounts of data with keras from ml-engine that I'm missing? Thanks a lot!
If you are willing to use tf.keras instead of actual Keras, you can

Best way to process terabytes of data on gcloud ml-engine with keras

别等时光非礼了梦想. submitted on 2019-12-01 09:22:43
Question: I want to train a model on about 2TB of image data on gcloud storage. I saved the image data as separate tfrecords and tried to use the TensorFlow data API following this example: https://medium.com/@moritzkrger/speeding-up-keras-with-tfrecord-datasets-5464f9836c36 But it seems that keras' model.fit(...) doesn't support validation for tfrecord datasets, based on https://github.com/keras-team/keras/pull/8388 Is there a better approach for processing large amounts of data with keras from ml

TensorFlow strings: what they are and how to work with them

与世无争的帅哥 submitted on 2019-11-30 17:49:33
When I read a file with tf.read_file, I get something of type tf.string. The documentation says only that it is "Variable length byte arrays. Each element of a Tensor is a byte array." (https://www.tensorflow.org/versions/r0.10/resources/dims_types.html). I have no idea how to interpret this, and I can't do anything with this type. In ordinary Python you can get elements by index, like my_string[:4], but when I run the following code I get an error.

    import tensorflow as tf
    import numpy as np

    x = tf.constant("This is string")
    y = x[:4]
    init = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init)
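The slice fails because x is a scalar (rank-0) tensor: indexing applies to tensor dimensions, not to the bytes inside a string element. A minimal sketch of the substring operation instead, assuming TensorFlow 2.x with eager execution; in the 1.x versions this question is about, the same op would be evaluated with sess.run, and older releases expose it as tf.substr.

```python
import tensorflow as tf

x = tf.constant("This is string")  # scalar tensor of dtype tf.string

# Byte-wise substring: start at byte 0, take 4 bytes.
first_four = tf.strings.substr(x, pos=0, len=4)

# Length of the string element in bytes.
n_bytes = tf.strings.length(x)

# .numpy() returns a Python bytes object, which can then be decoded as usual.
as_python_str = first_four.numpy().decode("utf-8")
```

No variables are involved here, so no initializer call is needed.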