
tensorflow: Reading time series data from TFRecord

[亡魂溺海] submitted on 2019-12-11 18:17:24
Question: I'm using a SequenceExample protobuf to read and write time-series data to a TFRecord file. I serialized a pair of np arrays as follows:

    writer = tf.python_io.TFRecordWriter(file_name)
    context = tf.train.Features( ... Feature( ... ) ... )
    feature_data = tf.train.FeatureList(feature=[
        tf.train.Feature(float_list=tf.train.FloatList(value=
            np.random.normal(size=([4065000,]))))])
    labels = tf.train.FeatureList(feature=[
        tf.train.Feature(int64_list=tf.train.Int64List(value=
            np.random.random_integers(0

Unable to read from Tensorflow tfrecord file

試著忘記壹切 submitted on 2019-12-11 12:15:09
Question: I am able to create the tfrecords file using the code below.

    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def convert_to_tfrecord(images, labels, file_name):
        # images is a numpy array of shape (num_images, channel, rows, column)
        # labels is a numpy array of shape (num_images,)
        num_labels = np.shape(labels)
        (num_images, depth, rows, cols) = np.shape

Split .tfrecords file into many .tfrecords files

ぐ巨炮叔叔 submitted on 2019-12-10 20:15:58
Question: Is there any way to split a .tfrecords file into many .tfrecords files directly, without writing back each Dataset example?

Answer 1: You can use a function like this:

    import tensorflow as tf

    def split_tfrecord(tfrecord_path, split_size):
        with tf.Graph().as_default(), tf.Session() as sess:
            ds =
            batch = ds.make_one_shot_iterator().get_next()
            part_num = 0
            while True:
                try:
                    records =
                    part_path = tfrecord_path + '.{:03d}'.format
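As a complement to the session-based approach, a .tfrecords file can also be split at the byte level, without decoding any example. This is a minimal sketch, not the answer author's method. It relies on the TFRecord framing, in which each record is an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. The function name `split_tfrecord_raw` is hypothetical, and the sketch assumes an uncompressed file and does not verify the CRCs; it copies each frame unchanged.

```python
import struct


def split_tfrecord_raw(tfrecord_path, split_size):
    """Copy records into part files holding at most split_size records each.

    Assumes an uncompressed TFRecord file. CRC fields are copied, not checked.
    Returns the list of part file paths, named <path>.000, <path>.001, ...
    """
    part_paths = []
    out = None
    count = 0
    try:
        with open(tfrecord_path, "rb") as src:
            while True:
                header = src.read(8)  # uint64 payload length, little-endian
                if not header:
                    break  # clean end of file
                if len(header) < 8:
                    raise ValueError("truncated record header")
                (length,) = struct.unpack("<Q", header)
                # length CRC (4 bytes) + payload + payload CRC (4 bytes)
                rest = src.read(4 + length + 4)
                if len(rest) < length + 8:
                    raise ValueError("truncated record")
                if count % split_size == 0:
                    # Start a new part file every split_size records.
                    if out is not None:
                        out.close()
                    part_path = "{}.{:03d}".format(tfrecord_path, len(part_paths))
                    part_paths.append(part_path)
                    out = open(part_path, "wb")
                out.write(header + rest)
                count += 1
    finally:
        if out is not None:
            out.close()
    return part_paths
```

Because every frame is copied byte for byte, concatenating the part files in order reproduces the original file, and each part should remain readable by any TFRecord reader.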

Numpy array to TFrecord

烂漫一生 submitted on 2019-12-10 14:54:17
Question: I'm trying to train on a custom dataset with the TensorFlow Object Detection API. The dataset contains 40k training images and labels, both stored as numpy ndarrays ( uint8 ). The training data is 2-D (shape [40000, 23456]) and the labels are 1-D (values [0, ..., 3]). I want to generate a TFRecord for this dataset. How do I do that? I'm quite new to TensorFlow.

Answer 1: This tutorial will walk you through the process of creating TFRecords from your data:

How to decode Unicode string in Tensorflow's graph pipeline

左心房为你撑大大i submitted on 2019-12-10 13:35:33
Question: I have created a TFRecord file to store data. I have to store Hindi text, so I saved it as bytes using string.encode('utf-8'). But I am stuck when reading the data back. I am reading the data with the TensorFlow Dataset API. I know that I can decode it using string.decode('utf-8'), but that is not what I am looking for. I want a solution that decodes my byte string back to a Unicode string inside the graph itself. I have tried as_text, decoding_raw but they are giving

Shuffling tfrecords files

强颜欢笑 submitted on 2019-12-06 01:05:34
I have 5 tfrecords files, one for each object. While training I want to read data equally from all 5 tfrecords files, i.e. if my batch size is 50, I should get 10 samples from the first tfrecords file, 10 samples from the second tfrecords file, and so on. Currently, it just reads sequentially from the files, i.e. I get 50 samples from the same file. Is there a way to sample from different tfrecords files?

I advise you to read the tutorial by @mrry on . On slide 42 he explains how to use to read multiple tfrecord files at the same time. For instance if you
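The sampling pattern asked for here, a fixed number of samples from each file in turn, can be sketched in plain Python. This illustrates the round-robin logic only and is not TensorFlow code. The function name `interleave_blocks` is hypothetical. The API name in the answer is missing from the source; `tf.data.Dataset.interleave`, which takes `cycle_length` and `block_length` arguments, follows a similar pattern, but whether it is the API the answer meant is an assumption.

```python
from itertools import islice


def interleave_blocks(sources, block_length):
    """Yield block_length items from each source in turn, round-robin.

    Stops as soon as any source cannot supply a full block, so every
    emitted cycle holds the same number of items from every source.
    """
    iterators = [iter(s) for s in sources]
    while True:
        cycle = []
        for it in iterators:
            block = list(islice(it, block_length))
            if len(block) < block_length:
                return  # a source ran dry; drop the incomplete cycle
            cycle.extend(block)
        yield from cycle


# Five stand-in "files", each a list of (file_index, sample_index) pairs.
files = [[(f, i) for i in range(25)] for f in range(5)]
stream = interleave_blocks(files, block_length=10)
batch = list(islice(stream, 50))  # 10 samples from each of the 5 files
```

With 5 sources and a block length of 10, every consecutive run of 50 items contains exactly 10 samples per file, which matches the batch composition described in the question. Shuffling within each source would have to happen before the interleaving step.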

tensorflow ValueError: features should be a dictionary of `Tensor`s. Given type: <class 'tensorflow.python.framework.ops.Tensor'>

依然范特西╮ submitted on 2019-12-03 13:55:15
This is my code. My TensorFlow version is 1.6.0 and my Python version is 3.6.4. If I use a dataset to read the CSV file directly, training works without errors. But after I convert the CSV file to a tfrecords file, it fails. I searched online and most people say TensorFlow should be updated, but that doesn't work for me.

    import tensorflow as tf
    tf.logging.set_verbosity(tf.logging.INFO)
    feature_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']

    def my_input_fn(is_shuffle=False, repeat_count=1):
        dataset =['csv.tfrecords'])  # filename is a list
        def parser(record):
            keys_to

Best way to process terabytes of data on gcloud ml-engine with keras

送分小仙女□ submitted on 2019-12-01 10:59:37
I want to train a model on about 2TB of image data on gcloud storage. I saved the image data as separate tfrecords and tried to use the tensorflow data api following this example But it seems like keras' doesn't support validation for tfrecord datasets based on Is there a better approach for processing large amounts of data with keras from ml-engine that I'm missing? Thanks a lot!

If you are willing to use tf.keras instead of actual Keras, you can

Best way to process terabytes of data on gcloud ml-engine with keras

别等时光非礼了梦想. submitted on 2019-12-01 09:22:43
Question: I want to train a model on about 2TB of image data on gcloud storage. I saved the image data as separate tfrecords and tried to use the tensorflow data api following this example But it seems like keras' doesn't support validation for tfrecord datasets based on Is there a better approach for processing large amounts of data with keras from ml

TensorFlow strings: what they are and how to work with them

与世无争的帅哥 submitted on 2019-11-30 17:49:33
When I read a file with tf.read_file I get something of type tf.string . The documentation says only that it is "Variable length byte arrays. Each element of a Tensor is a byte array." I have no idea how to interpret this, and I can't do anything with this type. In ordinary Python you can get elements by index, like my_string[:4] , but when I run the following code I get an error.

    import tensorflow as tf
    import numpy as np
    x = tf.constant("This is string")
    y = x[:4]
    init = tf.initialize_all_variables()
    sess = tf.Session()
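One way to read the quoted documentation sentence is that each element of a tf.string tensor is a whole byte string, so indexing the tensor selects elements, not characters. The plain-Python analogy below is not TensorFlow code. It models a string tensor as a nested list of bytes objects and applies a substring operation to each element, which is roughly what TensorFlow 1.x's `tf.substr(input, pos, len)` op does. The helper name `substr` and the sample values are hypothetical.

```python
def substr(strings, pos, length):
    """Element-wise byte substring over a bytes object or nested lists of bytes."""
    if isinstance(strings, bytes):
        # A single element: slice the bytes themselves.
        return strings[pos:pos + length]
    # A "tensor" of strings: apply the same slice to every element.
    return [substr(s, pos, length) for s in strings]


scalar = b"This is string"        # analogue of a shape-() string tensor
vector = [b"first", b"second"]    # analogue of a shape-(2,) string tensor

head = substr(scalar, 0, 4)       # slices inside the single element
heads = substr(vector, 0, 3)      # slices inside each element
```

In this picture, x in the question is a scalar with no axes to slice, which is consistent with x[:4] raising an error; reaching inside the string needs an op that works on element contents.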