Question
I wrote a NN model that analyzes an image and outputs 8 floating-point numbers at the end. The model works fine (but slowly) on my computer, so I tried it on a Cloud TPU, and there, BAM! I get an error:
I1008 12:58:47.077905 140221679261440 tf_logging.py:115] Error recorded from training_loop: File system scheme '[local]' not implemented (file: '/home/gcloud_iba/Data/CGTR/model/GA_subset/model.ckpt-0_temp_e840841d93124a67b54074b1c0fd7ae4') [[{{node save/SaveV2}} = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], _device="/job:worker/replica:0/task:0/device:CPU:0"](save/ShardedFilename, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, batch_normalization/beta/Read/ReadVariableOp, batch_normalization/beta/Momentum/Read_1/ReadVariableOp, batch_normalization/gamma/Read/ReadVariableOp, batch_normalization/gamma/Momentum/Read_1/ReadVariableOp, batch_normalization/moving_mean/Read/ReadVariableOp, batch_normalization/moving_variance/Read/ReadVariableOp, batch_normalization_1/beta/Read/ReadVariableOp, batch_normalization_1/beta/Momentum/Read_1/ReadVariableOp, batch_normalization_1/gamma/Read/ReadVariableOp, batch_normalization_1/gamma/Momentum/Read_1/ReadVariableOp, batch_normalization_1/moving_mean/Read/ReadVariableOp, batch_normalization_1/moving_variance/Read/ReadVariableOp, conv2d/kernel/Read/ReadVariableOp, conv2d/kernel/Momentum/Read_1/ReadVariableOp, conv2d_1/kernel/Read/ReadVariableOp, conv2d_1/kernel/Momentum/Read_1/ReadVariableOp, conv2d_2/kernel/Read/ReadVariableOp, conv2d_2/kernel/Momentum/Read_1/ReadVariableOp, conv2d_3/kernel/Read/ReadVariableOp, conv2d_3/kernel/Momentum/Read_1/ReadVariableOp, conv2d_4/kernel/Read/ReadVariableOp, conv2d_4/kernel/Momentum/Read_1/ReadVariableOp, conv2d_5/kernel/Read/ReadVariableOp, conv2d_5/kernel/Momentum/Read_1/ReadVariableOp, conv2d_6/kernel/Read/ReadVariableOp, conv2d_6/kernel/Momentum/Read_1/ReadVariableOp, conv2d_7/kernel/Read/ReadVariableOp, conv2d_7/kernel/Momentum/Read_1/ReadVariableOp, conv2d_8/kernel/Read/ReadVariableOp, conv2d_8/kernel/Momentum/Read_1/ReadVariableOp, conv2d_9/kernel/Read/ReadVariableOp, 
conv2d_9/kernel/Momentum/Read_1/ReadVariableOp, dense/bias/Read/ReadVariableOp, dense/bias/Momentum/Read_1/ReadVariableOp, dense/kernel/Read/ReadVariableOp, dense/kernel/Momentum/Read_1/ReadVariableOp, dense_1/bias/Read/ReadVariableOp, dense_1/bias/Momentum/Read_1/ReadVariableOp, dense_1/kernel/Read/ReadVariableOp, dense_1/kernel/Momentum/Read_1/ReadVariableOp, dense_2/bias/Read/ReadVariableOp, dense_2/bias/Momentum/Read_1/ReadVariableOp, dense_2/kernel/Read/ReadVariableOp, dense_2/kernel/Momentum/Read_1/ReadVariableOp, dense_3/bias/Read/ReadVariableOp, dense_3/bias/Momentum/Read_1/ReadVariableOp, dense_3/kernel/Read/ReadVariableOp, dense_3/kernel/Momentum/Read_1/ReadVariableOp, global_step/Read/ReadVariableOp)]]
I checked that the TPU has access to the hard drive, and it does (I have another piece of code that reads the same dataset with another model). I do not cache my data (yet), but I do some prefetching. Aside from this, I don't see what isn't working.
Thank you for any help you could provide!
Pi-r
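Answer 1:
The local filesystem is not available on Cloud TPUs. Model directories (checkpoints, etc.) and input data should be stored in Google Cloud Storage (and prefixed with "gs://"). In your log, the failing save/SaveV2 op is placed on /job:worker/replica:0/task:0, i.e. on the TPU worker, which is why the checkpoint path under /home/gcloud_iba/ cannot be written.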
More details here: https://cloud.google.com/tpu/docs/storage-buckets
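As a rough illustration of the point above, here is a minimal sketch of a guard you could run before training starts, so that a local model directory is rejected early instead of failing inside the save op. The helper names (`is_gcs_path`, `require_gcs_path`) and the bucket name `my-bucket` are hypothetical and not part of the original question; substitute your own bucket.

```python
from urllib.parse import urlparse


def is_gcs_path(path: str) -> bool:
    """Return True if `path` uses the gs:// scheme (Google Cloud Storage)."""
    return urlparse(path).scheme == "gs"


def require_gcs_path(path: str, what: str = "model_dir") -> str:
    """Return `path` unchanged, or raise if it is not a gs:// path."""
    if not is_gcs_path(path):
        raise ValueError(
            "{} must be a gs:// path when training on a Cloud TPU, got: {!r}".format(
                what, path
            )
        )
    return path


# Hypothetical bucket name: replace "my-bucket" with a bucket you own.
MODEL_DIR = require_gcs_path("gs://my-bucket/CGTR/model/GA_subset")

# The path from the error message is a local path, so the guard rejects it.
LOCAL_DIR = "/home/gcloud_iba/Data/CGTR/model/GA_subset"
print(is_gcs_path(MODEL_DIR), is_gcs_path(LOCAL_DIR))
```

You would then pass `MODEL_DIR` as whatever model directory your estimator or run configuration expects, and apply the same check to the input data paths.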
Answer 2:
In the absence of Google Cloud Storage, write your model using the Keras API (https://keras.io/).
Source: https://stackoverflow.com/questions/52703047/tpu-local-filesystem-doesnt-exist