问题
I fine-tuned a BERT model from Tensorflow hub to build a simple sentiment analyzer. The model trains and runs fine. On export, I simply used:
tf.saved_model.save(model, export_dir='models')
And this works just fine.. until I reboot.
On a reboot, the model no longer loads. I've tried using a Keras loader as well as the Tensorflow Server, and I get the same error.
I get the following error message:
Not found: /tmp/tfhub_modules/09bd4e665682e6f03bc72fbcff7a68bf879910e/assets/vocab.txt; No such file or directory
The model is trying to load assets from the tfhub modules cache, which is wiped by reboot. I know I could persist the cache, but I don't want to do that because I want to be able to generate models and then copy them over to a separate application without worrying about the cache.
The crux of it is that I don't think it's necessary to look in the cache for the assets at all. The model was saved with an assets folder wherein vocab.txt
was generated, so in order to find the assets it just needs to look in its own assets folder (I think). However, it doesn't seem to be doing that.
Is there any way to change this behaviour?
Added the code for building and exporting the model (it's not a clever model, just prototyping my workflow):
bert_model_name = "bert_en_uncased_L-12_H-768_A-12"
BATCH_SIZE = 64
EPOCHS = 1 # Initial
def build_bert_model(bert_model_name):
input_layer = tf.keras.layers.Input(shape=(), dtype=tf.string, name="inputs")
preprocessing_layer = hub.KerasLayer(
map_model_to_preprocess[bert_model_name], name="preprocessing"
)
encoder_inputs = preprocessing_layer(input_layer)
bert_model = hub.KerasLayer(
map_name_to_handle[bert_model_name], name="BERT_encoder"
)
outputs = bert_model(encoder_inputs)
net = outputs["pooled_output"]
net = tf.keras.layers.Dropout(0.1)(net)
net = tf.keras.layers.Dense(1, activation=None, name="classifier")(net)
return tf.keras.Model(input_layer, net)
def main():
train_ds, val_ds = load_sentiment140(batch_size=BATCH_SIZE, epochs=EPOCHS)
steps_per_epoch = tf.data.experimental.cardinality(train_ds).numpy()
init_lr = 3e-5
optimizer = tf.keras.optimizers.Adam(learning_rate=init_lr)
model = build_bert_model(bert_model_name)
model.compile(optimizer=optimizer, loss='mse', metrics='mse')
model.fit(train_ds, validation_data=val_ds, steps_per_epoch=steps_per_epoch)
tf.saved_model.save(model, export_dir='models')
回答1:
This problem comes from a TensorFlow bug triggered by versions /1 and /2 of https://tfhub.dev/tensorflow/bert_en_uncased_preprocess. The updated models tensorflow/bert_*_preprocess/3 (released last Friday) avoid this bug. Please update to the newest version.
The Classify Text with BERT tutorial has been updated accordingly.
Thanks for bringing this up!
来源:https://stackoverflow.com/questions/65511729/tensorflow-savedmodel-ignoring-assets-file-on-load