Question
I am trying to train and make predictions with an LSTM model using tf.keras. I have written code in two different files: LSTMTraining.py, which trains the Keras model and saves it to a file, and Predict.py, which is supposed to load the Keras model and use it to make predictions. For some reason, when I load the model in Predict.py, it starts training, even though I have not called model.fit() in that file. Why is this happening?
I have saved the model in several different file formats. For example, I've tried saving the model's architecture to a JSON file (using model.to_json()) and saving the weights separately, then loading both files and combining them. I've also tried saving everything into one file (using model.save()) and loading that.
Creating and Training Model in LSTMTraining.py (Note: the log_similarity_loss was just a custom loss function I created for the model):
# Machine learning
import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
# Load/save data
import pickle
import os
# Shuffling
from sklearn.utils import shuffle
# Parameters
epochs = 5
display_step = 1000
n_input = 5
wordvec_len = 5
n_hidden = 512
recurrent_dropout = 0
dropout = 0
# Load data
with open("Vectorized_Word_By_Word.txt", "rb") as data:
    vectorized_txt = pickle.load(data)
# Prepare data into format for training (x: [prev-words], y: [next-word])
x_train, y_train = [], []
for n in range(0, len(vectorized_txt) - n_input - 1):
    prev_words = vectorized_txt[n: n + n_input]
    next_word = vectorized_txt[n + n_input]  # word immediately after the window
    x_train.append(prev_words)
    y_train.append(next_word)
x_train, y_train = np.array(x_train), np.array(y_train)
x_train, y_train = shuffle(x_train, y_train, random_state=0)
def log_similarity_loss(y_actual, y_pred):
    """Log similarity loss calculation."""
    cos_similarity = tf.keras.losses.CosineSimilarity(axis=0)(y_actual, y_pred)
    scaled_similarity = tf.add(tf.multiply(0.5, cos_similarity), 0.5)
    return -0.5*tf.math.log(scaled_similarity)
log_similarity_loss(
    [0.05, 0.01, 0.05, 1.2], [0.05, -0.01, 0.05, -1.2])
model = tf.keras.Sequential([
    layers.LSTM(n_hidden, input_shape=(n_input, wordvec_len),
                dropout=dropout, recurrent_dropout=recurrent_dropout,
                return_sequences=True),
    layers.LSTM(n_hidden, dropout=dropout,
                recurrent_dropout=recurrent_dropout),
    layers.Dense(wordvec_len)
])
model.compile(loss=log_similarity_loss,
              optimizer='adam', metrics=['cosine_proximity'])
model.fit(x_train, y_train, epochs=epochs, batch_size=12)
model.save("Keras_Model.h5", include_optimizer=True, save_format='h5')
# Save model weights and architecture
model.save_weights('model_weights.h5')
with open("model_architecture.json", "w") as json_file:
    json_file.write(model.to_json())
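To make the two hand-written pieces of the script above easier to check in isolation, here is a minimal, dependency-free sketch of the windowing step and of the loss formula. The names make_windows, cosine and log_similarity are hypothetical and not part of the original script. The sketch assumes the target is the item immediately after each window, and that the similarity s is the plain cosine similarity in [-1, 1]; the value returned by tf.keras.losses.CosineSimilarity may differ in sign depending on the TensorFlow version, so that should be checked separately.

```python
import math


def make_windows(sequence, window):
    """Pair each run of `window` items with the item that follows it."""
    inputs, targets = [], []
    for start in range(len(sequence) - window):
        inputs.append(sequence[start:start + window])
        targets.append(sequence[start + window])
    return inputs, targets


def cosine(u, v):
    """Plain cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def log_similarity(u, v):
    """-0.5 * log(0.5 * s + 0.5), with s the plain cosine similarity (assumed)."""
    scaled = 0.5 * cosine(u, v) + 0.5
    return -0.5 * math.log(scaled)


# Toy data: integers stand in for word vectors.
inputs, targets = make_windows(list(range(8)), window=5)
print(inputs)
print(targets)

# The same vectors used in the test call of the original script.
same = log_similarity([0.05, 0.01, 0.05, 1.2], [0.05, 0.01, 0.05, 1.2])
opposed = log_similarity([0.05, 0.01, 0.05, 1.2], [0.05, -0.01, 0.05, -1.2])
print(same, opposed)
```

Under these assumptions, identical vectors give a loss close to zero and nearly opposite vectors give a much larger one, which is a quick sanity check before wiring the formula into Keras.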
Loading in the model in Predict.py (Note: All the functions imported from "WordModel.py" are just functions for text processing I've written that are unrelated to Keras):
from WordModel import word_by_word, word_to_vec, vec_to_word
import gensim
import tensorflow as tf
from tensorflow.keras.models import load_model, model_from_json
with open('model_architecture.json', 'r') as json_file:
    model_json = json_file.read()
keras_model = model_from_json(model_json)
keras_model.load_weights("model_weights.h5")
I was expecting no output, just the model to be loaded. However, I got the verbose training output of the model like so (when running Predict.py):
12/1212 [..............................] - ETA: 3:32 - loss: 0.2656 - cosine_proximity: 0.0420
24/1212 [..............................] - ETA: 1:55 - loss: 0.2712 - cosine_proximity: 0.2066
36/1212 [..............................] - ETA: 1:24 - loss: 0.2703 - cosine_proximity: 0.2294
48/1212 [>.............................] - ETA: 1:08 - loss: 0.2394 - cosine_proximity: 0.2690
60/1212 [>.............................] - ETA: 58s - loss: 0.2286 - cosine_proximity: 0.2874
72/1212 [>.............................] - ETA: 52s - loss: 0.2247 - cosine_proximity: 0.2750
84/1212 [=>............................] - ETA: 47s - loss: 0.2115 - cosine_proximity: 0.2924
and so on.
Note that I have not issued any training command in Predict.py. I have rerun the code multiple times and made sure that I was running the correct file. Still, nothing seems to work.
Thanks for the help!
Answer 1:
The problem is likely with your VSCode setup, which needs additional configuration to work with both Python and its packages. When you run one script, you may be running all the scripts, hence the behavior you see. A solution I'd recommend is switching to Spyder and installing your packages with Anaconda. Once you've installed both, search for "anaconda command prompt" or "anaconda powershell" on your PC, and in the terminal, type:
conda update conda
conda update --all
conda install numpy # optional (sort of)
conda install matplotlib # optional (sort of)
# SEE BELOW
conda install -c conda-forge keras
conda update --all # final 'cleanup' command - ensures package compatibility
If you plan on using a GPU (highly recommended), you'll first need to download CUDA - instructions here (get CUDA 10 instead of 9 in the article). Then run conda install tensorflow-gpu as in the article.
Then, in Spyder: Tools -> Preferences -> PYTHONPATH manager -> add all folders of the modules/data you plan to use, so you don't have to %cd each time or worry about relative paths and can import directly. Lastly, make sure Anaconda & Spyder use the right Python interpreter.
Restart Spyder, run scripts - assuming no bugs, all should be well.
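Independently of the IDE, one thing worth ruling out is that running one script pulls in another script's code through an import, since Python executes a module's top-level statements whenever that module is imported. The sketch below is only an illustration of that behavior: training_demo, SCRIPT and import_and_capture are hypothetical names, and the stand-in script is not the asker's LSTMTraining.py.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a training script: one statement at module level,
# one call protected by the __main__ guard.
SCRIPT = '''
print("top-level code ran")

def train():
    print("training...")

if __name__ == "__main__":
    train()
'''


def import_and_capture(module_name, source):
    """Write `source` to a temporary module, import it, return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, module_name + ".py").write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                importlib.import_module(module_name)
        finally:
            # Undo the temporary changes to the import system.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
        return buffer.getvalue()


output = import_and_capture("training_demo", SCRIPT)
print(output)
```

Here the unguarded print runs on import while the guarded train() call does not. If something similar applies to your files, moving the model.fit() call under an if __name__ == "__main__": guard would keep it from running when the file is imported.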
Source: https://stackoverflow.com/questions/58053509/why-does-my-keras-model-train-after-i-load-it-even-though-i-have-not-actually-s