Question
What can cause the loss from model.get_latest_training_loss() to increase on each epoch?
Code used for training:
import os
import multiprocessing

from gensim.models import Word2Vec
from gensim.models.callbacks import CallbackAny2Vec

class EpochSaver(CallbackAny2Vec):
    '''Callback to save the model after each epoch and show training parameters.'''
    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        os.makedirs(self.savedir, exist_ok=True)

    def on_epoch_end(self, model):
        savepath = os.path.join(self.savedir, "model_neg{}_epoch.gz".format(self.epoch))
        model.save(savepath)
        print(
            "Epoch saved: {}".format(self.epoch + 1),
            "Start next epoch ...", sep="\n"
        )
        prev_path = os.path.join(self.savedir, "model_neg{}_epoch.gz".format(self.epoch - 1))
        if os.path.isfile(prev_path):
            print("Previous model deleted")
            os.remove(prev_path)
        self.epoch += 1
        print("Model loss:", model.get_latest_training_loss())

def train():
    ### Initialize model ###
    print("Start training Word2Vec model")
    workers = multiprocessing.cpu_count() // 2  # workers must be an int
    model = Word2Vec(
        DocIter(),
        size=300, alpha=0.03, min_alpha=0.00025, iter=20,
        min_count=10, hs=0, negative=10, workers=workers,
        window=10, callbacks=[EpochSaver("./checkpoints")],
        compute_loss=True
    )
Output:
Losses from epochs (1 to 20):
Model loss: 745896.8125
Model loss: 1403872.0
Model loss: 2022238.875
Model loss: 2552509.0
Model loss: 3065454.0
Model loss: 3549122.0
Model loss: 4096209.75
Model loss: 4615430.0
Model loss: 5103492.5
Model loss: 5570137.5
Model loss: 5955891.0
Model loss: 6395258.0
Model loss: 6845765.0
Model loss: 7260698.5
Model loss: 7712688.0
Model loss: 8144109.5
Model loss: 8542560.0
Model loss: 8903244.0
Model loss: 9280568.0
Model loss: 9676936.0
What am I doing wrong?
The language is Arabic. DocIter yields lists of tokens as input.
Answer 1:
Up through gensim 3.6.0, the loss value reported may not be very sensible: the tally is only reset on each call to train(), rather than on each internal epoch, so the printed value is cumulative across epochs. There are some fixes forthcoming in this issue:
https://github.com/RaRe-Technologies/gensim/pull/2135
In the meantime, the difference between the previous value and the latest may be more meaningful. In that case, your data suggest the 1st epoch had a total loss of 745896, while the last had (9676936-9280568=) 396368 – which may indicate the kind of progress hoped for.
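To illustrate, under this interpretation the per-epoch losses can be recovered by differencing the consecutive cumulative values printed in the question (a minimal sketch, assuming the tally only resets per train() call):

    # Cumulative loss values as printed in the question (gensim <= 3.6.0)
    cumulative = [
        745896.8125, 1403872.0, 2022238.875, 2552509.0, 3065454.0,
        3549122.0, 4096209.75, 4615430.0, 5103492.5, 5570137.5,
        5955891.0, 6395258.0, 6845765.0, 7260698.5, 7712688.0,
        8144109.5, 8542560.0, 8903244.0, 9280568.0, 9676936.0,
    ]

    # Per-epoch loss = difference between consecutive cumulative values;
    # the first epoch's loss is the first value itself
    per_epoch = [cumulative[0]] + [
        curr - prev for prev, curr in zip(cumulative, cumulative[1:])
    ]

    for i, loss in enumerate(per_epoch, start=1):
        print("Epoch {}: {}".format(i, loss))

The differenced values trend downward (from about 746k to about 396k), which is the hoped-for behavior hidden by the cumulative reporting.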
Answer 2:
As gojomo proposed, you can calculate the loss difference in the callback function:
from gensim.models.callbacks import CallbackAny2Vec
from gensim.models import Word2Vec

# init callback class
class callback(CallbackAny2Vec):
    """Callback to print loss after each epoch."""
    def __init__(self):
        self.epoch = 0

    def on_epoch_end(self, model):
        loss = model.get_latest_training_loss()
        if self.epoch == 0:
            print('Loss after epoch {}: {}'.format(self.epoch, loss))
        else:
            print('Loss after epoch {}: {}'.format(self.epoch, loss - self.loss_previous_step))
        self.epoch += 1
        self.loss_previous_step = loss
For training your model, add compute_loss=True and callbacks=[callback()] to the word2vec train method:
# init word2vec class
w2v_model = Word2Vec(min_count=20,
                     window=12,
                     size=100,
                     workers=2)
# build vocab
w2v_model.build_vocab(sentences)

# train the w2v model
w2v_model.train(sentences,
                total_examples=w2v_model.corpus_count,
                epochs=10,
                report_delay=1,
                compute_loss=True,  # set compute_loss = True
                callbacks=[callback()])  # add the callback class

# save the word2vec model
w2v_model.save('word2vec.model')
This will output something like this:
Loss after epoch 0: 4448638.5
Loss after epoch 1: 3283735.5
Loss after epoch 2: 2826198.0
Loss after epoch 3: 2680974.0
Loss after epoch 4: 2601113.0
Loss after epoch 5: 2271333.0
Loss after epoch 6: 2052050.0
Loss after epoch 7: 2011768.0
Loss after epoch 8: 1927454.0
Loss after epoch 9: 1887798.0
Source: https://stackoverflow.com/questions/52038651/loss-does-not-decrease-during-training-word2vec-gensim