Question
After creating a FastText model using Gensim, I want to load it but am running into errors seemingly related to callbacks.
The code used to create the model is
import gensim

TRAIN_EPOCHS = 30
WINDOW = 5
MIN_COUNT = 50
DIMS = 256

# model_input is the tokenized corpus (an iterable of token lists)
vocab_model = gensim.models.FastText(sentences=model_input,
                                     size=DIMS,
                                     window=WINDOW,
                                     iter=TRAIN_EPOCHS,
                                     workers=6,
                                     min_count=MIN_COUNT,
                                     callbacks=[EpochSaver("./ftchkpts/")])
vocab_model.save('ft_256_min_50_model_30eps')
and the callback EpochSaver is defined as
import os

from gensim.models.callbacks import CallbackAny2Vec

class EpochSaver(CallbackAny2Vec):
    '''Callback to save model after each epoch and show training parameters'''

    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        os.makedirs(self.savedir, exist_ok=True)

    def on_epoch_end(self, model):
        # Save a checkpoint for the epoch that just finished
        savepath = os.path.join(self.savedir, f"ft256_{self.epoch}e")
        model.save(savepath)
        print(f"Epoch saved: {self.epoch + 1}")
        # Keep only the newest checkpoint by removing the previous one
        if os.path.isfile(os.path.join(self.savedir, f"ft256_{self.epoch-1}e")):
            os.remove(os.path.join(self.savedir, f"ft256_{self.epoch-1}e"))
            print("Previous model deleted")
        self.epoch += 1
Aside from the type of model, this is identical to my process for Word2Vec, which worked without issue. However, when I open another file and try to load the model with
from gensim.models import FastText
vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')
I'm greeted with the error
AttributeError: Can't get attribute 'EpochSaver' on <module '__main__'>
What can I do to get the vocabulary to load so I can create the embedding layer for my Keras model? If it's relevant, this is happening in JupyterLab.
Answer 1:
This extra difficulty in loading models with custom callbacks is a known, open issue (at least through gensim-3.8.1 and October 2019).
The issue thread discusses possible workarounds and fixes, and the gensim team is considering disabling the auto-saving of callbacks altogether, which would require them to be re-specified for each later train() (or similar) call that needs them.
You may be able to load existing models saved with your custom callbacks by importing those same callback classes, under the same names, into the code context where you're doing a load().
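Why importing the class helps can be sketched with the standard library alone. gensim's save()/load() build on Python's pickle, which records a custom class only by module and name and looks that name up again at load time. The snippet below is a minimal illustration under assumptions, not gensim code: the module name `callbacks_demo`, the helper `make_module`, and the stripped-down `EpochSaver` are all hypothetical stand-ins.

```python
import pickle
import sys
import types

# Hypothetical source for the callback class (a stand-in, no gensim required).
CALLBACK_SOURCE = '''
class EpochSaver:
    def __init__(self, savedir):
        self.savedir = savedir
'''

def make_module(name, source):
    """Create and register a throwaway module from source text (illustrative helper)."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# "Training session": the class exists, so pickling an instance works.
training = make_module("callbacks_demo", CALLBACK_SOURCE)
payload = pickle.dumps(training.EpochSaver("./ftchkpts/"))

# "Loading session" without the class: unpickling fails with an AttributeError,
# the same kind of error as in the question.
make_module("callbacks_demo", "")
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = str(exc)

# "Loading session" with the class defined under the same name: load succeeds.
make_module("callbacks_demo", CALLBACK_SOURCE)
restored = pickle.loads(payload)

print(load_error)
print(restored.savedir)
```

In the question's JupyterLab setting, the analogous step would be to define or import EpochSaver in the loading notebook before calling FastText.load(). The error there names `__main__` because the class was originally defined in the training notebook's own session.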
You could save callback-free versions of your trained models by blanking the model's callbacks property to its empty default value just before you save(), e.g.:
model.callbacks = ()
model.save(save_path)
Then you wouldn't need to do any special importing of custom classes before a load(). (Of course, if you again needed callback functionality on the re-loaded model, the callbacks would have to be explicitly re-established after load().)
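The blank-then-reattach pattern can also be sketched without gensim. This is an assumption-laden stand-in rather than the library's behavior: `DemoModel`, the module names `demo_library` and `demo_session`, and the helper `make_module` are hypothetical, and plain pickle is used in place of save()/load().

```python
import pickle
import sys
import types

def make_module(name, source):
    """Create and register a throwaway module from source text (illustrative helper)."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

# Hypothetical "library" module: always importable, as gensim itself would be.
library = make_module("demo_library", '''
class DemoModel:
    def __init__(self, callbacks=()):
        self.callbacks = callbacks
''')

# Hypothetical "session" module holding the custom callback class.
CALLBACK_SOURCE = '''
class EpochSaver:
    def __init__(self, savedir):
        self.savedir = savedir
'''
session = make_module("demo_session", CALLBACK_SOURCE)

model = library.DemoModel(callbacks=[session.EpochSaver("./ftchkpts/")])

# Blank the callbacks just before saving, as suggested above.
model.callbacks = ()
payload = pickle.dumps(model)

# Simulate a fresh session in which the callback class was never defined.
make_module("demo_session", "")
loaded = pickle.loads(payload)  # works: the payload no longer refers to EpochSaver

# If callbacks are needed again, define the class and re-attach it explicitly.
session = make_module("demo_session", CALLBACK_SOURCE)
loaded.callbacks = [session.EpochSaver("./ftchkpts/")]
print(type(loaded.callbacks[0]).__name__)
```

With a real gensim model, the re-attachment would more likely happen by passing the callbacks to a later train() call, as mentioned above.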
Source: https://stackoverflow.com/questions/58238043/loading-gensim-fasttext-model-with-callbacks-fails