Loading Gensim FastText Model with Callbacks Fails

Submitted by 大憨熊 on 2020-04-30 11:18:59

Question


After creating a FastText model using Gensim, I want to load it but am running into errors seemingly related to callbacks.

The code used to create the model is

import gensim

TRAIN_EPOCHS = 30
WINDOW = 5
MIN_COUNT = 50
DIMS = 256

vocab_model = gensim.models.FastText(sentences=model_input,
                                     size=DIMS,
                                     window=WINDOW,
                                     iter=TRAIN_EPOCHS,
                                     workers=6,
                                     min_count=MIN_COUNT,
                                     callbacks=[EpochSaver("./ftchkpts/")])

vocab_model.save('ft_256_min_50_model_30eps')

and the callback EpochSaver is defined as

import os

from gensim.models.callbacks import CallbackAny2Vec

class EpochSaver(CallbackAny2Vec):
    '''Callback to save model after each epoch and show training parameters '''

    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        os.makedirs(self.savedir, exist_ok=True)

    def on_epoch_end(self, model):
        savepath = os.path.join(self.savedir, f"ft256_{self.epoch}e")
        model.save(savepath)
        print(f"Epoch saved: {self.epoch + 1}")
        if os.path.isfile(os.path.join(self.savedir, f"ft256_{self.epoch-1}e")):
            os.remove(os.path.join(self.savedir,  f"ft256_{self.epoch-1}e"))
            print("Previous model deleted ")
        self.epoch += 1

Aside from the type of model, this is identical to my process for Word2Vec, which worked without issue. However, when I open another file and try to load the model with

from gensim.models import FastText
vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')

I'm greeted with the error

AttributeError: Can't get attribute 'EpochSaver' on <module '__main__'>

What can I do to get the vocabulary to load so I can create the embedding layer for my Keras model? If it's relevant, this is happening in JupyterLab.


Answer 1:


The difficulty of loading models that were saved with custom callbacks is a known, open issue (at least as of gensim 3.8.1 and October 2019).

The discussion on that issue covers possible workarounds and fixes, and the gensim team is considering disabling the automatic saving of callbacks altogether, which would require re-specifying them for each later train() (or similar) call that needs them.

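You may be able to load existing models saved with your custom callbacks by importing those same callback classes, under the same names, into the code context where you call load(). Here the error shows the class was saved as EpochSaver in the __main__ module, so the notebook doing the load needs an EpochSaver defined or imported at top level before load() is called.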
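The error message itself comes from Python's pickle module, which gensim's save() and load() rely on: a pickled object records only the module and name of its class, not the class body, so that name has to be resolvable at load time. The following is a minimal, standard-library-only sketch of that mechanism. EpochLogger, ToyModel, and try_load_without are hypothetical stand-ins made up for illustration, not gensim classes or functions.

```python
import pickle


class EpochLogger:
    """Hypothetical stand-in for a custom callback class."""

    def __init__(self):
        self.epoch = 0


class ToyModel:
    """Hypothetical stand-in for a model that keeps its callbacks as an attribute."""

    def __init__(self, callbacks=()):
        self.callbacks = callbacks


def try_load_without(payload, namespace, class_name):
    """Unpickle while class_name is temporarily hidden from its defining module.

    Returns the error message if loading fails, or None if it succeeds.
    """
    hidden = namespace.pop(class_name)
    try:
        pickle.loads(payload)
        return None
    except (AttributeError, pickle.UnpicklingError) as err:
        return str(err)
    finally:
        # Put the class back so later code can still use it.
        namespace[class_name] = hidden


# Pickling records the callback's class by reference (module name + class name).
payload = pickle.dumps(ToyModel(callbacks=[EpochLogger()]))

# With the class missing from its module, loading fails, much like in the question.
error = try_load_without(payload, globals(), "EpochLogger")

# With the class available again under the same name, loading works.
restored = pickle.loads(payload)
```

In the gensim case, the equivalent of "making the class available again" is defining or importing EpochSaver in the notebook before calling FastText.load().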

You could also save callback-free versions of your trained models by resetting the model's callbacks property to its empty default value just before you save(), e.g.:

model.callbacks = ()
model.save(save_path)

Then you wouldn't need to import any custom classes before a load(). (Of course, if you needed callback functionality on the reloaded model again, the callbacks would have to be explicitly re-established after load().)
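The effect of clearing the callbacks before saving, and attaching them again after loading, can be illustrated without gensim. This is a rough sketch that uses pickle directly; ProgressCallback and ToyModel are hypothetical stand-ins, and a real gensim model would go through save() and load() instead.

```python
import pickle


class ProgressCallback:
    """Hypothetical stand-in for a custom callback class."""


class ToyModel:
    """Hypothetical stand-in for a trained model."""

    def __init__(self, callbacks=()):
        self.callbacks = callbacks
        self.vectors = [0.1, 0.2, 0.3]


model = ToyModel(callbacks=[ProgressCallback()])
with_callbacks = pickle.dumps(model)

# Blank the callbacks just before serializing, as in the answer's snippet.
model.callbacks = ()
without_callbacks = pickle.dumps(model)

# The second payload no longer mentions the callback class at all,
# so loading it does not require that class to be importable.
restored = pickle.loads(without_callbacks)

# Callbacks needed later have to be attached again explicitly.
restored.callbacks = [ProgressCallback()]
```

The trade-off is that the saved file no longer remembers which callbacks were used, so any code that resumes training has to supply them itself.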



Source: https://stackoverflow.com/questions/58238043/loading-gensim-fasttext-model-with-callbacks-fails
