Initialising Seq2seq embedding with pretrained word2vec

野的像风 2021-02-09 03:43

I am interested in initialising the TensorFlow seq2seq implementation with pretrained word2vec embeddings.

I have seen the code. It seems the embedding is created internally with `variable_scope.get_variable`, i.e. it starts from a random initialization rather than from pretrained vectors.
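For reference, one common way to plug pretrained vectors into such an embedding variable is to build the graph as usual and then overwrite the variable with an `assign` op. This is only a sketch under assumptions (gensim for loading, a placeholder vocabulary and model path, and a stand-in variable named "embedding"); it is not code from the seq2seq implementation itself:

    import numpy as np
    import tensorflow as tf
    from gensim.models import KeyedVectors

    # Hypothetical vocabulary (token -> id), as data_utils would produce it.
    vocab = {"_PAD": 0, "_GO": 1, "_EOS": 2, "_UNK": 3, "hello": 4, "world": 5}
    embedding_size = 300

    # Load pretrained word2vec vectors; the path is a placeholder.
    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    # Use the pretrained vector where available, small random values otherwise.
    init = np.random.uniform(-0.1, 0.1,
                             (len(vocab), embedding_size)).astype(np.float32)
    for word, idx in vocab.items():
        if word in w2v:
            init[idx] = w2v[word]

    # Stand-in for the "embedding" variable the seq2seq model creates.
    embedding = tf.get_variable("embedding", [len(vocab), embedding_size])

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Overwrite the random initialization with the word2vec matrix.
        sess.run(embedding.assign(init))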
2 Answers
  •  庸人自扰
    2021-02-09 04:06

    You can change the tokenizer used in tensorflow/models/rnn/translate/data_utils.py so that it relies on a pre-trained word2vec model. Lines 187-190 of data_utils.py:

    if tokenizer:
        words = tokenizer(sentence)
    else:
        words = basic_tokenizer(sentence)
    

    use basic_tokenizer by default. You can write your own tokenizer method that uses a pre-trained word2vec model to tokenize the sentences, as in the sketch below.
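    A minimal sketch of such a tokenizer (the gensim model path, the `_UNK` fallback, and the `word2vec_tokenizer` name are assumptions, not part of the answer). It keeps the same interface as basic_tokenizer: take a sentence, return a list of word tokens.

        import re
        from gensim.models import KeyedVectors

        # Load pretrained word2vec vectors; the path is a placeholder.
        w2v = KeyedVectors.load_word2vec_format(
            "GoogleNews-vectors-negative300.bin", binary=True)

        # Same punctuation-splitting pattern that basic_tokenizer uses.
        _WORD_SPLIT = re.compile(r"([.,!?\"':;)(])")

        def word2vec_tokenizer(sentence):
            """Split a sentence into tokens, keeping only words the word2vec
            model knows; unknown words are replaced with "_UNK"."""
            words = []
            for fragment in sentence.strip().split():
                words.extend(_WORD_SPLIT.split(fragment))
            return [w if w in w2v else "_UNK" for w in words if w]

        # Usage: pass it wherever data_utils expects a tokenizer, e.g.
        # data_utils.sentence_to_token_ids(sentence, vocab,
        #                                  tokenizer=word2vec_tokenizer)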
