I need to train a word2vec representation on tweets using gensim. Unlike most tutorials and code I've seen on gensim, my data is not raw, but has already been preprocessed. I ha
I had the same issue. Even converting to an array of strings via
>>> arr_str = np.char.mod('%d', arr)
caused an exception when running Word2Vec:
>>> model = Word2Vec(arr_str)
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
My solution was to write the array of integers to a text file and then train Word2Vec on that file through LineSentence.
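import numpy as np
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# arr is the integer array from the question, assumed to hold one tweet per row.
# Write one row per line, with tokens separated by spaces.
np.savetxt('train_data.txt', arr, delimiter=" ", fmt="%s")
# LineSentence streams the file back as lists of whitespace-separated string tokens.
sentences = LineSentence('train_data.txt')
model = Word2Vec(sentences)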
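
If the data fits in memory, a possible alternative to the temporary file is to convert each row to a plain Python list of string tokens, since Word2Vec also accepts an iterable of token lists. The following is only a minimal sketch: `to_token_lists` and `toy_rows` are hypothetical names made up for illustration, and the toy data stands in for `arr`.

```python
from typing import Iterable, List


def to_token_lists(rows: Iterable[Iterable[int]]) -> List[List[str]]:
    """Turn rows of integer token ids into lists of string tokens."""
    return [[str(int(token)) for token in row] for row in rows]


# Hypothetical toy data standing in for `arr`: each row is one tweet.
toy_rows = [[12, 7, 301], [7, 45, 12, 9]]
sentences = to_token_lists(toy_rows)
print(sentences)  # [['12', '7', '301'], ['7', '45', '12', '9']]

# With gensim installed, this list can be passed in directly, e.g.:
#   model = Word2Vec(sentences, min_count=1)
```

Passing a list of lists rather than a NumPy array should sidestep the truth-value check that raised the ValueError above. Note that `min_count=1` is only there so the tiny toy vocabulary is not filtered out; on real data the default is usually fine.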