I am creating bag of words representation of the sentence. Then taking the words that exist in the sentence to compare to the file \"vectors.txt\", in order to get their emb
You have a numpy array of strings, not floats. This is what is meant by dtype('<U9')
-- a little endian encoded unicode string with up to 9 characters.
try:
return sum(np.asarray(listOfEmb, dtype=float)) / float(len(listOfEmb))
However, you don't need numpy here at all. You can really just do:
return sum(float(embedding) for embedding in listOfEmb) / len(listOfEmb)
Or if you're really set on using numpy.
return np.asarray(listOfEmb, dtype=float).mean()