I am working on my bachelor's thesis and have to prepare a corpus to train word embeddings. What I'm wondering is whether it is possible to check a tokenized sentence or text