I have a corpus in a .txt file in the following format:
Mike NNP B-PERSON Noah NNP I-PERSON eats VB O donuts NN O Sarah NNP B-PERSON Larsson NNP I-PERSON
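
To make the layout concrete, here is a minimal sketch of how such a line could be read in Python. It assumes the file is a whitespace-separated stream of (token, POS tag, BIO tag) triples and that no token contains a space. The function names `parse_triples` and `group_entities` are hypothetical helpers for illustration, not part of any library.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (token, POS tag, BIO entity tag)


def parse_triples(text: str) -> List[Triple]:
    """Split a whitespace-separated stream into (token, pos, bio) triples."""
    fields = text.split()
    if len(fields) % 3 != 0:
        # Assumption: every token is followed by exactly two tags.
        raise ValueError("expected a multiple of 3 fields, got %d" % len(fields))
    return [(fields[i], fields[i + 1], fields[i + 2])
            for i in range(0, len(fields), 3)]


def group_entities(triples: List[Triple]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (entity text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    label = ""
    for token, _pos, tag in triples:
        if tag.startswith("B-"):
            # A new entity starts; flush the previous one, if any.
            if current:
                entities.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            # Continuation of the entity that is currently open.
            current.append(token)
        else:
            # "O" tag (or a stray I- tag) closes any open entity.
            if current:
                entities.append((" ".join(current), label))
            current, label = [], ""
    if current:
        entities.append((" ".join(current), label))
    return entities


sample = ("Mike NNP B-PERSON Noah NNP I-PERSON eats VB O donuts NN O "
          "Sarah NNP B-PERSON Larsson NNP I-PERSON")
triples = parse_triples(sample)
entities = group_entities(triples)
print(triples)
print(entities)
```

For a real file, the same functions could be applied to the result of `open(path, encoding="utf-8").read()`, where `path` is a placeholder for the corpus file name.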