I am trying to recognize entities in a set of OCR texts from images of documents. Since the text is commonly in the form some_label: value in the document, it c
some_label: value
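
To make the format concrete, here is a minimal sketch of how such pairs could be pulled out with a regular expression before any entity recognition is applied. It assumes the OCR output is plain text with at most one `some_label: value` pair per line; the names `PAIR_RE` and `extract_pairs` are hypothetical and only for illustration. It is a rule-based baseline, not a trained recognizer.

```python
import re

# Assumed layout: a label without a colon, then a colon, then the value.
# Whitespace around the label and the value is discarded.
PAIR_RE = re.compile(r"^\s*([^:\n]+?)\s*:\s*(.+?)\s*$")


def extract_pairs(ocr_text):
    """Return (label, value) tuples for every line shaped like 'label: value'."""
    pairs = []
    for line in ocr_text.splitlines():
        match = PAIR_RE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


if __name__ == "__main__":
    sample = "some_label: value\na line without a pair\nDate : 2020-01-01"
    for label, value in extract_pairs(sample):
        print(label, "->", value)
```

Only the first colon on a line separates label from value, so a value such as `12:30` stays intact. Real OCR output is often noisier than this (misread colons, pairs split across lines), so the pattern would likely need adjusting.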