Question
When training a blank NER model, should I include only labeled data (examples that contain at least one entity), or should I also include examples that contain no labels at all (in that case teaching the model that in some circumstances these words do not carry any label)?
Answer 1:
If you look at the commonly used training data for NER (you can find links at http://nlpprogress.com/english/named_entity_recognition.html ), you’ll see that most, if not all, examples contain at least one entity.
Despite that, the model probably still learns that most entity types don’t show up in any given example. You can always try adding true negatives (examples with no entities at all) and see whether that helps.
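As a rough sketch of what "adding true negatives" could look like: the snippet below uses the `(text, {"entities": [(start, end, label)]})` layout common in spaCy v2-style training scripts, where a negative is simply a text with an empty entity list. The sentences, the `mix_negatives` helper, and the `ratio` parameter are made up for illustration and are not part of any library; the right proportion of negatives is something to find by experiment.

```python
import random

# Positive examples: each has at least one (start, end, label) span.
# The sentences and labels are invented for this sketch.
POSITIVES = [
    ("Apple opened an office in Berlin.",
     {"entities": [(0, 5, "ORG"), (26, 32, "GPE")]}),
    ("Maria joined Google last year.",
     {"entities": [(0, 5, "PERSON"), (13, 19, "ORG")]}),
]

# True negatives: texts with an empty entity list.
NEGATIVES = [
    ("I had an apple for breakfast.", {"entities": []}),
    ("The meeting was cancelled.", {"entities": []}),
]


def mix_negatives(positives, negatives, ratio=0.5, seed=0):
    """Hypothetical helper: return the positives plus up to
    ratio * len(positives) randomly sampled negatives, shuffled."""
    rng = random.Random(seed)
    n_neg = min(len(negatives), int(len(positives) * ratio))
    mixed = list(positives) + rng.sample(negatives, n_neg)
    rng.shuffle(mixed)
    return mixed


train_data = mix_negatives(POSITIVES, NEGATIVES, ratio=0.5)
```

Training once on `POSITIVES` alone and once on `train_data`, then comparing precision and recall on the same held-out set, is one way to check whether the negatives actually help.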
Source: https://stackoverflow.com/questions/58962469/ner-training-using-spacy