OpenNLP: Training a custom NER Model for multiple entities

Posted by 早过忘川 on 2019-12-05 21:11:23

The type argument to NameFinderME.train is used as the default type for training data that does not include a type parameter. This is only relevant if you have a sample that looks like this:

<START> operating tables <END>

Instead of like this:

<START:item_type> operating tables <END>
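To make the fallback rule concrete, here is a minimal sketch in plain Java. It is not OpenNLP's own parser; the class DefaultTypeDemo and the method resolveType are hypothetical names used only to illustrate the behavior described above: an explicit type in the tag wins, and the default type is used only when the tag carries none.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT part of the OpenNLP API.
class DefaultTypeDemo {
    // Matches "<START>" or "<START:some_type>"; group 1 is null when no type is given.
    private static final Pattern START_TAG = Pattern.compile("<START(?::([^>]+))?>");

    // Returns the type written in the tag, or defaultType if the tag has none.
    static String resolveType(String startTag, String defaultType) {
        Matcher m = START_TAG.matcher(startTag);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a START tag: " + startTag);
        }
        String explicitType = m.group(1);
        return explicitType != null ? explicitType : defaultType;
    }

    public static void main(String[] args) {
        // Untyped tag: falls back to the default passed at training time.
        System.out.println(resolveType("<START>", "item_type"));
        // Typed tag: the default is ignored.
        System.out.println(resolveType("<START:location_id>", "item_type"));
    }
}
```

So if every sample in your training file already names its type, the default never comes into play.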

To train multiple types of entities, the developer documentation says:

A training file can contain multiple types. If the training file contains multiple types the created model will also be able to detect these multiple types. For now its recommended to only train single type models, since multi type support is still experimental.

So you could try training on the sample from your question, which includes multiple types, and see how well it works. In this mailing list message, someone asks for the status of training for multiple types and gets this answer:

The code path itself is stable, the reason we put it there is that it didn't have a good performance on the English data.

Anyway, there performance might highly depend on your data set and the language.

If you don't get good performance with a model that handles multiple types, the alternative is to create multiple copies of your training data, each modified to include only one type, and then train a separate model on each copy. At that point you would have, for example, an item_type model, a location_type model, and a location_id model. You could then run your input through each model to detect the different types.
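Producing the single-type copies can be automated. The following is a rough sketch in plain Java, assuming the training data is whitespace-tokenized with each tag as its own token, as in the samples above. The class SingleTypeFilter, the method keepOnlyType, and the sample sentence are hypothetical and not part of OpenNLP. For one line of training data, it keeps the tags of the requested type and strips all other tags while leaving their tokens in place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the OpenNLP API.
class SingleTypeFilter {
    // Keeps <START:keepType> ... <END> annotations and removes every other
    // annotation (including untyped <START>), leaving the enclosed tokens as plain text.
    static String keepOnlyType(String line, String keepType) {
        List<String> out = new ArrayList<>();
        boolean insideKeptSpan = false;
        for (String token : line.trim().split("\\s+")) {
            if (token.startsWith("<START") && token.endsWith(">")) {
                insideKeptSpan = token.equals("<START:" + keepType + ">");
                if (insideKeptSpan) {
                    out.add(token);
                }
            } else if (token.equals("<END>")) {
                if (insideKeptSpan) {
                    out.add(token);
                }
                insideKeptSpan = false;
            } else {
                out.add(token);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // Made-up sample sentence using the type names mentioned above.
        String line = "Bring <START:item_type> operating tables <END> to "
                + "<START:location_type> room <END> <START:location_id> 12 <END> .";
        for (String type : new String[] {"item_type", "location_type", "location_id"}) {
            System.out.println(keepOnlyType(line, type));
        }
    }
}
```

Applying this to every line of the training file once per type gives one file per type, each of which can be fed to the trainer on its own.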
