Question
I am using INCEpTION 0.11.0 (https://inception-project.github.io/) to annotate my training data, and I would like to use that data with spaCy in Python. INCEpTION offers several export formats, but I am not sure which one is best suited for spaCy.
I could not find any documentation on converting these exported files to spaCy's format.
I could write a new script to do this conversion, but before doing that I was wondering whether someone has already solved this and can give some advice. Which export format should I choose so that it is easiest to convert to spaCy's format?
Answer 1:
Exporting your data as CoNLL-U is likely the most straightforward approach. spaCy can convert CoNLL-U documents to its expected format using the converter script: python -m spacy convert /path/to/input/doc.conllu /path/to/output/doc.jsonl -c conllu
The converter supports CoNLL documents, but it isn't immediately obvious which CoNLL variants are supported. You can find out by trying different values for the -c argument above.
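Before experimenting with the converter, it may help to check which CoNLL variant the export actually contains. The snippet below is only a rough sketch, not part of spaCy or INCEpTION: the helper name column_counts and the sample text are made up for illustration. It counts the columns on each token line, relying on the fact that CoNLL-U token lines have 10 tab-separated fields; other CoNLL variants use fewer columns.

```python
from collections import Counter


def column_counts(conll_text):
    """Count how many columns each token line has (hypothetical helper)."""
    counts = Counter()
    for line in conll_text.splitlines():
        line = line.strip()
        # Skip blank lines (sentence separators) and comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: fields are tab-separated; fall back to any
        # whitespace for exports that use spaces instead.
        fields = line.split("\t") if "\t" in line else line.split()
        counts[len(fields)] += 1
    return counts


# Made-up two-token sample in CoNLL-U layout (10 columns per token).
SAMPLE = (
    "# text = Hello world\n"
    "1\tHello\thello\tINTJ\t_\t_\t2\tdiscourse\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t0\troot\t_\t_\n"
)

if __name__ == "__main__":
    print(column_counts(SAMPLE))
```

If every token line reports 10 columns, the file is probably CoNLL-U and -c conllu is a reasonable first try; a different count suggests another CoNLL variant was exported.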
Source: https://stackoverflow.com/questions/57840677/export-inception-output-to-spacys-training-input-format