I am using TokensRegex for rule-based entity extraction. It works well, but I am having trouble getting my output in the desired format. The following snippet of code gives me an
Answering my own question for anyone struggling with a similar issue. The key to getting the output in the right format lies in how you define the rules in the rules file. Here is what I changed in the rule to change the output:
Old Rule:
{ ruleType: "tokens",
pattern: (([pos:/NNP.*/ | pos:/NN.*/]+) ($LocWords)),
result: Annotate($1, ner, "LOCATION"),
}
New Rule:
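{ ruleType: "tokens",
pattern: (([pos:/NNP.*/ | pos:/NN.*/]+) ($LocWords)),
# action: applied when the pattern matches (here, tags group 1 with the LOCATION ner label)
action: Annotate($1, ner, "LOCATION"),
# result: the value returned for the match
result: "LOCATION"
}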
The action field performs the annotation, while the result field sets the value returned for each match, so how you define result determines the format of your output.
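For anyone wondering where the result value ends up, here is a rough sketch of reading it back on the Java side. It is not the snippet from the question. The rules file name (location.rules.txt), the class name, and the sample sentence are placeholders, and it assumes the rules file binds ner and defines $LocWords as the rule above expects.

```java
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor;
import edu.stanford.nlp.ling.tokensregex.Env;
import edu.stanford.nlp.ling.tokensregex.MatchedExpression;
import edu.stanford.nlp.ling.tokensregex.TokenSequencePattern;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

import java.util.List;
import java.util.Properties;

public class LocationRulesDemo {
    public static void main(String[] args) {
        // The rule matches on POS tags, so the pipeline needs at least the pos annotator.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Placeholder input text.
        Annotation doc = new Annotation("We met on Baker Street yesterday.");
        pipeline.annotate(doc);

        // Load the rules file (hypothetical path) into an extractor.
        Env env = TokenSequencePattern.getNewEnv();
        CoreMapExpressionExtractor<MatchedExpression> extractor =
            CoreMapExpressionExtractor.createExtractorFromFiles(env, "location.rules.txt");

        for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
            List<MatchedExpression> matches = extractor.extractExpressions(sentence);
            for (MatchedExpression m : matches) {
                // getValue() carries whatever the rule's result field evaluated to.
                Object value = (m.getValue() == null) ? null : m.getValue().get();
                System.out.println(m.getText() + "\t" + value);
            }
        }
    }
}
```

The matched text comes from getText() and the rule's result comes from getValue(), so changing the result field in the rules file changes what the second column prints.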
Hope this helps!