extracting relations from text

抹茶落季

I want to extract relations from unstructured text in the form of (SUBJECT,OBJECT,ACTION) relations,

for instance,

"The boy is sitting on the table eating the c

1 answer

粉色の甜心

2021-02-10 16:45

If your sentences do not get much more complicated than the example you have shown (for instance, with respect to anaphora), the Stanford parser, which is based on a probabilistic context-free grammar, will give good results that you can easily convert into the format you want. There is a demo available online. For your example, it will give something like

nsubj(sitting, boy)

prep_on(sitting, table)

etc.
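To illustrate the conversion step, here is a minimal sketch in Python that turns dependency lines like the ones above into (SUBJECT, OBJECT, ACTION) triples. It assumes the simplified notation shown in this answer (real parser output also attaches token indices to each word), and the function names `parse_dependencies` and `to_triples` are made up for the example. It only handles `nsubj`, `dobj` and collapsed `prep_*` relations, so treat it as a starting point rather than a complete converter.

```python
import re
from collections import defaultdict

# Matches lines such as "nsubj(sitting, boy)" or "prep_on(sitting, table)".
DEP_RE = re.compile(r"^\s*(\w+)\((\S+?),\s*(\S+?)\)\s*$")


def parse_dependencies(lines):
    """Return (relation, head, dependent) tuples; skip lines that do not match."""
    deps = []
    for line in lines:
        match = DEP_RE.match(line)
        if match:
            deps.append(match.groups())
    return deps


def to_triples(deps):
    """Pair each subject of a verb with that verb's objects.

    A direct object keeps the verb as the action; a prepositional
    object gets the preposition appended ("sitting on").
    """
    subjects = defaultdict(list)
    objects = defaultdict(list)
    for rel, head, dep in deps:
        if rel == "nsubj":
            subjects[head].append(dep)
        elif rel == "dobj":
            objects[head].append((head, dep))
        elif rel.startswith("prep_"):
            objects[head].append((head + " " + rel[len("prep_"):], dep))

    triples = []
    for head, subjs in subjects.items():
        for subj in subjs:
            for action, obj in objects.get(head, []):
                triples.append((subj, obj, action))
    return triples


if __name__ == "__main__":
    sample = ["nsubj(sitting, boy)", "prep_on(sitting, table)"]
    print(to_triples(parse_dependencies(sample)))
```

Grouping by the head word first means that a verb with several subjects or several objects produces one triple per combination.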

If your sentences do get more complicated, you might be interested in trying Boxer, which builds discourse representation structures from C&C parses, based on probabilistic combinatory categorial grammars. Those structures may prove more difficult to adapt to the format you want, but will allow you much more flexibility. There is, again, a demo available online. For your example, it will look something like

sit(x)

boy(y)

table(z)

agent(x,y)

on(x,z)

etc.
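As a rough idea of how such conditions could be mapped back to (SUBJECT, OBJECT, ACTION) triples, here is a small Python sketch. It assumes the simplified one-condition-per-line notation used in this answer, not Boxer's actual output format, and the names `parse_conditions` and `drs_to_triples` are hypothetical. It also assumes that `patient` and `theme` are the role names that mark a direct object; any other role (such as `on`) is appended to the verb.

```python
import re

# Matches unary conditions like "boy(y)" and binary ones like "agent(x,y)".
PRED_RE = re.compile(r"^\s*(\w+)\(\s*(\w+)\s*(?:,\s*(\w+)\s*)?\)\s*$")


def parse_conditions(lines):
    """Split conditions into variable names (unary) and roles (binary)."""
    names = {}   # variable -> predicate naming it, e.g. {"y": "boy"}
    roles = []   # (role, event_variable, entity_variable)
    for line in lines:
        match = PRED_RE.match(line)
        if not match:
            continue
        pred, first, second = match.groups()
        if second is None:
            names[first] = pred
        else:
            roles.append((pred, first, second))
    return names, roles


def drs_to_triples(names, roles):
    """For each event with an agent, emit one triple per other role."""
    agents = {event: entity for role, event, entity in roles if role == "agent"}
    triples = []
    for role, event, entity in roles:
        if role == "agent" or event not in agents:
            continue
        verb = names.get(event, event)
        # Assumption: "patient"/"theme" mark a direct object.
        action = verb if role in ("patient", "theme") else f"{verb} {role}"
        subject = names.get(agents[event], agents[event])
        triples.append((subject, names.get(entity, entity), action))
    return triples


if __name__ == "__main__":
    sample = ["sit(x)", "boy(y)", "table(z)", "agent(x,y)", "on(x,z)"]
    print(drs_to_triples(*parse_conditions(sample)))
```

Because the variables are resolved through a lookup table, the same approach extends to sentences where one entity takes part in several events, which is where this representation is more flexible than plain dependencies.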

The Stanford parser is written in Java and is available under the GPL. C&C is written in C++ and Boxer in SWI-Prolog. C&C and Boxer are not released under a genuinely free licence, but you can obtain the source code, modify it, and use it for any non-commercial project.

Neither will give you a characterisation of the relation between "boy" and "table" in your example; you would need much more powerful semantic reasoning tools for this, and I am not sure whether such tools exist.

Edit

It is once again possible to obtain the source code for C&C and Boxer, along with a collection of models.
