Finding meaningful sub-sentences from a sentence

北海茫月 2020-12-13 07:27

Is there a way to find all the sub-sentences of a sentence that are still meaningful and contain at least a subject, a verb, and a predicate/object?

For example,

4 answers
  • 2020-12-13 08:01

    One approach is to use a parser, for example one based on a probabilistic context-free grammar (PCFG). Training a model to detect 'subsentences' directly is likely to suffer from data sparsity. I also doubt that you could write down a clean, unambiguous definition of a subsentence, and if you can't define it, you can't get annotators to annotate for it.
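As a rough sketch of how parser output could be used: assuming a constituency parser (PCFG-based or otherwise) has already produced a Penn-Treebank-style bracketed tree, every S node that has both an NP and a VP child can be treated as a candidate sub-sentence. The tree string and the helper names (`parse_tree`, `leaves`, `clause_candidates`) are hypothetical and written by hand for illustration; they are not output or API from any particular parser.

```python
def parse_tree(text):
    """Parse a bracketed tree into (label, children) tuples; leaves are plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            label = tokens[pos + 1]
            pos += 2
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1  # skip the closing ")"
            return (label, children)
        word = tokens[pos]
        pos += 1
        return word

    return read()


def leaves(node):
    """Return the words under a node, in sentence order."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1] for word in leaves(child)]


def clause_candidates(node):
    """Yield the text of every S node that has both an NP and a VP child."""
    if isinstance(node, str):
        return
    label, children = node
    child_labels = {child[0] for child in children if not isinstance(child, str)}
    if label == "S" and {"NP", "VP"} <= child_labels:
        yield " ".join(leaves(node))
    for child in children:
        yield from clause_candidates(child)


# Hand-written example tree (an assumption, not real parser output).
TREE = ("(S (NP (PRP I)) (VP (VBP think) (SBAR (IN that) "
        "(S (NP (PRP she)) (VP (VBD left) (NP (DT the) (NN room)))))))")

for candidate in clause_candidates(parse_tree(TREE)):
    print(candidate)
```

This only finds clauses that the grammar already marks as S; it does not address the harder problem, raised above, of defining what counts as a meaningful sub-sentence.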

  • 2020-12-13 08:08

    There's a paper titled "Using Discourse Commitments to Recognize Textual Entailment" by Hickl et al. that discusses the extraction of discourse commitments (sub-sentences). The paper describes their algorithm, which is at least partly rule-based. They used it for recognizing textual entailment (RTE), so the output may involve some minimal level of deduction. Text simplification may be a related area to look at.

  • 2020-12-13 08:13

    You can use the dependency parser provided by Stanford CoreNLP. The collapsed dependency output for your sentence looks like this:

    nsubj(going-3, I-1)
    xsubj(do-5, I-1)
    aux(going-3, am-2)
    root(ROOT-0, going-3)
    aux(do-5, to-4)
    xcomp(going-3, do-5)
    det(seminar-7, a-6)
    dobj(do-5, seminar-7)
    prep_on(seminar-7, NLP-9)
    prep_at(do-5, SXSW-11)
    prep_in(do-5, Austin-13)
    amod(month-15, next-14)
    tmod(do-5, month-15)
    

    The last five dependencies in the output are optional. You can remove one or more of them, since they are not essential to the sentence.
    Most of these optional parts are prepositional or modifier relations, e.g. prep_in, prep_at, advmod, tmod. See the Stanford Dependencies manual.

    For example, if you remove the temporal modifier (tmod, together with the amod attached to it) from the output, you will get

    I am going to do a seminar on NLP at SXSW in Austin.
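The pruning idea can be sketched in a few lines of Python. This is a minimal illustration, not CoreNLP code: it takes the dependency lines listed above as plain text, treats a chosen set of relations as optional, and prints every sentence obtained by dropping some subset of them. Several things here are assumptions: the token list is reconstructed from the word indices (including "next month" and "SXSW" at position 11), the preposition swallowed by a collapsed `prep_*` relation is assumed to sit directly before the dropped subtree, and the function names are made up for this example.

```python
import re
from itertools import combinations

# Collapsed dependencies as listed above; "SXSW-11" is inferred from the example sentence.
DEPENDENCIES = """
nsubj(going-3, I-1)
xsubj(do-5, I-1)
aux(going-3, am-2)
root(ROOT-0, going-3)
aux(do-5, to-4)
xcomp(going-3, do-5)
det(seminar-7, a-6)
dobj(do-5, seminar-7)
prep_on(seminar-7, NLP-9)
prep_at(do-5, SXSW-11)
prep_in(do-5, Austin-13)
amod(month-15, next-14)
tmod(do-5, month-15)
"""

# Assumed original sentence, reconstructed from the token indices.
TOKENS = "I am going to do a seminar on NLP at SXSW in Austin next month".split()

DEP_RE = re.compile(r"(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)")


def parse_dependencies(text):
    """Return (relation, head_index, dependent_index) triples."""
    triples = []
    for line in text.strip().splitlines():
        match = DEP_RE.fullmatch(line.strip())
        if match:
            rel, _head_word, head, _dep_word, dep = match.groups()
            triples.append((rel, int(head), int(dep)))
    return triples


def subtree(index, children):
    """Collect a token index together with all of its descendants."""
    nodes = {index}
    for child in children.get(index, []):
        nodes |= subtree(child, children)
    return nodes


def prune(tokens, triples, edges_to_drop):
    """Rebuild the sentence without the subtrees hanging off edges_to_drop."""
    children = {}
    for rel, head, dep in triples:
        if rel != "xsubj":  # extra control edge; skipping it keeps the graph a tree
            children.setdefault(head, []).append(dep)
    removed = set()
    for rel, _head, dep in edges_to_drop:
        nodes = subtree(dep, children)
        if rel.startswith("prep_"):
            # Assumption: the collapsed preposition directly precedes the subtree.
            nodes.add(min(nodes) - 1)
        removed |= nodes
    return " ".join(tok for i, tok in enumerate(tokens, start=1) if i not in removed)


def sub_sentences(tokens, triples, optional_relations):
    """Return one sentence per subset of the optional edges."""
    optional = [t for t in triples if t[0] in optional_relations]
    results = []
    for size in range(len(optional) + 1):
        for combo in combinations(optional, size):
            results.append(prune(tokens, triples, combo))
    return results


triples = parse_dependencies(DEPENDENCIES)
for sentence in sub_sentences(TOKENS, triples, {"prep_on", "prep_at", "prep_in", "tmod"}):
    print(sentence)
```

Note that `amod` is left out of the optional set on purpose: dropping "next" without "month" would leave a dangling noun, so it is only removed as part of the `tmod` subtree.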

  • 2020-12-13 08:13

    The following paper, http://www.mpi-inf.mpg.de/~rgemulla/publications/delcorro13clausie.pdf, processes the dependencies from the Stanford parser and constructs simple clauses (text simplification).

    See the online demo - https://d5gate.ag5.mpi-sb.mpg.de/ClausIEGate/ClausIEGate
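To give a feel for what "constructing simple clauses from dependencies" can mean, here is a deliberately simplified sketch. It is not the algorithm from the paper: it just looks for verbs that have both a subject and a direct object, emits a subject-verb-object clause, and then one extra clause per adverbial attached to the verb. The dependency triples are hand-copied from the Stanford output in the previous answer (with "SXSW" assumed at position 11), and `phrase` and `simple_clauses` are hypothetical helper names.

```python
# (relation, (head_word, head_index), (dependent_word, dependent_index))
DEPS = [
    ("nsubj", ("going", 3), ("I", 1)),
    ("xsubj", ("do", 5), ("I", 1)),
    ("aux", ("going", 3), ("am", 2)),
    ("aux", ("do", 5), ("to", 4)),
    ("xcomp", ("going", 3), ("do", 5)),
    ("det", ("seminar", 7), ("a", 6)),
    ("dobj", ("do", 5), ("seminar", 7)),
    ("prep_on", ("seminar", 7), ("NLP", 9)),
    ("prep_at", ("do", 5), ("SXSW", 11)),
    ("prep_in", ("do", 5), ("Austin", 13)),
    ("amod", ("month", 15), ("next", 14)),
    ("tmod", ("do", 5), ("month", 15)),
]


def phrase(node, deps):
    """Render a word with its determiner and adjective modifiers, in sentence order."""
    words = [node] + [d for rel, h, d in deps if h == node and rel in ("det", "amod")]
    return " ".join(word for word, _ in sorted(words, key=lambda n: n[1]))


def simple_clauses(deps):
    """Build subject-verb-object clauses, plus one variant per adverbial of the verb."""
    clauses = []
    verbs = sorted({h for rel, h, _ in deps if rel in ("nsubj", "xsubj")},
                   key=lambda n: n[1])
    for verb in verbs:
        subject = next(d for rel, h, d in deps
                       if h == verb and rel in ("nsubj", "xsubj"))
        objects = [d for rel, h, d in deps if h == verb and rel == "dobj"]
        if not objects:
            continue  # the question asks for a subject, a verb, and an object
        core = [phrase(subject, deps), verb[0], phrase(objects[0], deps)]
        clauses.append(" ".join(core))
        for rel, h, d in deps:
            if h != verb:
                continue
            if rel.startswith("prep_"):
                preposition = rel[len("prep_"):]
                clauses.append(" ".join(core + [preposition, phrase(d, deps)]))
            elif rel == "tmod":
                clauses.append(" ".join(core + [phrase(d, deps)]))
    return clauses


for clause in simple_clauses(DEPS):
    print(clause)
```

The sketch ignores tense and auxiliaries (it prints "I do a seminar" rather than "I am going to do a seminar"), which is one of the many details a real clause extractor has to handle.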
