Computer AI algorithm to write sentences?

清酒与你 2021-01-30 15:28

I am looking for information on algorithms that process text sentences, or that follow a structure to generate sentences that are valid in a natural human language such as English.

4 answers
  • 2021-01-30 15:40

    The field you're looking for is called natural language generation (NLG), a subfield of natural language processing: http://en.wikipedia.org/wiki/Natural_language_processing

    Sentence generation is either really easy or really hard depending on how good you want the sentences to be. Currently, there aren't programs that will be able to generate 100% sensible sentences about given nouns (even with a thesaurus) -- if that is what you mean.

    If, on the other hand, you would be satisfied with nonsense that is sometimes ungrammatical, then you could try an n-gram based sentence generator. These simply chain together words that tend to appear in sequence, and 3-gram or 4-gram generators can look quite okay sometimes (although you'll recognize the output as the kind of text found in a lot of spam email).

    Here's an intro to the basics of n-gram based generation, using NLTK: http://www.nltk.org/book/ch02.html#generating-random-text-with-bigrams
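
    The chaining idea can be sketched in a few lines using only the Python standard library rather than NLTK. This is an illustrative sketch, not the code from the linked chapter: the function names `build_ngram_model` and `generate`, the toy corpus, and the choice of n=3 are all assumptions made for the example.

```python
import random
from collections import defaultdict

def build_ngram_model(words, n=3):
    """Map each (n-1)-word context to the words seen right after it (n >= 2)."""
    model = defaultdict(list)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        model[context].append(words[i + n - 1])
    return model

def generate(model, length=20, seed=None):
    """Random-walk through the model, starting from a randomly chosen context."""
    rng = random.Random(seed)
    context = rng.choice(list(model.keys()))
    output = list(context)
    while len(output) < length:
        # Look up the words that followed the last (n-1) words in the corpus.
        followers = model.get(tuple(output[-len(context):]))
        if not followers:
            break  # dead end: this context only occurred at the end of the corpus
        output.append(rng.choice(followers))
    return " ".join(output)

# Toy corpus (hypothetical); a real generator needs far more text.
corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat saw the dog on the mat .").split()
model = build_ngram_model(corpus, n=3)
print(generate(model, length=12, seed=0))
```

    Because followers are stored with repetition, frequent continuations are picked more often. Every run of three words in the output occurs somewhere in the corpus, which is why the text looks locally plausible but drifts globally.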

  • 2021-01-30 15:42

    Yes. There is some work on solving NLG problems with AI techniques. As far as I know, though, there is currently no method that is ready for practical use.

    If you have the background, I suggest getting familiar with some of the work by Alexander Koller from Saarland University. He describes how to encode NLG problems in PDDL. The main article you'll want to read is "Sentence generation as a planning problem".

    If you do not have any background in NLP, just search for the online courses or course materials by Michael Collins or Dan Jurafsky.

  • 2021-01-30 15:51

    This is called NLG (Natural Language Generation), although that term mainly refers to the task of generating text that describes a set of data. There is also a lot of research on completely random sentence generation.

    One starting point is to use Markov chains to generate sentences. To do this, you have a transition matrix that says how likely it is to transition from each part of speech to every other part of speech. You also have the most likely starting and ending parts of speech of a sentence. Put this all together and you can generate likely sequences of parts of speech.
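
    A minimal sketch of such a chain, under stated assumptions: the tag set, the `START` and `TRANSITIONS` probabilities, and the small `LEXICON` below are all hand-written for illustration (in practice they would be estimated from a tagged corpus), and the sentence end is modeled as a hypothetical `END` pseudo-tag.

```python
import random

# Hypothetical probabilities of each part of speech starting a sentence.
START = {"DET": 0.7, "NOUN": 0.3}

# Hypothetical transition matrix: TRANSITIONS[a][b] = P(next tag is b | current tag is a).
TRANSITIONS = {
    "DET": {"ADJ": 0.4, "NOUN": 0.6},
    "ADJ": {"ADJ": 0.1, "NOUN": 0.9},
    "NOUN": {"VERB": 0.6, "END": 0.4},
    "VERB": {"DET": 0.6, "NOUN": 0.2, "END": 0.2},
}

# Hypothetical word list per tag, used to turn a tag sequence into words.
LEXICON = {
    "DET": ["the", "a"],
    "ADJ": ["small", "green"],
    "NOUN": ["dog", "idea", "river"],
    "VERB": ["sees", "follows"],
}

def sample(dist, rng):
    """Draw one key from a {key: probability} dictionary."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]

def generate_tags(rng, max_len=12):
    """Walk the chain from a start tag until END is drawn or max_len is hit."""
    tags = [sample(START, rng)]
    while len(tags) < max_len:
        nxt = sample(TRANSITIONS[tags[-1]], rng)
        if nxt == "END":
            break
        tags.append(nxt)
    return tags

def realize(tags, rng):
    """Pick a random word for each tag."""
    return " ".join(rng.choice(LEXICON[t]) for t in tags)

rng = random.Random(1)
tags = generate_tags(rng)
print(tags)
print(realize(tags, rng))
```

    The output is a plausible sequence of parts of speech filled with random words, so it can still contain errors such as "a idea". A sequence cut off by `max_len` may also stop mid-sentence.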

    At this point you are far from done. First of all, this will not give very good results, because you are only considering the probability of adjacent pairs (also called bigrams). To improve on that, extend the model to look, for instance, at the transition matrix over three consecutive parts of speech (this makes a 3D matrix and gives you trigrams). You can extend it to 4-grams, 5-grams, and so on, depending on your processing power and on whether your corpus is large enough to fill such a matrix.

    Lastly, you need to patch up things such as agreement (subject-verb agreement, adjective-noun agreement (not in English, though), etc.) and tense, so that everything is consistent.

  • 2021-01-30 16:01

    Writing random sentences is not that hard. The simple English grammar example from any parser textbook can be run in reverse to generate grammatically correct nonsense sentences.
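
    As a rough sketch of running a grammar "in reverse": instead of parsing words into a tree, start at the sentence symbol and expand rules at random until only words remain. The toy grammar, its symbol names, and the `expand` function below are made up for illustration and not taken from any particular textbook.

```python
import random

# Hypothetical toy grammar: each non-terminal maps to a list of alternative
# productions. Any symbol that is not a key is treated as a word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"]],
    "VP": [["Vt", "NP"], ["Vi"]],
    "Det": [["the"], ["every"]],
    "Adj": [["quiet"], ["purple"]],
    "N": [["cat"], ["theory"], ["teapot"]],
    "Vt": [["chased"], ["admired"]],
    "Vi": [["slept"], ["vanished"]],
}

def expand(symbol, rng):
    """Recursively replace a symbol with a randomly chosen production."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an actual word
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words

rng = random.Random(42)
print(" ".join(expand("S", rng)))
```

    This grammar has no recursive rules, so expansion always terminates. A grammar with recursion (for example, noun phrases containing prepositional phrases) would need a depth limit or rule weights to keep sentences finite.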

    Another way is the word-tuple random walk, made popular by the TRAVESTY program from the old BYTE magazine, or by things like http://www.perlmonks.org/index.pl?node_id=94856
