Stanford Parser and NLTK

既然无缘 2020-11-22 01:32

Is it possible to use Stanford Parser in NLTK? (I am not talking about Stanford POS.)

18 Answers
  •  清酒与你
    2020-11-22 02:15

    Note that this answer applies to NLTK v3.0, and not to more recent versions.

    Sure, try the following in Python:

    import os
    from nltk.parse import stanford

    # Both variables point at the folder that holds the parser and models jars.
    os.environ['STANFORD_PARSER'] = '/path/to/stanford/jars'
    os.environ['STANFORD_MODELS'] = '/path/to/stanford/jars'

    parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
    sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?"))
    print(sentences)

    # GUI: draw each parse tree in a window
    for line in sentences:
        for sentence in line:
            sentence.draw()
    

    Output:

    [Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]

    Note 1: In this example both the parser & model jars are in the same folder.

    Note 2:

    • File name of stanford parser is: stanford-parser.jar
    • File name of stanford models is: stanford-parser-x.x.x-models.jar

    Note 3: The englishPCFG.ser.gz file can be found inside the models.jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Use an archive manager to 'unzip' the models.jar file.
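
Since a jar is an ordinary zip archive, the extraction in Note 3 can also be scripted. The following is a minimal sketch using only the Python standard library; `extract_model` is a hypothetical helper (not part of NLTK), and the member path is the one given in Note 3.

```python
import os
import zipfile

# Path of the model inside the models jar, as given in Note 3.
MODEL_MEMBER = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def extract_model(models_jar, dest_dir, member=MODEL_MEMBER):
    """Extract one member from the jar (a zip archive) and return its path.

    Hypothetical helper for illustration; not an NLTK function.
    """
    with zipfile.ZipFile(models_jar) as jar:
        # ZipFile.extract keeps the member's directory structure under
        # dest_dir and returns the path of the extracted file.
        return jar.extract(member, path=dest_dir)
```

The returned path includes the edu/stanford/nlp/models/lexparser subfolders under the destination folder, and it is that full path which would be passed as model_path.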

    Note 4: Be sure you are using Java 8 (JRE 1.8, which is included in Oracle JDK 8). With an older runtime you will get: Unsupported major.minor version 52.0.
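
The 52.0 in that message is the class-file version of the compiled classes in the jar: major version 52 corresponds to Java 8, so the error means the installed runtime is older. As an illustration only, here is a small sketch (both helper names are hypothetical) that reads the version from a .class file header and maps it to a Java release.

```python
import struct


def class_file_version(data):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    # Header layout (big-endian): 4-byte magic 0xCAFEBABE,
    # 2-byte minor version, 2-byte major version.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def java_release(major):
    """Map a class-file major version to the Java release that introduced it."""
    # Holds for Java 5 (major 49) and later, e.g. 52 -> 8.
    return major - 44
```

Any .class file taken out of stanford-parser.jar can be passed in as bytes (for example, read with open(path, 'rb').read()).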

    Installation

    1. Download NLTK v3 from https://github.com/nltk/nltk and install it:

      sudo python setup.py install

    2. You can use the NLTK downloader to get Stanford Parser, using Python:

      import nltk
      nltk.download()
      
    3. Try my example! (Don't forget to change the jar paths and to change the model path to the ser.gz location.)

    OR:

    1. Download and install NLTK v3, same as above.

    2. Download the latest version of the Stanford Parser (at the time of writing, the file name is stanford-parser-full-2015-01-29.zip) from: http://nlp.stanford.edu/software/lex-parser.shtml#Download

    3. Extract the stanford-parser-full-20xx-xx-xx.zip.

    4. Create a new folder ('jars' in my example). Place these two extracted files into this 'jars' folder: stanford-parser-3.x.x-models.jar and stanford-parser.jar.

      As shown above you can use the environment variables (STANFORD_PARSER & STANFORD_MODELS) to point to this 'jars' folder. I'm using Linux, so if you use Windows please use something like: C://folder//jars.

    5. Open the stanford-parser-3.x.x-models.jar using an archive manager (e.g. 7-Zip).

    6. Browse inside the jar file to edu/stanford/nlp/models/lexparser and extract the file called 'englishPCFG.ser.gz'. Remember the location where you extract this ser.gz file.

    7. When creating a StanfordParser instance, you can provide the model path as a parameter. This is the complete path to the model, in our case /location/of/the/englishPCFG.ser.gz.

    8. Try my example! (Don't forget to change the jar paths and to change the model path to the ser.gz location.)
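
Step 4 above puts both jars in one folder and points STANFORD_PARSER and STANFORD_MODELS at it. Before doing so, it can help to check that both jars are really there. This is a minimal Python 3 sketch, assuming the file-name patterns from Note 2; `configure_stanford_env` is a hypothetical helper, not an NLTK function.

```python
import glob
import os


def configure_stanford_env(jars_dir):
    """Set STANFORD_PARSER and STANFORD_MODELS to jars_dir if both jars exist.

    Hypothetical helper for illustration; not part of NLTK.
    """
    parser_jar = os.path.join(jars_dir, "stanford-parser.jar")
    # The models jar carries a version number, so match it with a wildcard.
    models_jars = glob.glob(os.path.join(jars_dir, "stanford-parser-*-models.jar"))
    if not os.path.isfile(parser_jar):
        raise FileNotFoundError("missing " + parser_jar)
    if not models_jars:
        raise FileNotFoundError("no stanford-parser-*-models.jar in " + jars_dir)
    os.environ["STANFORD_PARSER"] = jars_dir
    os.environ["STANFORD_MODELS"] = jars_dir
    return parser_jar, models_jars[0]
```

Because the paths are built with os.path.join, the same call should work with a Linux or a Windows folder; a missing jar raises an error here rather than later, when the parser is constructed.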
