How to read a corpus of parsed sentences using NLTK in Python?
Question:

I am working with the BLLIP 1987-89 WSJ Corpus Release 1 (https://catalog.ldc.upenn.edu/LDC2000T43). I am trying to use NLTK's SyntaxCorpusReader class to read in the parsed sentences, starting with a simple example of just one file. Here is my code:

    from nltk.corpus.reader import SyntaxCorpusReader

    path = '/corpus/wsj'
    filename = 'wsj1'
    reader = SyntaxCorpusReader('/corpus/wsj','wsj1')

I am able to see the raw text from the file. It returns a string of the parsed sentences.

    reader.raw()
    u"(S1 (S (PP-LOC (IN In)\n\t(NP