nltk

Extracting sentences containing specific words using pandas

不问归期 submitted on 2021-02-07 10:04:36
Question: I have an Excel file with a text column. All I need to do is extract, for each row, the sentences from the text column that contain specific words. I have tried defining a function:

import pandas as pd
from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize

# Read in the Excel file
str_df = pd.read_excel("C:\\Users\\HP\\Desktop\\context.xlsx")

# Define a function
def sentence_finder(text, word):
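The question's code is cut off, but the shape of the missing function is clear: split the text into sentences, keep the ones containing the target word. A minimal sketch follows; a simple regex splitter stands in for NLTK's `sent_tokenize` so it runs without downloaded NLTK data, and the file path and column name in the commented usage are assumptions.

```python
import re

def sentence_finder(text, word):
    """Return the sentences in `text` that contain `word` (case-insensitive).

    A regex splitter stands in for nltk.tokenize.sent_tokenize here; with
    NLTK installed, replace the re.split call with sent_tokenize(text).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Applied per row of the Excel column (column name "text" is assumed):
# df = pd.read_excel(r"C:\Users\HP\Desktop\context.xlsx")
# df["matches"] = df["text"].apply(lambda t: sentence_finder(t, "apple"))
```

With pandas, `Series.apply` maps the function over every row of the column, giving a list of matching sentences per row.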

Find Most Common Words from a Website in Python 3 [closed]

旧时模样 submitted on 2021-02-07 08:15:55
Question (closed 6 years ago as needing more focus): I need to find and copy the words that appear more than 5 times on a given website using Python 3, and I'm not sure how to do it. I've looked through the archives here on Stack Overflow, but the other solutions rely on Python 2 code. Here's the measly code I
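The asker's code is cut off, but the counting part of the task is straightforward in Python 3 with `collections.Counter`. A sketch, with the network fetch and crude tag stripping shown only in comments (the URL is a placeholder; BeautifulSoup's `get_text()` would be the more robust way to strip markup):

```python
import re
from collections import Counter

def frequent_words(text, min_count=5):
    """Return (word, count) pairs for words appearing more than
    `min_count` times, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(w, n) for w, n in Counter(words).most_common() if n > min_count]

# Against a live page (network access assumed):
# from urllib.request import urlopen
# html = urlopen("http://example.com").read().decode("utf-8", errors="replace")
# text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping
# print(frequent_words(text))
```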

Finding conditional probability of trigram in python nltk

你离开我真会死。 submitted on 2021-02-07 06:26:05
Question: I have started learning NLTK and I am following a tutorial from here, where they find the conditional probability using bigrams like this:

import nltk
from nltk.corpus import brown
cfreq_brown_2gram = nltk.ConditionalFreqDist(nltk.bigrams(brown.words()))

However, I want to find the conditional probability using trigrams. When I try to change nltk.bigrams to nltk.trigrams, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "home/env/local/lib/python2.7
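The traceback is cut off, but the likely cause is that `ConditionalFreqDist` expects (condition, sample) 2-tuples, while `nltk.trigrams` yields 3-tuples. The usual fix is to condition on the first two words and count the third, i.e. `nltk.ConditionalFreqDist(((w1, w2), w3) for w1, w2, w3 in nltk.trigrams(words))`. A stdlib sketch mirroring that shape (so it runs without the Brown corpus):

```python
from collections import Counter, defaultdict

def trigram_cfd(words):
    """Conditional frequency distribution over trigrams: condition on the
    first two words, count the third. Mirrors repacking nltk.trigrams
    output into ((w1, w2), w3) pairs for nltk.ConditionalFreqDist."""
    cfd = defaultdict(Counter)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        cfd[(w1, w2)][w3] += 1
    return cfd

def cond_prob(cfd, w1, w2, w3):
    """P(w3 | w1, w2) by maximum likelihood estimation."""
    counts = cfd[(w1, w2)]
    total = sum(counts.values())
    return counts[w3] / total if total else 0.0
```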

Save a dependency graph in Python

吃可爱长大的小学妹 submitted on 2021-02-07 06:03:54
Question: I am using the Stanford dependency parser in Python 3 to parse a sentence, which returns a dependency graph.

import pickle
from nltk.parse.stanford import StanfordDependencyParser

parser = StanfordDependencyParser('stanford-parser-full-2015-12-09/stanford-parser.jar',
                                  'stanford-parser-full-2015-12-09/stanford-parser-3.6.0-models.jar')
sentences = ["I am going there", "I am asking a question"]
with open("save.p", "wb") as f:
    pickle.dump(parser.raw_parse_sents(sentences), f)

It gives an error :
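The error text is cut off, but a likely cause (an assumption, since the traceback is missing) is that `raw_parse_sents` returns a lazy iterator, and Python cannot pickle generator or iterator objects. Materializing the parses first, e.g. `graphs = [list(parses) for parses in parser.raw_parse_sents(sentences)]`, avoids that; if a `DependencyGraph` still refuses to pickle, serializing its CoNLL string via `g.to_conll(4)` and rebuilding with `nltk.parse.DependencyGraph` is a robust fallback. The generator problem itself is easy to demonstrate with the stdlib alone:

```python
import pickle

gen = (n * n for n in range(3))
try:
    pickle.dumps(gen)            # raises TypeError: cannot pickle a generator
    materialized = None
except TypeError:
    materialized = list(gen)     # the fix: force evaluation first

restored = pickle.loads(pickle.dumps(materialized))  # now it round-trips
```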

Add words to a local copy of WordNet

三世轮回 submitted on 2021-02-07 03:26:49
Question: I am using WordNet, accessed through Python's NLTK, to compare the synsets of words from social media. Many of those words aren't in the version of WordNet that NLTK connects to. When I say words, I mean domain-specific terms, not abbreviations or emoticons. I've compiled a list of these words and would like to merge that list with WordNet. Searching for prior efforts turns up only attempts to develop methods of automatically updating WordNet. The steps I imagine are:

1. Clone the WordNet db
2. Write
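Editing the WordNet database files directly is possible (NLTK's copy is plain text under `nltk_data/corpora/wordnet`), but a lighter alternative is a side table of domain-specific terms consulted before WordNet, which survives corpus upgrades. A sketch, where the mappings and the lookup callable are hypothetical (with NLTK, the fallback would be `lambda w: [s.name() for s in wn.synsets(w)]` after `from nltk.corpus import wordnet as wn`):

```python
# Hypothetical domain-specific mappings onto existing synset names.
CUSTOM_SYNSETS = {
    "retweet": ["repost.v.01"],
    "selfie": ["photograph.n.01"],
}

def synsets_or_custom(word, wn_lookup):
    """Check the domain-specific table first, then fall back to WordNet
    (or any lookup callable passed in)."""
    return CUSTOM_SYNSETS.get(word.lower()) or wn_lookup(word)
```

This keeps the custom vocabulary in one editable place instead of scattered patches to the WordNet data files.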
