nltk

Extracting sentences containing specific words using pandas

不问归期 submitted on 2021-02-07 10:04:36
Question: I have an Excel file with a text column. All I need to do is extract, for each row, the sentences from the text column that contain specific words. I have tried defining a function:

import pandas as pd
from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize

# Read in the Excel file
str_df = pd.read_excel("C:\\Users\\HP\\Desktop\\context.xlsx")

# Define a function
def sentence_finder(text, word):
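The question's code is cut off, but the shape of the missing function is clear: split the text into sentences, keep the ones containing the target word. A minimal sketch follows; a simple regex splitter stands in for NLTK's `sent_tokenize` so it runs without downloaded NLTK data, and the file path and column name in the commented usage are assumptions.

```python
import re

def sentence_finder(text, word):
    """Return the sentences in `text` that contain `word` (case-insensitive).

    A regex splitter stands in for nltk.tokenize.sent_tokenize here; with
    NLTK installed, replace the re.split call with sent_tokenize(text).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Applied per row of the Excel column (column name "text" is assumed):
# df = pd.read_excel(r"C:\Users\HP\Desktop\context.xlsx")
# df["matches"] = df["text"].apply(lambda t: sentence_finder(t, "apple"))
```

With pandas, `Series.apply` maps the function over every row of the column, giving a list of matching sentences per row.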

Find Most Common Words from a Website in Python 3 [closed]

旧时模样 submitted on 2021-02-07 08:15:55
Question (closed 6 years ago as needing more focus): I need to find and copy the words that appear more than 5 times on a given website using Python 3, and I'm not sure how to do it. I've looked through the archives here on Stack Overflow, but the other solutions rely on Python 2 code. Here's the measly code I
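The asker's code is cut off, but the counting part of the task is straightforward in Python 3 with `collections.Counter`. A sketch, with the network fetch and crude tag stripping shown only in comments (the URL is a placeholder; BeautifulSoup's `get_text()` would be the more robust way to strip markup):

```python
import re
from collections import Counter

def frequent_words(text, min_count=5):
    """Return (word, count) pairs for words appearing more than
    `min_count` times, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(w, n) for w, n in Counter(words).most_common() if n > min_count]

# Against a live page (network access assumed):
# from urllib.request import urlopen
# html = urlopen("http://example.com").read().decode("utf-8", errors="replace")
# text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping
# print(frequent_words(text))
```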

Finding conditional probability of trigram in python nltk

你离开我真会死。 submitted on 2021-02-07 06:26:05
Question: I have started learning NLTK and I am following a tutorial from here, where they find the conditional probability using bigrams like this:

import nltk
from nltk.corpus import brown
cfreq_brown_2gram = nltk.ConditionalFreqDist(nltk.bigrams(brown.words()))

However, I want to find the conditional probability using trigrams. When I try to change nltk.bigrams to nltk.trigrams, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "home/env/local/lib/python2.7
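The traceback is cut off, but the likely cause is that `ConditionalFreqDist` expects (condition, sample) 2-tuples, while `nltk.trigrams` yields 3-tuples. The usual fix is to condition on the first two words and count the third, i.e. `nltk.ConditionalFreqDist(((w1, w2), w3) for w1, w2, w3 in nltk.trigrams(words))`. A stdlib sketch mirroring that shape (so it runs without the Brown corpus):

```python
from collections import Counter, defaultdict

def trigram_cfd(words):
    """Conditional frequency distribution over trigrams: condition on the
    first two words, count the third. Mirrors repacking nltk.trigrams
    output into ((w1, w2), w3) pairs for nltk.ConditionalFreqDist."""
    cfd = defaultdict(Counter)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        cfd[(w1, w2)][w3] += 1
    return cfd

def cond_prob(cfd, w1, w2, w3):
    """P(w3 | w1, w2) by maximum likelihood estimation."""
    counts = cfd[(w1, w2)]
    total = sum(counts.values())
    return counts[w3] / total if total else 0.0
```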

Save a dependency graph in Python

吃可爱长大的小学妹 submitted on 2021-02-07 06:03:54
Question: I am using the Stanford dependency parser in Python 3 to parse a sentence, which returns a dependency graph.

import pickle
from nltk.parse.stanford import StanfordDependencyParser

parser = StanfordDependencyParser('stanford-parser-full-2015-12-09/stanford-parser.jar',
                                  'stanford-parser-full-2015-12-09/stanford-parser-3.6.0-models.jar')
sentences = ["I am going there", "I am asking a question"]
with open("save.p", "wb") as f:
    pickle.dump(parser.raw_parse_sents(sentences), f)

It gives an error :
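The error text is cut off, but a likely cause (an assumption, since the traceback is missing) is that `raw_parse_sents` returns a lazy iterator, and Python cannot pickle generator or iterator objects. Materializing the parses first, e.g. `graphs = [list(parses) for parses in parser.raw_parse_sents(sentences)]`, avoids that; if a `DependencyGraph` still refuses to pickle, serializing its CoNLL string via `g.to_conll(4)` and rebuilding with `nltk.parse.DependencyGraph` is a robust fallback. The generator problem itself is easy to demonstrate with the stdlib alone:

```python
import pickle

gen = (n * n for n in range(3))
try:
    pickle.dumps(gen)            # raises TypeError: cannot pickle a generator
    materialized = None
except TypeError:
    materialized = list(gen)     # the fix: force evaluation first

restored = pickle.loads(pickle.dumps(materialized))  # now it round-trips
```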

Add words to a local copy of WordNet

三世轮回 submitted on 2021-02-07 03:26:49
Question: I am using WordNet, accessed through Python's NLTK, to compare the synsets of words from social media. Many of those words aren't in the version of WordNet that NLTK connects to. When I say words, I mean domain-specific terms, not abbreviations or emoticons. I've compiled a list of these words and would like to merge that list with WordNet. Searching for prior efforts turns up only attempts to develop methods of automatically updating WordNet. The steps I imagine are:

1. Clone the WordNet db
2. Write
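Editing the WordNet database files directly is possible (NLTK's copy is plain text under `nltk_data/corpora/wordnet`), but a lighter alternative is a side table of domain-specific terms consulted before WordNet, which survives corpus upgrades. A sketch, where the mappings and the lookup callable are hypothetical (with NLTK, the fallback would be `lambda w: [s.name() for s in wn.synsets(w)]` after `from nltk.corpus import wordnet as wn`):

```python
# Hypothetical domain-specific mappings onto existing synset names.
CUSTOM_SYNSETS = {
    "retweet": ["repost.v.01"],
    "selfie": ["photograph.n.01"],
}

def synsets_or_custom(word, wn_lookup):
    """Check the domain-specific table first, then fall back to WordNet
    (or any lookup callable passed in)."""
    return CUSTOM_SYNSETS.get(word.lower()) or wn_lookup(word)
```

This keeps the custom vocabulary in one editable place instead of scattered patches to the WordNet data files.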
