Question
I have a column named word_count which contains the count of every word in a review. How can I find the number of times the word awesome occurs in each row of that column, and use the .apply() method to turn it into a new column, say awesome?
products['word_count'][1]
{'and': 3L,'bags': 1L,'came': 1L, 'disappointed.':1L,'does':1L,'early':1L,'highly': 1L,'holder.': 1L, 'awesome': 2L}
How can I get the following output?
products['awesome'][1]
2
Answer 1:
As I understand it, you have a dictionary called products that holds word counts for various texts, like this:
products = {'word_count' : [{'holder.': 2, 'awesome': 1}, {'and': 3,'bags': 1,'came': 1, 'disappointed.':1,'does':1,'early':1,'highly': 1,'holder.': 1, 'awesome': 2}] }
For instance, the first text contains 'holder.' 2 times and 'awesome' 1 time. To add another column, build a list that holds the count of 'awesome' for each text:
counter = []
for i in range(len(products['word_count'])):
    # .get() returns 0 when 'awesome' is missing instead of raising KeyError
    counter.append(products['word_count'][i].get('awesome', 0))
and then add the column to the table:
products['awesome'] = counter
and there you have it!
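The loop above can also be written as a per-row function, which is the shape .apply() expects. Here is a minimal sketch in plain Python, assuming each row is a dict and that a missing word should count as 0. count_word is a hypothetical helper name, not part of any library, and the third sample row is made up to show the missing-word case:

```python
def count_word(word_counts, word='awesome'):
    """Return how often `word` occurs in one row's word-count dict (0 if absent)."""
    return word_counts.get(word, 0)

# Sample data in the same shape as the answer above, plus a row without 'awesome'.
products = {'word_count': [{'holder.': 2, 'awesome': 1},
                           {'and': 3, 'bags': 1, 'awesome': 2},
                           {'disappointed.': 1}]}

# Apply the function to every row and store the result as a new column.
products['awesome'] = [count_word(row) for row in products['word_count']]
print(products['awesome'])  # [1, 2, 0]
```

If products is a pandas DataFrame or a graphlab SFrame rather than a plain dict, the same function can presumably be passed to the column's own apply method, for example products['word_count'].apply(count_word).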
Answer 2:
Here's the code for the Python function counting_words:
def counting_words(x):
    # Return the count of 'awesome' in row x, or 0 if the word is absent
    if 'awesome' in products['word_count'][x]:
        return products['word_count'][x]['awesome']
    else:
        return 0
Here's the other part of the code:
import graphlab

new_dict = {'awesome': []}
for x in range(len(products)):
    # append exactly one count per row
    new_dict['awesome'].append(counting_words(x))
newframe = graphlab.SFrame(new_dict)
products.add_columns(newframe)
I assumed that you are using graphlab; the code above works for the word 'awesome'. new_dict stores the count of 'awesome' in each row of your products['word_count'] column, so it should look like: new_dict = {'awesome': [0,0,1,...2,1]}. However, if you plan to count other words, this method would be too slow.
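If several words need counting, one option is to build all the new columns with a single helper rather than repeating the loop for each word. This is only a sketch in plain Python: count_words, rows, and the sample data are hypothetical, and it makes no claim about how this performs inside graphlab.

```python
def count_words(rows, words):
    """Build {word: [count per row]} for each word in `words`, using 0 when absent."""
    return {word: [row.get(word, 0) for row in rows] for word in words}

# Each row is a word-count dict, as in the word_count column.
rows = [{'holder.': 2, 'awesome': 1},
        {'and': 3, 'awesome': 2},
        {'great': 4}]

new_columns = count_words(rows, ['awesome', 'great'])
print(new_columns['awesome'])  # [1, 2, 0]
print(new_columns['great'])    # [0, 0, 4]
```

The resulting dict has the same shape as new_dict above, with one key per word.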
Source: https://stackoverflow.com/questions/33068658/how-to-count-the-number-of-occurrence-of-a-word-in-a-column