I want to count the number of times a word is repeated in the review string.
I am reading the CSV file and storing it in a pandas DataFrame using the line below.
You can use .str to apply string methods to a Series of strings:
reviews["review"].str.split("disappointed")
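To see what that call returns, here is a minimal sketch on a made-up two-row frame (the sample reviews are an assumption, not the asker's CSV). Each element of the result is a list of the pieces left after splitting:

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents
reviews = pd.DataFrame({"review": [
    "disappointed, really disappointed",
    "works great",
]})

# .str.split runs on every string in the column and returns a Series of lists
pieces = reviews["review"].str.split("disappointed")
print(pieces.tolist())
```

A row with two occurrences of the word yields three pieces, and a row without the word yields a single piece.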
Well, the problem is with:
reviews["review"]
The above is a Series. In your first snippet, you are doing this:
reviews["review"][1].split("disappointed")
That is, you are indexing into the Series to pick out a single review, which is a plain string. You could instead loop over all rows of the column and perform the desired action on each one. For example:
for index, row in reviews.iterrows():
    # split() yields one more piece than there are occurrences, hence the - 1
    print(len(row['review'].split("disappointed")) - 1)
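pandas 0.20.3 has pandas.Series.str.split(), which acts on every string of the Series and does the split. Calling len() on the result only gives the number of rows, so instead take the length of each resulting list with .str.len() and subtract one to get the count for each row:
reviews['review'].str.split('disappointed').str.len() - 1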
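If the split pieces themselves are not needed, Series.str.count may be a more direct route. The sketch below assumes a small made-up frame. Because str.count interprets its argument as a regular expression, the word is passed through re.escape to keep it literal:

```python
import re
import pandas as pd

# Hypothetical sample data, not the asker's file
reviews = pd.DataFrame({"review": [
    "disappointed and disappointed again",
    "no complaints",
]})

word = "disappointed"
# str.count takes a regex pattern, so escape the literal word first
reviews["disappointed"] = reviews["review"].str.count(re.escape(word))
print(reviews["disappointed"].tolist())
```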
pandas.Series.str.split
You're trying to split the entire review column of the data frame (which is the Series mentioned in the error message). What you want to do is apply a function to each row of the data frame, which you can do by calling apply on the data frame:
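# Count occurrences in one row: number of split pieces minus one
f = lambda x: len(x["review"].split("disappointed")) - 1
# axis=1 passes each row to f; the counts are stored in a new column
reviews["disappointed"] = reviews.apply(f, axis=1)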
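Since only one column is involved, the same idea can also be applied to that column alone, which avoids axis=1. This is a sketch on made-up data that assumes every review is a string (missing values would need handling first). It uses Python's built-in str.count, which counts non-overlapping literal occurrences:

```python
import pandas as pd

# Hypothetical sample data; assumes no missing values in the column
reviews = pd.DataFrame({"review": [
    "disappointed once",
    "disappointed, then disappointed again",
    "happy",
]})

# Series.apply calls the function once per string in the column
reviews["disappointed"] = reviews["review"].apply(
    lambda text: text.count("disappointed")
)
print(reviews["disappointed"].tolist())
```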