Question
Hi, I am trying to do sentiment analysis with a Naive Bayes classifier in Python 2.x. It trains on sentiments read from a text file and then labels an input as positive or negative based on those samples. I want the output in the same form as the input: for example, given a text file of, let's say, 1000 raw sentiments, I want the output to show positive or negative against each one. Please help. Below is the code I am using:
import math
import string

def Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string):
    y_values = [0,1]
    prob_values = [None, None]
    for y_value in y_values:
        posterior_prob = 1.0
        for word in test_string.split():
            word = word.lower().translate(None,string.punctuation).strip()
            if y_value == 0:
                if word not in negative:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= negative[word]
            else:
                if word not in positive:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= positive[word]
        if y_value == 0:
            prob_values[y_value] = posterior_prob * float(total_negative) / (total_negative + total_positive)
        else:
            prob_values[y_value] = posterior_prob * float(total_positive) / (total_negative + total_positive)
    total_prob_values = 0
    for i in prob_values:
        total_prob_values += i
    for i in range(0,len(prob_values)):
        prob_values[i] = float(prob_values[i]) / total_prob_values
    print prob_values
    if prob_values[0] > prob_values[1]:
        return 0
    else:
        return 1

if __name__ == '__main__':
    sentiment = open(r'C:/Users/documents/sample.txt')
    #Preprocessing of training set
    vocabulary = {}
    positive = {}
    negative = {}
    training_set = []
    TOTAL_WORDS = 0
    total_negative = 0
    total_positive = 0
    for line in sentiment:
        words = line.split()
        y = words[-1].strip()
        y = int(y)
        if y == 0:
            total_negative += 1
        else:
            total_positive += 1
        for word in words:
            word = word.lower().translate(None,string.punctuation).strip()
            if word not in vocabulary and word.isdigit() is False:
                vocabulary[word] = 1
                TOTAL_WORDS += 1
            elif word in vocabulary:
                vocabulary[word] += 1
                TOTAL_WORDS += 1
            #Training
            if y == 0:
                if word not in negative:
                    negative[word] = 1
                else:
                    negative[word] += 1
            else:
                if word not in positive:
                    positive[word] = 1
                else:
                    positive[word] += 1
    for word in vocabulary.keys():
        vocabulary[word] = float(vocabulary[word])/TOTAL_WORDS
    for word in positive.keys():
        positive[word] = float(positive[word])/total_positive
    for word in negative.keys():
        negative[word] = float(negative[word])/total_negative
    test_string = raw_input("Enter the review: \n")
    classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string)
    if classifier == 0:
        print "Negative review"
    else:
        print "Positive review"
Answer 1:
I've checked the GitHub repo you posted in the comments. I tried to run the project, but I got some errors.
Anyway, I've looked at the project structure and the file used to train the Naive Bayes algorithm, and I think the following piece of code can be used to write your results to an Excel workbook (the .xlsx format that xlsxwriter produces):
import xlsxwriter

# Classify every line of the test file and collect (sentence, label) pairs.
data_to_be_written = []
with open("test11.txt") as f:
    for line in f:
        line = line.strip()
        classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, line)
        # The classifier returns 0 for negative and 1 for positive.
        result = 'Negative' if classifier == 0 else 'Positive'
        data_to_be_written.append((line, result))

# Create a workbook and add a worksheet.
# xlsxwriter writes the .xlsx format, so the file name uses that extension.
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0
# Iterate over the data and write it out row by row.
for sentence, label in data_to_be_written:
    worksheet.write(row, col, sentence)
    worksheet.write(row, col + 1, label)
    row += 1
workbook.close()
In short, for each line of the file containing the sentences to be tested, I call the classifier and build a structure holding the sentence and its label.
Then I loop over that structure and write the Excel file.
To do this I used a Python package called xlsxwriter.
As I said before, I had some problems running the project, so this code is untested. It should work, but if you run into trouble, let me know.
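
If a CSV file is acceptable instead of a workbook, the same collect-then-write approach can be sketched with the standard-library csv module. This is only a minimal sketch, written for Python 3 rather than the Python 2 of the question and not run against the project. The names `classify_lines`, `write_csv`, and `toy_classifier` are hypothetical, and the stand-in classifier merely mimics the return convention of Naive_Bayes_Classifier (0 for negative, 1 for positive).

```python
import csv


def classify_lines(lines, classify):
    """Pair each non-empty line with a 'Negative' or 'Positive' label.

    `classify` is any callable returning 0 (negative) or 1 (positive),
    the same convention Naive_Bayes_Classifier uses in the question.
    """
    rows = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        label = 'Negative' if classify(text) == 0 else 'Positive'
        rows.append((text, label))
    return rows


def write_csv(rows, path):
    """Write a header row, then one (sentence, label) pair per row."""
    # newline='' lets the csv module control line endings (Python 3).
    with open(path, 'w', newline='') as out:
        writer = csv.writer(out)
        writer.writerow(['sentence', 'sentiment'])
        writer.writerows(rows)


def toy_classifier(text):
    """Hypothetical stand-in for the trained classifier (assumption)."""
    return 1 if 'good' in text.lower() else 0
```

With the trained dictionaries in scope, `classify` could instead be `lambda s: Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, s)`, and `lines` could be the open file object for test11.txt.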
Regards
Answer 2:
> import xlsxwriter
>
> results = []
> with open("test11.txt") as f:
>     for line in f:
>         line = line.strip()
>         classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, line)
>         # f is opened read-only, so collect the labels instead of calling f.write().
>         if classifier == 0:
>             results.append((line, 'Negative'))
>         else:
>             results.append((line, 'Positive'))
>
> # Create a workbook and add a worksheet.
> workbook = xlsxwriter.Workbook('test.xlsx')
> worksheet = workbook.add_worksheet()
>
> # Start from the first cell. Rows and columns are zero indexed.
> row = 0
> col = 0
>
> # Iterate over the data and write it out row by row.
> for item, cost in results:
>     worksheet.write(row, col, item)
>     worksheet.write(row, col + 1, cost)
>     row += 1
>
> workbook.close()
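
To get output in the same form as the input (each sentence followed by its label in a plain text file), the labels have to go to a second file, because a file opened in the default read mode cannot be written to. The following is a minimal sketch for Python 3, assuming one sentence per input line. The name `label_file`, the tab separator, and the stand-in `toy_classifier` are illustrative choices and not part of the original code.

```python
def label_file(in_path, out_path, classify):
    """Copy each sentence from in_path to out_path with its label appended.

    `classify` is any callable returning 0 (negative) or 1 (positive).
    Returns the number of sentences written.
    """
    count = 0
    # Read from one file and write to another: the input handle is read-only.
    with open(in_path) as src, open(out_path, 'w') as dst:
        for line in src:
            text = line.rstrip('\n')
            if not text.strip():
                continue  # skip blank lines
            label = 'Negative' if classify(text) == 0 else 'Positive'
            dst.write(text + '\t' + label + '\n')
            count += 1
    return count


def toy_classifier(text):
    """Hypothetical stand-in for the trained classifier (assumption)."""
    return 1 if 'good' in text.lower() else 0
```

In the question's script, `classify` would wrap the call to Naive_Bayes_Classifier with the trained `positive` and `negative` dictionaries.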
Source: https://stackoverflow.com/questions/43779723/text-analysis-unable-to-write-output-of-python-program-in-csv-or-xls-file