Question
I'm new to Weka and am having problems converting a CSV file containing Tweets into an ARFF file.
The CSV looks like this:
Tweet,Class
Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6,Yes
When I try to convert to .arff using Explorer, I receive the following error: "...not recognized as an CSV data files Reason: wrong number of values. Read 1 expected 2, read token[EOF], line 2"
Removing the "%" character allows the file to convert to ARFF without error. I could remove "%" and other characters, but I really don't want to alter my Tweet data. Enclosing the field in single or double quotes does not help either. Any idea what I am doing wrong?
I'd appreciate any help.
Answer 1:
Weka may interpret "%" as the beginning of a comment and ignore the "%" and the rest of that line.
Please enclose the entire field that contains the "%" character in quotation marks (both the single quote ' and the double quote " work).
For example, Weka should be able to convert a CSV file containing the following two lines to an ARFF file:
Tweet,Class
"Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6",Yes
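If the file has many Tweets, the quoting could be scripted rather than done by hand. The following is only a minimal Python sketch of that idea, not something taken from Weka itself: the function name quote_percent_fields is hypothetical, and it assumes that Weka accepts double-quoted fields and that a double quote inside a Tweet can be written as two double quotes (the usual CSV convention, which I have not verified against Weka's loader).

```python
import csv
import io


def quote_percent_fields(csv_text: str) -> str:
    """Wrap every field containing '%', ',', '"' or a newline in double quotes.

    Hypothetical helper: assumes Weka reads double-quoted fields and
    tolerates the standard CSV doubling of embedded double quotes.
    """
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        fields = []
        for field in row:
            if any(ch in field for ch in '%,"\n'):
                # Double any embedded quotes, then wrap the whole field.
                field = '"' + field.replace('"', '""') + '"'
            fields.append(field)
        lines.append(",".join(fields))
    return "\n".join(lines) + "\n"


sample = (
    "Tweet,Class\n"
    "Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6,Yes\n"
)
print(quote_percent_fields(sample), end="")
```

Only fields that need it are quoted, so the header and the Class column stay as they were and the Tweet text itself is not altered.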
P.S. I'm sorry that my previous answer was incorrect. The previous (incorrect) answer was: try replacing the "%" character with "\%". "\" works as an escape character, so "\" turns the comment-delimiter character "%" into a normal character "%".
Source: https://stackoverflow.com/questions/24399151/weka-csv-to-arff-special-characters-caue-error