Suppose I have a csv file looks like this:
Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE
A,3,\"\",\"I have comma, ha!\",I have open double quotes\",A,\"\"
fread
from data.table
handles this just fine:
library(data.table)
fread('Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE
A,3,"","I have comma, ha!",I have open double quotes",A,""')
# Type ID NAME CONTENT RESPONSE GRADE SOURCE
#1: A 3 I have comma, ha! I have open double quotes" A
I'm not too sure about the structure of CSV files, but you said the author had escaped the comma in the text under content.
This works to read the text as is with the "
at the end.
read.csv2("Test.csv", header = T,sep = ",", quote="")
This is not valid CSV, so you'll have to do your own parsing. But, assuming the convention is as follows, you can just toggle with scan
to take advantage of most of its abilities:
next_field<-function(stream) {
p<-seek(stream)
d<-readChar(stream,1)
seek(stream,p)
if(d=="\"")
field<-scan(stream,"",1,sep=",",quote="\"",blank=FALSE)
else
field<-scan(stream,"",1,sep=",",quote="",blank=FALSE)
return(field)
}
Assuming the above convention, this sufficient to parse as follows
s<-file("example.csv",open="rt")
header<-readLines(s,1)
header<-scan(what="",text=header,sep=",")
line<-replicate(length(header),next_field(s))
setNames(as.data.frame(lapply(line,type.convert)),header)
Type ID NAME CONTENT RESPONSE GRADE SOURCE 1 A 3 NA I have comma, ha! I have open double quotes" A NA
However, in practice you might want to first write back the fields, quoting each, to another file, so you can just read.csv
on the corrected format.