How to read quoted text containing escaped quotes

Posted by 喜你入骨 on 2019-12-18 11:30:35

Question


Consider the following comma-separated file. For simplicity, let it contain a single line:


'I am quoted','so, can use comma inside - it is not separator here','but can\'t use escaped quote :=('

If you try to read it with the command

table <- read.csv(filename, header=FALSE)

the line will be split into 4 parts, because it contains 3 commas. In fact I want to read only 3 parts, one of which contains a comma itself. This is where the quote argument comes in. I tried:

table <- read.csv(filename, header=FALSE, quote="'")

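but that fails with the message "incomplete final line found by readTableHeader on table". That happens because the line contains an odd number (seven) of quote characters: the backslash is not treated as escaping the quote, so the quote in \' counts as a field delimiter.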
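The first symptom and the quote count can be reproduced without a hand-made file. The following is a minimal sketch, assuming a temporary file stands in for `filename`; the names `line`, `tmp`, `tab` and `n_quotes` are illustrative and not part of the original question.

```r
# The sample line from the question; "\\'" in R source is the two characters \'
line <- "'I am quoted','so, can use comma inside - it is not separator here','but can\\'t use escaped quote :=('"

# Write it to a temporary file that stands in for `filename` (an assumption)
tmp <- tempfile(fileext = ".csv")
writeLines(line, tmp)

# read.csv's default quote character is the double quote,
# so every comma on the line acts as a separator
tab <- read.csv(tmp, header = FALSE)
ncol(tab)    # 4 columns instead of the intended 3

# Count the single quotes: 6 field delimiters plus the escaped one
n_quotes <- lengths(regmatches(line, gregexpr("'", line, fixed = TRUE)))
n_quotes     # 7, an odd number

unlink(tmp)
```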

read.table() and scan() both have an allowEscapes parameter, but setting it to TRUE doesn't help. That is expected, because help(scan) says:

The escapes which are interpreted are the control characters ‘\a, \b, \f, \n, \r, \t, \v’, ... ... Any other escaped character is treated as itself, including backslash

Please suggest how you would read such quoted CSV files containing escaped \' quotes.


Answer 1:


One possibility is to use readLines() to read everything in as is, and then replace the quote character with something else, e.g.:

tt <- readLines("F:/temp/test.txt")
tt <- gsub("([^\\]|^)'","\\1\"",tt) # replace every unescaped ' by "
tt <- gsub("\\'","'",tt,fixed=TRUE) # turn the escaped \' into a plain '

The vector tt can then be read through a textConnection:

zz <- textConnection(tt)
read.csv(zz,header=FALSE,quote="\"") # read from the text connection, with " as the quote character
close(zz)

Not the most beautiful solution, but it works (provided you don't have a " character somewhere in the file, of course).
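The same idea can be wrapped in a self-contained function that also checks for the caveat just mentioned. This is only a sketch under stated assumptions: the helper name `read_backslash_quoted` is hypothetical, it uses a Perl-style lookbehind rather than the capture group shown in the answer, and it reads from a character vector via `text=` instead of a file path.

```r
# Hypothetical helper: read single-quoted CSV lines in which \' escapes a quote
read_backslash_quoted <- function(lines) {
  # Guard for the caveat above: a literal " would clash with the new quote character
  if (any(grepl("\"", lines, fixed = TRUE))) {
    stop("input already contains a double quote character")
  }
  # Replace every ' that is NOT preceded by a backslash with "
  lines <- gsub("(?<!\\\\)'", "\"", lines, perl = TRUE)
  # Turn the escaped \' into a plain '
  lines <- gsub("\\'", "'", lines, fixed = TRUE)
  read.csv(text = lines, header = FALSE, quote = "\"", stringsAsFactors = FALSE)
}

# The sample line from the question; "\\'" in R source is the two characters \'
sample_line <- "'I am quoted','so, can use comma inside - it is not separator here','but can\\'t use escaped quote :=('"
result <- read_backslash_quoted(sample_line)
ncol(result)   # 3
result$V3      # "but can't use escaped quote :=("
```

For a file on disk, the lines would come from readLines() first, as in the answer. The lookbehind does not consume the character before the quote, so two adjacent unescaped quotes (an empty field) are both converted.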




Answer 2:


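read_delim from the readr package can handle escaped quotes through its escape_double and escape_backslash arguments. Setting escape_double=FALSE turns off the doubled-quote convention, and escape_backslash=TRUE makes a backslash escape the following character: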

read_delim(file, delim=',', escape_double=FALSE, escape_backslash=TRUE, quote="'")

(Note that older versions of readr do not handle quoted newlines in CSV headers correctly: https://github.com/tidyverse/readr/issues/784)



Source: https://stackoverflow.com/questions/6032296/how-to-read-quoted-text-containing-escaped-quotes
