This is an extension of a related question answered here.
I have a weekly CSV file that needs to be parsed. It looks like this:
\"asdf\",\"asdf\",\"as
I was using Vim to remove nested quotes in a .csv file, and this pattern worked for me:
"[^,"][^"]*"[^,]
(?<!^|,)"(?!,|$)
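will match a double quote that is neither preceded nor followed by a comma and is not at the start or end of a line (assuming ^ and $ match at line boundaries, i.e. multiline mode where the flavor requires it).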
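As a rough illustration, here is how that pattern might be applied in Python. This is only a sketch under assumptions: the question does not name a regex flavor, and Python's built-in re module accepts only fixed-width lookbehind, so the single lookbehind is split into two. The names NESTED_QUOTE and strip_nested_quotes and the sample data are made up for the example.

```python
import re

# Same idea as (?<!^|,)"(?!,|$), written with two separate lookbehinds
# because Python's re rejects alternatives of different widths in one lookbehind.
# re.MULTILINE makes ^ and $ match at every line boundary.
NESTED_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,|$)', re.MULTILINE)


def strip_nested_quotes(text):
    """Delete quotes that are not next to a comma or a line boundary."""
    return NESTED_QUOTE.sub("", text)


# Hypothetical sample with stray quotes inside fields.
sample = '"as"df","he said "hi"","plain"\n"a","b"c"'
cleaned = strip_nested_quotes(sample)
print(cleaned)
```

Quotes that open or close a field are left alone; only the ones in the middle of a field are dropped.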
If you need to allow whitespace around the commas or at start/end-of-line, and if your regex flavor (which you didn't specify) allows arbitrary-length lookbehind (.NET does, for example), you can use
(?<!^\s*|,\s*)"(?!\s*,|\s*$)
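Python's built-in re module does not allow arbitrary-length lookbehind, so the whitespace-tolerant pattern above cannot be used there verbatim. One possible workaround, sketched here with a hypothetical helper named strip_nested_quotes_ws, is to match every quote and decide in a replacement callback whether it borders a field. It works on one line at a time and is not tuned for very long lines.

```python
import re


def strip_nested_quotes_ws(line):
    """Remove quotes inside a field, allowing whitespace around the commas."""

    def repl(match):
        # Text on either side of this quote, ignoring adjacent whitespace.
        before = line[:match.start()].rstrip()
        after = line[match.end():].lstrip()
        opens_field = before == "" or before.endswith(",")
        closes_field = after == "" or after.startswith(",")
        # Keep quotes that delimit a field, drop all others.
        return '"' if (opens_field or closes_field) else ""

    return re.sub('"', repl, line)


# Hypothetical sample line with spaces around the separators.
print(strip_nested_quotes_ws('"a" , "b "x" c" ,"d"  '))
```

The callback mirrors the two lookarounds: "before" stands in for the lookbehind and "after" for the lookahead.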
In Vim I used this to remove all the unescaped quotes:
:%s/\v("(,")@!)&((",)@<!")&("(\n)@!)&(^@<!")//gc
A detailed explanation:
: - start the vim command
% - scope of the command is the whole file
s - search and replace
/ - start of search pattern
\v - "very magic" mode, so ( ) & @ etc. are special without backslashes (rather than default Vim style)
(
" - double quote
(,") - comma_quote
@! - negative lookahead: the quote must not be followed by the preceding group
)
& - and (every branch must match at the same position)
(
(",) - quote_comma
@<! - negative lookbehind: the preceding group must not come right before
" - double quote
)
& - and
(
" - double quote
(\n) - line end
@! - not followed by
)
& - and
(
^ - line beginning
@<! - negative lookbehind: the line beginning must not come right before
" - double quote
)
/ - end of search pattern and start of replace pattern
(empty) - replace with nothing (delete)
/ - end of replace pattern
g - apply to all the matches
c - confirm with user for every replacement
This does the job fairly quickly. The only case in which it fails is when the data itself contains the pattern ",".
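For anyone who wants to try the same logic outside Vim, the four conditions can be approximated in Python as below. This is a sketch, not a literal port of the Vim command: the names VIM_LIKE and remove_unescaped_quotes and both sample lines are invented, and end of line is tested with $ instead of \n. The second sample shows the failure case mentioned above.

```python
import re

# A quote that is not at line start, not preceded by `",`,
# not followed by `,"`, and not at line end.
VIM_LIKE = re.compile(r'(?<!^)(?<!",)"(?!,")(?!$)', re.MULTILINE)


def remove_unescaped_quotes(text):
    """Delete every quote that does not look like a field delimiter."""
    return VIM_LIKE.sub("", text)


# Works: the inner quotes around hi are removed.
good = remove_unescaped_quotes('"he said "hi", ok","x"')
print(good)

# Fails: the first field was meant to contain `","` as data, so the
# pattern treats it as a separator and the line ends up with three fields.
bad = remove_unescaped_quotes('"say "x","y" now","z"')
print(bad)
```

In the second case there is no way for a purely local pattern to tell a literal "," from a real field boundary, which is why that input is mangled.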