Is it possible/legal to somehow encode CR/LF characters into a CSV file?
(as part of a CSV standard?)
If so, how should I encode CR/LF?
Mention has been made here of a standard for CSV. I'd be interested to know more about this. The only standards I'm aware of are:
whatever Excel accepts
the RFC at www.rfc-editor.org/rfc/rfc4180.txt
Yes, you need to wrap the field in double quotes:
"some value
over two lines",some other value
From this document, which is the generally-accepted CSV standard:
A field that contains embedded line-breaks must be surrounded by double-quotes
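As a rough illustration of that rule, here is a minimal sketch assuming Python and its standard csv module (neither is mentioned in the thread). The default writer quotes a field only when it has to, which includes any field containing CR or LF, and the reader restores the line break when it parses the text back.

```python
import csv
import io

# Write one record whose first field contains an embedded CR/LF.
buf = io.StringIO(newline="")  # newline="" so line endings are left untouched
writer = csv.writer(buf)       # default dialect: minimal quoting, "\r\n" row terminator
writer.writerow(["some value\r\nover two lines", "some other value"])
encoded = buf.getvalue()

# Read it back: the quoted field comes out as a single value again.
rows = list(csv.reader(io.StringIO(encoded, newline="")))
```

When working with real files rather than in-memory strings, the csv documentation recommends opening them with newline="" for the same reason: otherwise line-ending translation can alter the embedded CR/LF.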
The most common variant of CSV, the Excel-compatible one, allows embedded newlines as long as the field is surrounded by double quotes.
foo,bar,"blah blah
more blah blah",baz
or
foo,bar,"blah blah
more blah blah"
or
"blah blah
more blah blah",baz
are all valid. This mechanism also allows for embedded commas.
Using quotes around textual fields without embedded newlines (or commas) is fine too. If the text itself contains a double quote, the mechanism to escape it is to put two together, for example:
foo,bar,"this person said ""blah blah
more blah blah""",baz
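The quoting and quote-doubling rules described above can be sketched in a few lines. This is only an illustration in Python; the helper names quote_field and format_row are made up for this example and are not part of any library.

```python
def quote_field(value: str) -> str:
    """Quote a field only when needed, doubling any embedded double quotes."""
    # Hypothetical helper: a field needs quoting if it contains a comma,
    # a double quote, or a CR/LF character.
    if any(ch in value for ch in ',"\r\n'):
        return '"' + value.replace('"', '""') + '"'
    return value


def format_row(fields: list[str]) -> str:
    """Join already-quoted fields with commas (no row terminator added)."""
    return ",".join(quote_field(f) for f in fields)


row = format_row(["foo", "bar", 'this person said "blah blah\nmore blah blah"', "baz"])
```

Here row reproduces the two-line example above. Quoting every field unconditionally would be equally valid; quoting only when needed just keeps the output shorter.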
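Writing a CSV reader that handles this correctly can be tricky (especially if you are relying on regular expressions), because a record can span several physical lines, so you cannot simply split the input on line breaks first.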
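To show why a reader has to track quoting state rather than split on line breaks, here is a minimal character-by-character sketch in Python. The function name parse_csv is hypothetical, and the sketch assumes well-formed input: it does not report unbalanced quotes, and it drops a final record that consists solely of an empty quoted field. In practice an existing CSV library is usually the safer choice.

```python
def parse_csv(text: str) -> list[list[str]]:
    """Parse CSV text with quoted fields, doubled quotes and embedded newlines."""
    rows, row, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # doubled quote -> one literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)  # commas and line breaks are literal here
        elif ch == '"':
            in_quotes = True
        elif ch == ",":
            row.append("".join(field))
            field = []
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CR LF as a single record terminator
            row.append("".join(field))
            field = []
            rows.append(row)
            row = []
        else:
            field.append(ch)
        i += 1
    if field or row:  # last record without a trailing line break
        row.append("".join(field))
        rows.append(row)
    return rows


sample = 'foo,bar,"this person said ""blah blah\r\nmore blah blah""",baz\r\n"a,b",c'
parsed = parse_csv(sample)
```

Splitting sample on line breaks first would yield three fragments instead of two records, which is exactly the mistake a line-oriented or regex-based reader tends to make.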
I don't think it's part of the standard (if there even is one), but you could use standard C-style escaping, i.e. encode CR and LF as the literal two-character sequences \r and \n.
Keep in mind, however, that if you do that you should also encode the escape character itself, i.e. \\ yields \ after decoding.
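A minimal sketch of that escaping scheme, in Python, might look like the following. The function names escape and unescape are hypothetical, and since this convention is not part of any CSV standard, the writer and the reader both have to agree on it. Commas and double quotes inside a field would still need separate handling.

```python
def escape(value: str) -> str:
    """Encode backslash, CR and LF as the two-character sequences \\\\, \\r, \\n."""
    # Escape the backslash first so the backslashes added for CR/LF
    # are not doubled a second time.
    return value.replace("\\", "\\\\").replace("\r", "\\r").replace("\n", "\\n")


_UNESCAPES = {"\\": "\\", "r": "\r", "n": "\n"}


def unescape(value: str) -> str:
    """Reverse escape(); unknown escape sequences are passed through unchanged."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")  # empty string if the text ends on a backslash
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


original = "a\r\nb\\n"  # CR, LF, then a literal backslash followed by 'n'
encoded = escape(original)
decoded = unescape(encoded)
```

Decoding has to scan left to right rather than chain replace calls, otherwise an escaped backslash followed by the letter n would be mistaken for an encoded line feed.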