How to delete a column/columns of a CSV file which has cell values with a string enclosed in double quotes

Submitted by 我怕爱的太早我们不能终老 on 2019-12-05 13:28:42

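With GNU awk version 4 or later, you can use FPAT to define fields by what they contain rather than by the separator between them. Here a field is either a run of non-comma characters or a double-quoted string, so commas inside quotes no longer split the field.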

gawk '{print $1, $2, $3}' FPAT="([^,]+)|(\"[^\"]+\")" OFS="," 44.csv

Test:

$ gawk '{print $1, $2, $3}' FPAT="([^,]+)|(\"[^\"]+\")" OFS="," mycsv.csv
column1, column2, column3
12,455,"string with quotes, and with a comma in between"
4432,6787,"another, string with quotes, and with two comma in between"
11,22,"simple string"
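To make the content-based matching concrete, here is a minimal Python sketch of the same idea, not part of the original answer. The helper name `drop_fields` and the 1-based numbering (chosen to mirror awk's `$1`, `$2`, ...) are assumptions for illustration. Like the FPAT pattern above, it skips empty fields and does not handle quotes embedded inside a quoted field.

```python
import re

# Same idea as the FPAT pattern: a field is either a double-quoted string
# or a run of non-comma characters. The quoted alternative is listed first
# because Python's regex engine takes the first alternative that matches.
FIELD = re.compile(r'"[^"]+"|[^,]+')

def drop_fields(line, drop):
    """Return `line` without the 1-based field numbers listed in `drop`."""
    fields = FIELD.findall(line)
    kept = [f for i, f in enumerate(fields, start=1) if i not in drop]
    return ",".join(kept)

line = '12,455,"string with quotes, and with a comma in between"'
print(drop_fields(line, {2}))
# prints: 12,"string with quotes, and with a comma in between"
```

Deleting a column this way means rebuilding the line from the fields that are kept, which is also what the gawk command does by printing only the selected fields.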

I had the same issue as you, Dhruuv. The solution proposed by jaypal singh is correct, but it did not work for all of my cases. I recommend csvquote (https://github.com/dbro/csvquote), which enables common Unix utilities like cut, head, and tail to work correctly with CSV data containing delimiters and newlines. It worked for me.

You can probably do it with cut in this special case, by using " as your delimiter, but I'd strongly advise against it -- even if you could make it work in this case, you might later get a string with an escaped double quote in it, e.g. \" which would fool that too. Or, more of your columns might be quoted (which is a perfectly valid CSV-ism).

A smarter tool is required! The simplest to obtain might well be Perl and the Text::CSV module -- you've almost certainly got Perl installed, and depending on your environment installing Text::CSV as a package, with CPAN.pm, or with cpanminus ought to be straightforward.
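If Perl is not convenient, the same parser-based approach can be sketched with Python's standard `csv` module. This swaps in a different tool from the Text::CSV module the answer recommends; the function name `delete_columns` and the 0-based `drop` set are assumptions made for this illustration.

```python
import csv
import io

def delete_columns(src, dst, drop):
    """Copy CSV rows from src to dst, omitting the 0-based column indexes in `drop`."""
    reader = csv.reader(src)
    # lineterminator="\n" avoids the writer's default "\r\n" line endings.
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow([cell for i, cell in enumerate(row) if i not in drop])

# In-memory sample data; with real files, open them with newline="".
sample = io.StringIO(
    'column1,column2,column3\n'
    '12,455,"string with quotes, and with a comma in between"\n'
    '11,22,"simple string"\n'
)
out = io.StringIO()
delete_columns(sample, out, drop={1})
print(out.getvalue(), end="")
```

A real parser handles quoted commas, doubled quotes, and newlines inside fields. Note that the writer re-quotes only where needed, so a field such as "simple string" is written back without its quotes.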
